1. 10 Sep, 2012 5 commits
  2. 30 Apr, 2012 1 commit
    • powerpc: Use WARN instead of dump_stack when printing EEH error backtrace · 14fb1fa6
      Anton Blanchard committed
      When we get an EEH error we just print a backtrace with dump_stack
      which is rather cryptic. We really should print something before
      spewing out the backtrace.
      
      Also switch from dump_stack to WARN so we get more information about
      the failure: which modules were loaded, which process was running, etc.
      This was useful information when debugging a recent EEH subsystem bug.
      
      The standard WARN output should also get picked up by monitoring
      tools like kerneloops.
      
      The register dump is of questionable value here but I figured it was
      better to use something standard and not roll my own.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
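
The idea of printing a message before the backtrace can be sketched in user space. This is only an illustration: `SKETCH_WARN`, `warn_count`, `check_failure` and the message text are hypothetical stand-ins, and the sketch prints just the call site and the message, whereas the kernel's WARN() also reports loaded modules, the running process, registers and a backtrace.

```c
#include <stdio.h>

/* Counts how many warnings fired, so callers (and tests) can observe them. */
static int warn_count;

/*
 * Hypothetical user-space stand-in for the kernel's WARN(cond, fmt, ...):
 * when cond is true it reports the call site and a message, and the whole
 * expression evaluates to 1; otherwise it evaluates to 0 and prints nothing.
 */
#define SKETCH_WARN(cond, ...)                                              \
    ((cond) ? (warn_count++,                                                \
               fprintf(stderr, "WARNING: at %s:%d\n", __FILE__, __LINE__),  \
               fprintf(stderr, __VA_ARGS__),                                \
               1)                                                           \
            : 0)

/* Returns nonzero when a failure was detected and reported. */
static int check_failure(int slot_frozen)
{
    return SKETCH_WARN(slot_frozen, "EEH: failure detected\n");
}
```

Because the macro evaluates to the condition, a caller can both report and branch in one expression, which mirrors how WARN() is commonly used inside an `if`.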
  3. 23 Apr, 2012 1 commit
    • powerpc/eeh: Fix crash caused by null eeh_dev · 2ef822c5
      Gavin Shan committed
      The problem was reported by Anton Blanchard. When an EEH error
      hit a PCI device that had no corresponding device driver, the
      kernel crashed. I reproduced the problem on a Firebird-L machine
      with the "errinjct" utility: the device driver for the Emulex
      ethernet MAC was disabled in .config, and a data parity error was
      forced on the Emulex ethernet MAC with "errinjct". The kernel
      crashed after issuing a couple of "lspci -v" commands.

      The root cause is that the PCI device, including its reference
      to the corresponding eeh device, is removed from the system
      while EEH does recovery. Afterwards, the PCI device is probed
      again and added back into the system. So it is not safe to
      retrieve the eeh device from the PCI device after the PCI device
      has been removed and before it has been added again.

      The patch fixes the issue by retrieving the eeh device from the OF
      node instead of the PCI device after the PCI device has been removed.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Tested-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
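
The lookup problem described in this commit can be modelled with a small user-space sketch. All struct and function names here (`device_node_sketch`, `edev_from_node`, `detach_pci_dev`, and so on) are hypothetical simplifications, not the kernel's definitions. The point is only that the OF node keeps its link to the eeh device while the PCI device is gone.

```c
#include <stddef.h>

struct eeh_dev_sketch;

/* Simplified OF node: it outlives PCI device removal. */
struct device_node_sketch {
    struct eeh_dev_sketch *edev;
};

/* Simplified PCI device: removed and re-probed during recovery. */
struct pci_dev_sketch {
    struct device_node_sketch *dn;
    struct eeh_dev_sketch *edev;
};

struct eeh_dev_sketch {
    struct device_node_sketch *dn;
    struct pci_dev_sketch *pdev;
};

/* Lookup through the PCI device: only valid while the device is attached. */
static struct eeh_dev_sketch *edev_from_pci_dev(const struct pci_dev_sketch *pdev)
{
    return pdev ? pdev->edev : NULL;
}

/* Lookup through the OF node: still valid after the PCI device is removed. */
static struct eeh_dev_sketch *edev_from_node(const struct device_node_sketch *dn)
{
    return dn ? dn->edev : NULL;
}

/* Models removal of the PCI device during recovery by cutting both links. */
static void detach_pci_dev(struct eeh_dev_sketch *edev)
{
    if (edev->pdev != NULL) {
        edev->pdev->edev = NULL;
        edev->pdev = NULL;
    }
}
```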
  4. 28 Mar, 2012 1 commit
    • powerpc/eeh: Retrieve PHB from global list · 1a5c2e63
      Gavin Shan committed
      Currently, the existing PHBs are retrieved from the FDT (Flat
      Device Tree) based on the name of the FDT node. Specifically, those
      FDT nodes whose names have the prefix "pci" are regarded as PHBs.
      That's inappropriate because some PCI bridges possibly have
      names starting with "pci". As a result, EEH was enabled twice on
      the same PCI devices.

      The patch fixes the above issue. Besides, the PHBs are expected
      to be figured out from the FDT before EEH is enabled on them.
      Therefore, it's reasonable to retrieve the PHBs from the global
      linked list tracked by the variable "hose_list" instead of poking
      them out of the FDT.

      For the EEH implementation on the pSeries platform, RTAS is critical
      because all low-level functions are implemented on top of RTAS.
      Therefore, we should make sure the "/rtas" OF node is available and
      ready before enabling the EEH core. However, that check is a
      duplicate, since the earlier pSeries platform dependent
      initialization function already does it. Besides, we want to make
      the EEH core platform independent, so RTAS related stuff should be
      removed from it. The patch removes the duplicate check on the
      "/rtas" OF node from the EEH core.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
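
A minimal sketch of why matching on the "pci" name prefix over-counts, compared with walking a list of registered host bridges. The node type, the `next_hose` link and both function names are assumptions made for illustration; the kernel's "hose_list" is a different data structure.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device tree node. */
struct fdt_node_sketch {
    const char *name;
    struct fdt_node_sketch *next_hose; /* link used only by registered PHBs */
};

/* Old approach: every node whose name starts with "pci" is treated as a PHB. */
static int count_phbs_by_name(const struct fdt_node_sketch *nodes, size_t n)
{
    int count = 0;

    for (size_t i = 0; i < n; i++)
        if (strncmp(nodes[i].name, "pci", 3) == 0)
            count++;
    return count;
}

/* New approach: only nodes on the list of registered PHBs are counted. */
static int count_phbs_from_list(const struct fdt_node_sketch *hose_list_head)
{
    int count = 0;

    for (const struct fdt_node_sketch *n = hose_list_head; n != NULL; n = n->next_hose)
        count++;
    return count;
}
```

With one real PHB and one bridge that happens to be named "pci@1", the name-based scan reports two PHBs while the list walk reports one.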
  5. 09 Mar, 2012 16 commits
    • powerpc/eeh: pseries platform config space access in EEH · 3780444c
      Gavin Shan committed
      With the original EEH implementation, access to the config space of
      the corresponding PCI device is done by RTAS sensitive functions that
      depend heavily on pci_dn. That would limit extending EEH to other
      platforms like powernv, because other platforms might have different
      ways to access PCI config space.

      The patch splits out the functions used to access PCI config space
      and implements them in the platform related EEH component. That will
      help support EEH on multiple platforms simultaneously in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Introduce struct eeh_stats for EEH · e575f8db
      Gavin Shan committed
      With the original EEH implementation, the EEH global statistics
      are maintained in individual global variables. That makes the
      code a little hard to maintain.

      The patch introduces struct eeh_stats for the EEH global
      statistics so that they can be maintained collectively.

      This is a rework of the corresponding v5 patch. Following
      comments from David Laight, the EEH global statistics have
      been changed slightly so that they have the fixed type
      "u64". Also, the format used to print them has been changed to
      "%llu" based on David's suggestion. The output format of the
      EEH global statistics is kept intact, following
      Michael's suggestion that there might be tools parsing them.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
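
A sketch of grouping counters into one struct with fixed 64-bit fields and printing them with "%llu". The field names, the output labels and `format_eeh_stats` are illustrative assumptions, not the kernel's actual struct eeh_stats or its proc output.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical grouping of global counters; all fields are fixed 64-bit. */
struct eeh_stats_sketch {
    uint64_t no_device;
    uint64_t ignored_check;
    uint64_t false_positives;
    uint64_t slot_resets;
};

/*
 * Formats the counters into buf. The casts to unsigned long long keep the
 * "%llu" conversions correct regardless of how uint64_t is defined.
 * Returns what snprintf returns.
 */
static int format_eeh_stats(const struct eeh_stats_sketch *s, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "no device=%llu\n"
                    "ignored checks=%llu\n"
                    "false positives=%llu\n"
                    "slot resets=%llu\n",
                    (unsigned long long)s->no_device,
                    (unsigned long long)s->ignored_check,
                    (unsigned long long)s->false_positives,
                    (unsigned long long)s->slot_resets);
}
```

Keeping the label text stable while changing only the storage type is what lets existing parsers keep working.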
    • powerpc/eeh: Replace pci_dn with eeh_dev for EEH aux components · 40a7cd92
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn.
      We have to put EEH related information into pci_dn. Actually, we could
      split struct pci_dn so that the EEH sensitive information forms an
      individual struct; then EEH looks more independent.

      The patch replaces pci_dn with eeh_dev for EEH aux components like
      event and driver. Also, the eeh_event struct has been adjusted
      slightly: since eeh_dev links the associated FDT (Flat Device
      Tree) node and PCI device, the eeh_event struct does not need to
      track the FDT node and PCI device itself. It can simply track
      eeh_dev.

      The patch also renames function pcid_name() to eeh_pcid_name(), which
      was missed in the previous patch where the EEH aux components
      were cleaned up.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Replace pci_dn with eeh_dev for EEH core · f631acd3
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn.
      We have to put EEH related information into pci_dn. Actually, we could
      split struct pci_dn so that the EEH sensitive information forms an
      individual struct; then EEH looks more independent.

      The patch replaces pci_dn with eeh_dev for EEH core.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Cleanup function names in EEH aux components · def9d83d
      Gavin Shan committed
      The patch does some cleanup on the function names of EEH
      aux components. Currently, only eeh_cache is affected:

              * A couple of function names now have the prefix
                "eeh_addr_cache".
              * pci_addr_cache_build() has been moved in the header file
                to reflect the function call sequence.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH configure bridge · 1823fbf1
      Gavin Shan committed
      To enable a particular PCI device that is included in a parent
      PE, the involved PCI bridges, if any, should be enabled
      explicitly. On the pSeries platform, there are dedicated RTAS calls
      to fulfil the purpose.

      The patch implements the function of configuring PCI bridges through
      the dedicated RTAS calls. Besides, the function has been abstracted
      by struct eeh_ops::configure_bridge so that the EEH core components
      could support multiple platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH error log retrieval · 8d633291
      Gavin Shan committed
      On the RTAS compliant pSeries platform, one dedicated RTAS call has
      been introduced to retrieve the EEH temporary or permanent error log.

      The patch implements the function of retrieving the EEH error log
      through the RTAS call. Besides, it has been abstracted by struct
      eeh_ops::get_log so that EEH core components could support multiple
      platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH reset PE · 2652481f
      Gavin Shan committed
      On the RTAS compliant pSeries platform, there is a dedicated RTAS call
      (ibm,set-slot-reset) to reset the specified PE. Furthermore, two
      types of resets are supported: hot and fundamental. Which type of
      reset is used depends on the requirements of the PCI devices
      included in the PE.

      The patch implements resetting a PE on the pSeries platform through
      the RTAS call. Besides, it has been abstracted through struct
      eeh_ops::reset so that EEH core components could support multiple
      platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH wait PE state · b0e5f742
      Gavin Shan committed
      On the pSeries platform, the PE state might be temporarily unavailable.
      In that case, the firmware will return the corresponding wait time.
      That means the kernel has to wait for that amount of time before it
      can get the PE state.

      The patch implements that. Besides, the function
      has been abstracted through struct eeh_ops::wait_state so that EEH core
      components could support multiple platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
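
The wait loop described in this commit might look roughly like the following user-space sketch. The firmware is simulated by `struct fake_fw`, and every name and constant here (`PE_STATE_UNAVAILABLE`, `PE_WAIT_TIMEOUT`, `wait_pe_state`) is hypothetical. A real implementation would sleep instead of accumulating `slept_ms`.

```c
#include <stddef.h>

#define PE_STATE_UNAVAILABLE (-1) /* firmware asks the caller to wait */
#define PE_WAIT_TIMEOUT      (-2) /* the wait budget was used up */

/* Simulated firmware: busy for a number of polls, then reports a state. */
struct fake_fw {
    int busy_polls;  /* remaining polls that report "unavailable" */
    int delay_ms;    /* wait time suggested while busy */
    int final_state; /* state reported once available */
    int slept_ms;    /* total time the caller "slept" */
};

static int fake_get_state(struct fake_fw *fw, int *delay_ms)
{
    if (fw->busy_polls > 0) {
        fw->busy_polls--;
        *delay_ms = fw->delay_ms;
        return PE_STATE_UNAVAILABLE;
    }
    *delay_ms = 0;
    return fw->final_state;
}

/* Polls until a state is available or max_wait_ms has been spent. */
static int wait_pe_state(struct fake_fw *fw, int max_wait_ms)
{
    for (;;) {
        int delay_ms = 0;
        int state = fake_get_state(fw, &delay_ms);

        if (state != PE_STATE_UNAVAILABLE)
            return state;
        if (max_wait_ms <= 0)
            return PE_WAIT_TIMEOUT;

        /* Honour the suggested delay, but never exceed the budget. */
        if (delay_ms < 1)
            delay_ms = 1;
        if (delay_ms > max_wait_ms)
            delay_ms = max_wait_ms;
        max_wait_ms -= delay_ms;
        fw->slept_ms += delay_ms; /* a real implementation would sleep here */
    }
}
```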
    • powerpc/eeh: pseries platform PE state retrieval · eb594a47
      Gavin Shan committed
      On the pSeries platform, there are 2 dedicated RTAS calls to
      retrieve the corresponding PE's state: ibm,read-slot-reset-state and
      ibm,read-slot-reset-state2.

      The patch implements the retrieval of a PE's state according to the
      given PE address. Besides, the implementation has been abstracted by
      struct eeh_ops::get_state so that EEH core components could support
      multiple platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH PE address retrieval · c8c29b38
      Gavin Shan committed
      There are 2 types of addresses used for EEH operations. The first
      one is the BDF (Bus/Device/Function) address, which is retrieved
      from the reg property of the corresponding FDT node. The other one
      is the PE address, which is queried from firmware through an RTAS
      call on the pSeries platform. When issuing an EEH operation, the PE
      address takes precedence over the BDF address.

      The patch implements retrieving the PE address according to the given
      BDF address on the pSeries platform. Also, the struct eeh_early_enable_info
      has been removed since the information can be figured out from
      dn->pdn->phb->buid directly, which simplifies the code.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
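
The precedence rule between the two address types can be sketched as follows. The bit layout in `bdf_to_config_addr`, the convention that 0 means "no PE address", and all names are assumptions made for this illustration only.

```c
#include <stdint.h>

/* Hypothetical pair of addresses kept for one device. */
struct eeh_addr_sketch {
    uint32_t config_addr;    /* BDF-based address */
    uint32_t pe_config_addr; /* PE address from firmware, 0 if unknown (assumed) */
};

/* Packs bus/device/function into one word (layout assumed for the sketch). */
static uint32_t bdf_to_config_addr(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return ((uint32_t)bus << 16) | ((uint32_t)(dev & 0x1f) << 11) |
           ((uint32_t)(fn & 0x07) << 8);
}

/* The PE address takes precedence; fall back to the BDF address otherwise. */
static uint32_t eeh_effective_addr(const struct eeh_addr_sketch *a)
{
    return a->pe_config_addr != 0 ? a->pe_config_addr : a->config_addr;
}
```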
    • powerpc/eeh: pseries platform EEH operations · 8fb8f709
      Gavin Shan committed
      There are 4 EEH operations that are covered by the dedicated RTAS
      call <ibm,set-eeh-option>: enable EEH, disable EEH, enable MMIO and
      enable DMA. At an early stage of system boot, the kernel tries to
      enable EEH on the device nodes related to PCI devices. MMIO and DMA
      for a particular PE should be enabled when recovering from EEH errors
      so that the PE can function properly again.

      The patch implements this and abstracts it through struct
      eeh_ops::set_eeh. That will help EEH support multiple
      platforms in future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: pseries platform EEH initialization · e2af155c
      Gavin Shan committed
      The platform specific EEH operations have been abstracted by
      struct eeh_ops. The individual platforms, including pSeries, need
      to do the necessary initialization before the platform dependent EEH
      operations work properly.

      The patch addresses that and does the necessary platform initialization
      for the pSeries platform. More specifically, it figures out the tokens
      of the EEH related RTAS calls.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Platform dependent EEH operations · aa1e6374
      Gavin Shan committed
      EEH has been implemented on the RTAS-compliant pSeries platform.
      That is to say, the EEH operations are eventually implemented through
      RTAS calls. That limits how far EEH can be extended.
      In order to support EEH on multiple platforms like pseries and powernv
      simultaneously, we have to split the platform dependent EEH operations
      out of the current implementation.

      The patch addresses supporting EEH on multiple platforms. The pseries
      platform dependent EEH operations will be abstracted by struct eeh_ops.
      EEH core components will be built on the registered EEH operations.
      With this mechanism, all an individual platform needs to do is implement
      its platform dependent EEH operations.

      For now, only the pseries platform is covered by the mechanism; other
      platforms such as powernv still have to be considered.
      Besides, this patch only adds the framework, and it has to be
      implemented for the pseries platform later.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
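
The registration mechanism described here is essentially a table of function pointers that the core calls through. The sketch below is a user-space approximation: the struct members, `eeh_ops_register_sketch`, the error codes and the fake platform are all hypothetical and much smaller than the kernel's struct eeh_ops.

```c
#include <stddef.h>

/* Hypothetical, reduced operations table supplied by a platform. */
struct eeh_ops_sketch {
    const char *name;
    int (*init)(void);
    int (*get_state)(int pe_addr);
    int (*reset)(int pe_addr, int option);
};

/* The single set of operations the core dispatches through. */
static const struct eeh_ops_sketch *registered_ops;

/* Registers platform operations; only one platform may register. */
static int eeh_ops_register_sketch(const struct eeh_ops_sketch *ops)
{
    if (ops == NULL || ops->name == NULL)
        return -1; /* invalid table */
    if (registered_ops != NULL)
        return -2; /* already registered */
    registered_ops = ops;
    return 0;
}

/* Core code: platform independent, calls through the registered table. */
static int eeh_core_get_state(int pe_addr)
{
    if (registered_ops == NULL || registered_ops->get_state == NULL)
        return -1;
    return registered_ops->get_state(pe_addr);
}

/* A fake platform used only to exercise the sketch. */
static int fake_init(void) { return 0; }
static int fake_get_state(int pe_addr) { return pe_addr + 1; }

static const struct eeh_ops_sketch fake_platform_ops = {
    .name = "fake",
    .init = fake_init,
    .get_state = fake_get_state,
    .reset = NULL, /* optional operation left unimplemented */
};
```

The core never names a platform directly, so adding another platform means supplying another table rather than editing the core.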
    • powerpc/eeh: Cleanup function names in the EEH core · cce4b2d2
      Gavin Shan committed
      EEH has been implemented on the pSeries platform. The original
      code looks a little bit nasty. The patch cleans up the
      current EEH implementation so that it looks cleaner.

              * Try adding the prefix "eeh" to functions.
              * Some function names have been adjusted so that they are
                shorter and more meaningful.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • powerpc/eeh: Cleanup comments in the EEH core · cb3bc9d0
      Gavin Shan committed
      EEH has been implemented on the pSeries platform. The original
      code looks a little bit nasty. The patch cleans up the
      current EEH implementation so that it looks cleaner.

              * Duplicated comments have been removed from the corresponding
                header files.
              * Comments have been reorganized so that they look cleaner.
              * The leading comments of functions have been adjusted
                slightly so that the result of "make pdfdocs" is more
                uniform.
              * Function definitions and calls use the unified format "xxx()".
                That means the format "xxx ()" has been replaced by "xxx()".
              * There are multiple functions implemented for resetting a PE.
                Those functions have been moved so that they
                are adjacent to each other to reflect their relationship.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  6. 14 Feb, 2012 1 commit
  7. 01 Nov, 2011 2 commits
    • powerpc: Fix up implicit sched.h users · 62fe91bb
      Paul Gortmaker committed
      They are getting it through device.h --> module.h path, but we want
      to clean that up.  This is a sample of what will happen if we don't:
      
        pseries/iommu.c: In function 'tce_build_pSeriesLP':
        pseries/iommu.c:136: error: implicit declaration of function 'show_stack'
      
        pseries/eeh.c: In function 'eeh_token_to_phys':
        pseries/eeh.c:359: error: 'init_mm' undeclared (first use in this function)
      
        pseries/eeh_event.c: In function 'eeh_event_handler':
        pseries/eeh_event.c:63: error: implicit declaration of function 'daemonize'
        pseries/eeh_event.c:64: error: implicit declaration of function 'set_current_state'
        pseries/eeh_event.c:64: error: 'TASK_INTERRUPTIBLE' undeclared (first use in this function)
        pseries/eeh_event.c:64: error: (Each undeclared identifier is reported only once
        pseries/eeh_event.c:64: error: for each function it appears in.)
        pseries/eeh_event.c: In function 'eeh_thread_launcher':
        pseries/eeh_event.c:109: error: 'CLONE_KERNEL' undeclared (first use in this function)
      
        hotplug-cpu.c: In function 'pseries_mach_cpu_die':
        hotplug-cpu.c:115: error: implicit declaration of function 'idle_task_exit'
      
        kernel/swsusp_64.c: In function 'do_after_copyback':
        kernel/swsusp_64.c:17: error: implicit declaration of function 'touch_softlockup_watchdog'
      
        cell/spufs/context.c: In function 'alloc_spu_context':
        cell/spufs/context.c:60: error: implicit declaration of function 'get_task_mm'
        cell/spufs/context.c:60: warning: assignment makes pointer from integer without a cast
        cell/spufs/context.c: In function 'spu_forget':
        cell/spufs/context.c:127: error: implicit declaration of function 'mmput'
      
        pasemi/dma_lib.c: In function 'pasemi_dma_stop_chan':
        pasemi/dma_lib.c:332: error: implicit declaration of function 'cond_resched'
      
        sysdev/fsl_lbc.c: In function 'fsl_lbc_ctrl_irq':
        sysdev/fsl_lbc.c:247: error: 'TASK_NORMAL' undeclared (first use in this function)
      
      Add in sched.h so these get the definitions they are looking for.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    • powerpc: add export.h to files making use of EXPORT_SYMBOL · 66b15db6
      Paul Gortmaker committed
      With module.h being implicitly everywhere via device.h, the absence
      of an explicit include for EXPORT_SYMBOL went unnoticed.
      Since we are heading to fix things up and clean module.h from the
      device.h file, we need to explicitly include export.h in these files now.
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
  8. 20 Sep, 2011 1 commit
  9. 27 Jul, 2011 1 commit
  10. 04 May, 2011 2 commits
  11. 27 Apr, 2011 1 commit
  12. 31 Mar, 2011 1 commit
  13. 02 Mar, 2011 1 commit
  14. 18 Nov, 2010 1 commit
  15. 21 May, 2010 1 commit
    • powerpc/eeh: Fix oops when probing in early boot · ce47c1c4
      Anton Blanchard committed
      If we take an EEH error early enough, we oops:
      
      Call Trace:
      [c000000010483770] [c000000000013ee4] .show_stack+0xd8/0x218 (unreliable)
      [c000000010483850] [c000000000658940] .dump_stack+0x28/0x3c
      [c0000000104838d0] [c000000000057a68] .eeh_dn_check_failure+0x2b8/0x304
      [c000000010483990] [c0000000000259c8] .rtas_read_config+0x120/0x168
      [c000000010483a40] [c000000000025af4] .rtas_pci_read_config+0xe4/0x124
      [c000000010483af0] [c00000000037af18] .pci_bus_read_config_word+0xac/0x104
      [c000000010483bc0] [c0000000008fec98] .pcibios_allocate_resources+0x7c/0x220
      [c000000010483c90] [c0000000008feed8] .pcibios_resource_survey+0x9c/0x418
      [c000000010483d80] [c0000000008fea10] .pcibios_init+0xbc/0xf4
      [c000000010483e20] [c000000000009844] .do_one_initcall+0x98/0x1d8
      [c000000010483ed0] [c0000000008f0560] .kernel_init+0x228/0x2e8
      [c000000010483f90] [c000000000031a08] .kernel_thread+0x54/0x70
      EEH: Detected PCI bus error on device <null>
      EEH: This PCI device has failed 1 times in the last hour:
      EEH: location=U78A5.001.WIH8464-P1 driver= pci addr=0001:00:01.0
      EEH: of node=/pci@800000020000209/usb@1
      EEH: PCI device/vendor: 00351033
      EEH: PCI cmd/status register: 12100146
      
      Unable to handle kernel paging request for data at address 0x00000468
      Oops: Kernel access of bad area, sig: 11 [#1]
      ....
      NIP [c000000000057610] .rtas_set_slot_reset+0x38/0x10c
      LR [c000000000058724] .eeh_reset_device+0x5c/0x124
      Call Trace:
      [c00000000bc6bd00] [c00000000005a0e0] .pcibios_remove_pci_devices+0x7c/0xb0 (unreliable)
      [c00000000bc6bd90] [c000000000058724] .eeh_reset_device+0x5c/0x124
      [c00000000bc6be40] [c0000000000589c0] .handle_eeh_events+0x1d4/0x39c
      [c00000000bc6bf00] [c000000000059124] .eeh_event_handler+0xf0/0x188
      [c00000000bc6bf90] [c000000000031a08] .kernel_thread+0x54/0x70
      
      We called rtas_set_slot_reset while scanning the bus and before the pci_dn
      to pcidev mapping had been created. Since we only need the pcidev to work
      out the type of reset, and that only gets set after the module for the
      device loads, let's just do a hot reset if the pcidev is NULL.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Linas Vepstas <linasvepstas@gmail.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
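
The fallback described above amounts to a NULL check before consulting the device. A minimal sketch, assuming a hypothetical `needs_fundamental_reset` flag and hypothetical type names rather than the kernel's real struct pci_dev:

```c
#include <stddef.h>

enum reset_type_sketch {
    RESET_HOT_SKETCH,
    RESET_FUNDAMENTAL_SKETCH,
};

/* Hypothetical, reduced PCI device: only the flag the choice depends on. */
struct pci_dev_sketch {
    int needs_fundamental_reset;
};

/*
 * Picks the reset type. When the device is not known yet (NULL), which can
 * happen during early bus scanning, fall back to a hot reset.
 */
static enum reset_type_sketch pick_reset_type(const struct pci_dev_sketch *dev)
{
    if (dev == NULL)
        return RESET_HOT_SKETCH;
    return dev->needs_fundamental_reset ? RESET_FUNDAMENTAL_SKETCH
                                        : RESET_HOT_SKETCH;
}
```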
  16. 19 Feb, 2010 1 commit
  17. 17 Feb, 2010 1 commit
    • powerpc/eeh: Fix a bug when pci structure is null · 8d3d50bf
      Breno Leitao committed
      During an EEH recovery, the pci_dev structure can be null, mainly if an
      EEH event is detected during a PCI config operation. In this case, the
      pci_dev will not be known (and will be null), and the kernel will crash
      with the following message:
      
      Unable to handle kernel paging request for data at address 0x000000a0
      Faulting instruction address: 0xc00000000006b8b4
      Oops: Kernel access of bad area, sig: 11 [#1]
      
      NIP [c00000000006b8b4] .eeh_event_handler+0x10c/0x1a0
      LR [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
      Call Trace:
      [c0000003a80dff00] [c00000000006b8a8] .eeh_event_handler+0x100/0x1a0
      [c0000003a80dff90] [c000000000031f1c] .kernel_thread+0x54/0x70
      
      The bug occurs because pci_name() tries to access a null pointer.
      This patch guarantees that pci_name() is not called on null pointers.
      Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
      Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
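
A guard of this kind can be sketched as a small wrapper that never dereferences a null device. `safe_pci_name` and the struct are hypothetical, and the "<null>" placeholder is borrowed from the log line shown in another commit on this page, not necessarily from this patch.

```c
#include <stddef.h>

/* Hypothetical, reduced PCI device that only carries its name. */
struct pci_dev_sketch {
    const char *name;
};

/* Returns the device name, or a placeholder when the device is unknown. */
static const char *safe_pci_name(const struct pci_dev_sketch *dev)
{
    return dev != NULL ? dev->name : "<null>";
}
```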
  18. 10 Sep, 2009 1 commit
  19. 06 Nov, 2008 1 commit