1. 16 3月, 2012 13 次提交
  2. 13 3月, 2012 1 次提交
  3. 09 3月, 2012 26 次提交
    • B
      powerpc: Rework lazy-interrupt handling · 7230c564
      Benjamin Herrenschmidt 提交于
      The current implementation of lazy interrupts handling has some
      issues that this tries to address.
      
      We don't do the various workarounds we need to do when re-enabling
      interrupts in some cases such as when returning from an interrupt
      and thus we may still lose or get delayed decrementer or doorbell
      interrupts.
      
      The current scheme also makes it much harder to handle the external
      "edge" interrupts provided by some BookE processors when using the
      EPR facility (External Proxy) and the Freescale Hypervisor.
      
      Additionally, we tend to keep interrupts hard disabled in a number
      of cases, such as decrementer interrupts, external interrupts, or
      when a masked decrementer interrupt is pending. This is sub-optimal.
      
      This is an attempt at fixing it all in one go by reworking the way
      we do the lazy interrupt disabling from the ground up.
      
      The base idea is to replace the "hard_enabled" field with a
      "irq_happened" field in which we store a bit mask of what interrupt
      occurred while soft-disabled.
      
      When re-enabling, either via arch_local_irq_restore() or when returning
      from an interrupt, we can now decide what to do by testing bits in that
      field.
      
      We then implement replaying of the missed interrupts either by
      re-using the existing exception frame (in exception exit case) or via
      the creation of a new one from an assembly trampoline (in the
      arch_local_irq_enable case).
      
      This removes the need to play with the decrementer to try to create
      fake interrupts, among others.
      
      In addition, this adds a few refinements:
      
       - We no longer  hard disable decrementer interrupts that occur
      while soft-disabled. We now simply bump the decrementer back to max
      (on BookS) or leave it stopped (on BookE) and continue with hard interrupts
      enabled, which means that we'll potentially get better sample quality from
      performance monitor interrupts.
      
       - Timer, decrementer and doorbell interrupts now hard-enable
      shortly after removing the source of the interrupt, which means
      they no longer run entirely hard disabled. Again, this will improve
      perf sample quality.
      
       - On Book3E 64-bit, we now make the performance monitor interrupt
      act as an NMI like Book3S (the necessary C code for that to work
      appear to already be present in the FSL perf code, notably calling
      nmi_enter instead of irq_enter). (This also fixes a bug where BookE
      perfmon interrupts could clobber r14 ... oops)
      
       - We could make "masked" decrementer interrupts act as NMIs when doing
      timer-based perf sampling to improve the sample quality.
      
      Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2:
      
      - Add hard-enable to decrementer, timer and doorbells
      - Fix CR clobber in masked irq handling on BookE
      - Make embedded perf interrupt act as an NMI
      - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
        to retrigger an interrupt without preventing hard-enable
      
      v3:
      
       - Fix or vs. ori bug on Book3E
       - Fix enabling of interrupts for some exceptions on Book3E
      
      v4:
      
       - Fix resend of doorbells on return from interrupt on Book3E
      
      v5:
      
       - Rebased on top of my latest series, which involves some significant
      rework of some aspects of the patch.
      
      v6:
       - 32-bit compile fix
       - more compile fixes with various .config combos
       - factor out the asm code to soft-disable interrupts
       - remove the C wrapper around preempt_schedule_irq
      
      v7:
       - Fix a bug with hard irq state tracking on native power7
      7230c564
    • G
      powerpc/eeh: pseries platform config space access in EEH · 3780444c
      Gavin Shan 提交于
      With the original EEH implementation, the access to config space of
      the corresponding PCI device is done by RTAS sensitive function. That
      depends on pci_dn heavily. That would limit EEH extension to other
      platforms like powernv because other platforms might have different
      ways to access PCI config space.
      
      The patch splits those functions used to access PCI config space
      and implement them in platform related EEH component. It would be
      helpful to support EEH on multiple platforms simutaneously in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3780444c
    • G
      powerpc/eeh: Introduce struct eeh_stats for EEH · e575f8db
      Gavin Shan 提交于
      With the original EEH implementation, the EEH global statistics
      are maintained by individual global variables. That makes the
      code a little hard to maintain.
      
      The patch introduces extra struct eeh_stats for the EEH global
      statistics so that it can be maintained in collective fashion.
      
      It's the rework on the corresponding v5 patch. According to
      the comments from David Laight, the EEH global statistics have
      been changed for a litte bit so that they have fixed-type of
      "u64". Also, the format used to print them has been changed to
      "%llu" based on David's suggestion. Also, the output format of
      EEH global statistics should be kept as intacted according to
      Michael's suggestion that there might be tools parsing them.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e575f8db
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH on pSeries · 54793d0e
      Gavin Shan 提交于
      The pci_dn has been replaced with eeh_dev. In order to comply with
      the rule, the EEH platform implementation on pSeries should also
      be adjusted for a little bit so that it will depend on eeh_dev instead
      of pci_dn.
      
      The patch replaces pci_dn with eeh_dev. The corresponding information
      will be retrieved from eeh_dev instead of pci_dn.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      54793d0e
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH aux components · 40a7cd92
      Gavin Shan 提交于
      The original EEH implementation is heavily depending on struct pci_dn.
      We have to put EEH related information to pci_dn. Actually, we could
      split struct pci_dn so that the EEH sensitive information to form an
      individual struct, then EEH looks more independent.
      
      The patch replaces pci_dn with eeh_dev for EEH aux components like
      event and driver. Also, the eeh_event struct has been adjusted for
      a little bit since eeh_dev has linked the associated FDT (Flat Device
      Tree) node and PCI device. It's not necessary for eeh_event struct to
      trace FDT node and PCI device. We can just simply to trace eeh_dev in
      eeh_event.
      
      The patch also renames function pcid_name() to eeh_pcid_name(), which
      should be missed in the previous patch where the EEH aux components
      have been cleaned up.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      40a7cd92
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH core · f631acd3
      Gavin Shan 提交于
      The original EEH implementation is heavily depending on struct pci_dn.
      We have to put EEH related information to pci_dn. Actually, we could
      split struct pci_dn so that the EEH sensitive information to form an
      individual struct, then EEH looks more independent.
      
      The patch replaces pci_dn with eeh_dev for EEH core.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f631acd3
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH address cache · d50a7d4c
      Gavin Shan 提交于
      With original EEH implementation, struct pci_dn is used while building
      PCI I/O address cache, which helps on searching the corresponding
      PCI device according to the given physical I/O address. Besides, pci_dn
      is associated with the corresponding PCI device while building its
      I/O cache.
      
      The patch replaces struct pci_dn with struct eeh_dev so that EEH address
      cache won't depend on struct pci_dn. That will help EEH to become an
      independent module in future. Besides, the binding of eeh_dev and PCI
      device is done while building PCI device I/O cache.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d50a7d4c
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH sysfs · 44da8edc
      Gavin Shan 提交于
      With original EEH implementation, all EEH related statistics have
      been put into struct pci_dn. We've introduced struct eeh_dev to
      replace struct pci_dn in EEH core components, including EEH sysfs
      component.
      
      The patch shows EEH statistics from struct eeh_dev instead of struct
      pci_dn in EEH sysfs component. Besides, it also fixed the EEH device
      retrieval from PCI device, which was introduced by the previous patch
      in the series of patch.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      44da8edc
    • G
      powerpc/eeh: Introduce EEH device · eb740b5f
      Gavin Shan 提交于
      Original EEH implementation depends on struct pci_dn heavily. However,
      EEH shouldn't depend on that actually because EEH needn't share much
      information with other PCI components. That's to say, EEH should have
      worked independently.
      
      The patch introduces struct eeh_dev so that EEH core components needn't
      be working based on struct pci_dn in future. Also, struct pci_dn, struct
      eeh_dev instances are created in dynamic fasion and the binding with EEH
      device, OF node, PCI device is implemented as well.
      
      The EEH devices are created after PHBs are detected and initialized, but
      PCI emunation hasn't started yet. Apart from that, PHB might be created
      dynamically through DLPAR component and the EEH devices should be creatd
      as well. Another case might be OF node is created dynamically by DR
      (Dynamic Reconfiguration), which has been defined by PAPR. For those OF
      nodes created by DR, EEH devices should be also created accordingly. The
      binding between EEH device and OF node is done while the EEH device is
      initially created.
      
      The binding between EEH device and PCI device should be done after PCI
      emunation is done. Besides, PCI hotplug also needs the binding so that
      the EEH devices could be traced from the newly coming PCI buses or PCI
      devices.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      eb740b5f
    • G
      powerpc/eeh: Cleanup function names in EEH aux components · def9d83d
      Gavin Shan 提交于
      The patch does some cleanup on the function names of EEH
      aux components. Currently, only couple of function names from
      eeh_cache have been adjusted so that:
      
              * The function name has prefix "eeh_addr_cache".
              * Move around pci_addr_cache_build() in the header file
                to reflect function call sequence.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      def9d83d
    • G
      powerpc/pseries: Cleanup comments in EEH aux components · 29f8bf1b
      Gavin Shan 提交于
      There're several EEH aux components and the patch does some cleanup
      for them so that they look more clean.
      
              * Duplicated comments have been removed from the header file.
              * Comments have been reorganized so that it looks more clean.
              * The leading comments of functions are adjusted for a little
                bit so that the result of "make pdfdocs" would be more
                unified.
              * Function calls "xxx ()" has been replaced by "xxx()".
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      29f8bf1b
    • G
      powerpc/eeh: pseries platform EEH configure bridge · 1823fbf1
      Gavin Shan 提交于
      In order to enable particular PCI device, which has been included
      in the parent PE. The involved PCI bridges should be enabled explicitly
      if there has. On pSeries platform, there're dedicated RTAS calls
      to fulfil the purpose.
      
      The patch implements the function of configuring PCI bridges through
      the dedicated RTAS calls. Besides, the function has been abstracted
      by struct eeh_ops::configure_bridge so that the EEH core components
      could support multiple platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1823fbf1
    • G
      powerpc/eeh: pseries platform EEH error log retrieval · 8d633291
      Gavin Shan 提交于
      On RTAS compliant pSeries platform, one dedicated RTAS call has
      been introduced to retrieve EEH temporary or permanent error log.
      
      The patch implements the function of retriving EEH error log through
      RTAS call. Besides, it has been abstracted by struct eeh_ops::get_log
      so that EEH core components could support multiple platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8d633291
    • G
      powerpc/eeh: pseries platform EEH reset PE · 2652481f
      Gavin Shan 提交于
      On RTAS compliant pSeries platform, there is a dedicated RTAS call
      (ibm,set-slot-reset) to reset the specified PE. Furthermore, two
      types of resets are supported: hot and fundamental. the type of
      reset is to be used actually depends on the included PCI device's
      requirements.
      
      The patch implements resetting PE on pSeries platform through RTAS
      call. Besides, it has been abstracted through struct eeh_ops::reset
      so that EEH core components could support multiple platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2652481f
    • G
      powerpc/eeh: pseries platform EEH wait PE state · b0e5f742
      Gavin Shan 提交于
      On pSeries platform, the PE state might be temporarily unavailable.
      In that case, the firmware will return the corresponding wait time.
      That means the kernel has to wait for appropriate time in order to
      get the PE state.
      
      The patch does the implementation for that. Besides, the function
      has been abstracted through struct eeh_ops::wait_state so that EEH core
      components could support multiple platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b0e5f742
    • G
      powerpc/eeh: pseries platform PE state retrieval · eb594a47
      Gavin Shan 提交于
      On pSeries platform, there're 2 dedicated RTAS calls introduced to
      retrieve the corresponding PE's state: ibm,read-slot-reset-state and
      ibm,read-slot-reset-state2.
      
      The patch implements the retrieval of PE's state according to the
      given PE address. Besides, the implementation has been abstracted by
      struct eeh_ops::get_state so that EEH core components could support
      multiple platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      eb594a47
    • G
      powerpc/eeh: pseries platform EEH PE address retrieval · c8c29b38
      Gavin Shan 提交于
      There're 2 types of addresses used for EEH operations. The first
      one would be BDF (Bus/Device/Function) address which is retrieved
      from the reg property of the corresponding FDT node. Another one
      is PE address that should be enquired from firmware through RTAS
      call on pSeries platform. When issuing EEH operation, the PE address
      has precedence over BDF address.
      
      The patch implements retrieving PE address according to the given
      BDF address on pSeries platform. Also, the struct eeh_early_enable_info
      has been removed since the information can be figured out from
      dn->pdn->phb->buid directly and that simplifies the code.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c8c29b38
    • G
      powerpc/eeh: pseries platform EEH operations · 8fb8f709
      Gavin Shan 提交于
      There're 4 EEH operations that are covered by the dedicated RTAS
      call <ibm,set-eeh-option>: enable or disable EEH, enable MMIO and
      enable DMA. At early stage of system boot, the EEH would be tried
      to enable on PCI device related device node. MMIO and DMA for
      particular PE should be enabled when doing recovery on EEH errors
      so that the PE could function properly again.
      
      The patch implements it and abstract that through struct
      eeh_ops::set_eeh. It would be help for EEH to support multiple
      platforms in future.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      8fb8f709
    • G
      powerpc/eeh: pseries platform EEH initialization · e2af155c
      Gavin Shan 提交于
      The platform specific EEH operations have been abstracted by
      struct eeh_ops. The individual platroms, including pSeries, needs
      doing necessary initialization before the platform dependent EEH
      operations work properly.
      
      The patch is addressing that and do necessary platform initialization
      for pSeries platform. More specificly, it will figure out the tokens
      of EEH related RTAS calls.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e2af155c
    • G
      powerpc/eeh: Platform dependent EEH operations · aa1e6374
      Gavin Shan 提交于
      EEH has been implemented on RTAS-compliant pSeries platform.
      That's to say, the EEH operations will be implemented through RTAS
      calls eventually. The situation limited feasible extension on EEH.
      In order to support EEH on multiple platforms like pseries and powernv
      simutaneously. We have to split the platform dependent EEH options
      up out of current implementation.
      
      The patch addresses supporting EEH on multiple platforms. The pseries
      platform dependent EEH operations will be abstracted by struct eeh_ops.
      EEH core components will be built based on the registered EEH operations.
      With the mechanism, what the individual platform needs to do is implement
      platform dependent EEH operations.
      
      For now, the pseries platform is covered under the mechanism. That means
      we have to think about other platforms to support EEH, like powernv.
      Besides, we only have framework for the mechanism and we have to implement
      it for pseries platform later.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      aa1e6374
    • G
      powerpc/eeh: Cleanup function names in the EEH core · cce4b2d2
      Gavin Shan 提交于
      The EEH has been implemented on pSeries platform. The original
      code looks a little bit nasty. The patch does cleanup on the
      current EEH implementation so that it looks more clean.
      
              * Try adding prefix "eeh" for functions.
              * Some function names have been adjusted so that they looks
                shorter and meaningful.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cce4b2d2
    • G
      powerpc/eeh: Cleanup comments in the EEH core · cb3bc9d0
      Gavin Shan 提交于
      The EEH has been implemented on pSeries platform. The original
      code looks a little bit nasty. The patch does cleanup on the
      current EEH implementation so that it looks more clean.
      
              * Duplicated comments have been removed from the corresponding
                header files.
              * Comments have been reorganized so that it looks more clean.
              * The leading comments of functions are adjusted for a little
                bit so that the result of "make pdfdocs" would be more
                unified.
              * Function definitions and calls have unified format as "xxx()".
                That means the format "xxx ()" has been replaced by "xxx()".
              * There're multiple functions implemented for resetting PE. The
                position of those functions have been move around so that they
                are adjacent to each other to reflect their relationship.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      cb3bc9d0
    • B
      powerpc: Replace mfmsr instructions with load from PACA kernel_msr field · d9ada91a
      Benjamin Herrenschmidt 提交于
      On 64-bit, the mfmsr instruction can be quite slow, slower
      than loading a field from the cache-hot PACA, which happens
      to already contain the value we want in most cases.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d9ada91a
    • B
      powerpc: Fix 64-bit BookE FP unavailable exceptions · 9424fabf
      Benjamin Herrenschmidt 提交于
      We were using CR0.EQ after EXCEPTION_COMMON, hoping it still
      contained whether we came from userspace or kernel space.
      
      However, under some circumstances, EXCEPTION_COMMON will
      call C code and clobber non-volatile registers, so we really
      need to re-load the previous MSR from the stackframe and
      re-test.
      
      While there, invert the condition to make the fast path more
      obvious and remove the BUG_OPCODE which was a debugging
      leftover and call .ret_from_except as we should.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9424fabf
    • B
      powerpc: Fix register clobbering when accumulating stolen time · 990118c8
      Benjamin Herrenschmidt 提交于
      When running under a hypervisor that supports stolen time accounting,
      we may call C code from the macro EXCEPTION_PROLOG_COMMON in the
      exception entry path, which clobbers CR0.
      
      However, the FPU and vector traps rely on CR0 indicating whether we
      are coming from userspace or kernel to decide what to do.
      
      So we need to restore that value after the C call
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      990118c8
    • B
      powerpc/xmon: Add display of soft & hard irq states · 7ac21cd4
      Benjamin Herrenschmidt 提交于
      Also use local_paca instead of get_paca() to avoid getting into
      the smp_processor_id() debugging code from the debugger
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      7ac21cd4