1. 17 Mar, 2012 2 commits
  2. 16 Mar, 2012 2 commits
  3. 09 Mar, 2012 24 commits
    • B
      powerpc: Rework lazy-interrupt handling · 7230c564
      Benjamin Herrenschmidt committed
      The current implementation of lazy interrupt handling has some
      issues that this patch tries to address.
      
      We don't do the various workarounds we need to do when re-enabling
      interrupts in some cases such as when returning from an interrupt
      and thus we may still lose or get delayed decrementer or doorbell
      interrupts.
      
      The current scheme also makes it much harder to handle the external
      "edge" interrupts provided by some BookE processors when using the
      EPR facility (External Proxy) and the Freescale Hypervisor.
      
      Additionally, we tend to keep interrupts hard disabled in a number
      of cases, such as decrementer interrupts, external interrupts, or
      when a masked decrementer interrupt is pending. This is sub-optimal.
      
      This is an attempt at fixing it all in one go by reworking the way
      we do the lazy interrupt disabling from the ground up.
      
      The base idea is to replace the "hard_enabled" field with a
      "irq_happened" field in which we store a bit mask of what interrupt
      occurred while soft-disabled.
      
      When re-enabling, either via arch_local_irq_restore() or when returning
      from an interrupt, we can now decide what to do by testing bits in that
      field.
      
      We then implement replaying of the missed interrupts either by
      re-using the existing exception frame (in exception exit case) or via
      the creation of a new one from an assembly trampoline (in the
      arch_local_irq_enable case).
      
      This removes the need to play with the decrementer to try to create
      fake interrupts, among others.
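The bookkeeping described above can be sketched in plain C. This is a minimal illustration, not the kernel code: the bit names, `struct cpu_state`, and the `replayed` counters are hypothetical, and the real implementation replays interrupts from assembly rather than by bumping counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit names; the real kernel constants may differ. */
enum {
    IRQ_HAPPENED_EE    = 1u << 0,  /* external interrupt */
    IRQ_HAPPENED_DEC   = 1u << 1,  /* decrementer */
    IRQ_HAPPENED_DBELL = 1u << 2,  /* doorbell */
};

struct cpu_state {
    bool soft_enabled;     /* lazy (software) interrupt state */
    uint8_t irq_happened;  /* sources seen while soft-disabled */
    unsigned replayed[3];  /* how often each source was replayed */
};

/* An interrupt arrives: record it if we are soft-disabled. */
static bool irq_arrives(struct cpu_state *cpu, uint8_t source_bit)
{
    if (!cpu->soft_enabled) {
        cpu->irq_happened |= source_bit;  /* remember for later replay */
        return false;                     /* masked: handler not run now */
    }
    return true;                          /* handled immediately */
}

/* Restore the soft state; on enable, replay each recorded source once. */
static void irq_restore(struct cpu_state *cpu, bool enable)
{
    cpu->soft_enabled = enable;
    if (!enable)
        return;
    for (unsigned bit = 0; bit < 3; bit++) {
        if (cpu->irq_happened & (1u << bit)) {
            cpu->irq_happened &= (uint8_t)~(1u << bit);
            cpu->replayed[bit]++;
        }
    }
    assert(cpu->irq_happened == 0);  /* nothing may stay pending */
}
```

Because the field is a bit mask, two decrementer interrupts that arrive while soft-disabled collapse into a single replay, which is the behaviour one would expect from a level-style "something happened" flag.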
      
      In addition, this adds a few refinements:
      
       - We no longer hard disable decrementer interrupts that occur
      while soft-disabled. We now simply bump the decrementer back to max
      (on BookS) or leave it stopped (on BookE) and continue with hard interrupts
      enabled, which means that we'll potentially get better sample quality from
      performance monitor interrupts.
      
       - Timer, decrementer and doorbell interrupts now hard-enable
      shortly after removing the source of the interrupt, which means
      they no longer run entirely hard disabled. Again, this will improve
      perf sample quality.
      
       - On Book3E 64-bit, we now make the performance monitor interrupt
      act as an NMI like Book3S (the necessary C code for that to work
      appears to already be present in the FSL perf code, notably calling
      nmi_enter instead of irq_enter). (This also fixes a bug where BookE
      perfmon interrupts could clobber r14 ... oops)
      
       - We could make "masked" decrementer interrupts act as NMIs when doing
      timer-based perf sampling to improve the sample quality.
      
      Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2:
      
      - Add hard-enable to decrementer, timer and doorbells
      - Fix CR clobber in masked irq handling on BookE
      - Make embedded perf interrupt act as an NMI
      - Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
        to retrigger an interrupt without preventing hard-enable
      
      v3:
      
       - Fix or vs. ori bug on Book3E
       - Fix enabling of interrupts for some exceptions on Book3E
      
      v4:
      
       - Fix resend of doorbells on return from interrupt on Book3E
      
      v5:
      
       - Rebased on top of my latest series, which involves some significant
      rework of some aspects of the patch.
      
      v6:
       - 32-bit compile fix
       - more compile fixes with various .config combos
       - factor out the asm code to soft-disable interrupts
       - remove the C wrapper around preempt_schedule_irq
      
      v7:
       - Fix a bug with hard irq state tracking on native power7
      7230c564
    • G
      powerpc/eeh: pseries platform config space access in EEH · 3780444c
      Gavin Shan committed
      In the original EEH implementation, the config space of the
      corresponding PCI device is accessed through RTAS-specific functions
      that depend heavily on pci_dn. That would limit extending EEH to other
      platforms such as powernv, because other platforms might access PCI
      config space in different ways.

      The patch splits out the functions used to access PCI config space
      and implements them in the platform-specific EEH component. This should
      help support EEH on multiple platforms simultaneously in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3780444c
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH aux components · 40a7cd92
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn,
      so EEH-related information has to be stored in pci_dn. We can instead
      split struct pci_dn so that the EEH-specific information forms a
      separate struct, which makes EEH more independent.

      The patch replaces pci_dn with eeh_dev for EEH aux components such as
      event and driver. The eeh_event struct has also been adjusted slightly:
      since eeh_dev already links the associated FDT (Flat Device Tree) node
      and PCI device, eeh_event no longer needs to track the FDT node and PCI
      device itself. It can simply track eeh_dev.

      The patch also renames the function pcid_name() to eeh_pcid_name(), which
      was missed in the previous patch that cleaned up the EEH aux
      components.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      40a7cd92
    • G
      powerpc/eeh: Replace pci_dn with eeh_dev for EEH core · f631acd3
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn,
      so EEH-related information has to be stored in pci_dn. We can instead
      split struct pci_dn so that the EEH-specific information forms a
      separate struct, which makes EEH more independent.

      The patch replaces pci_dn with eeh_dev for EEH core.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      f631acd3
    • G
      powerpc/eeh: Introduce EEH device · eb740b5f
      Gavin Shan committed
      The original EEH implementation depends heavily on struct pci_dn.
      However, EEH should not depend on it, because EEH does not need to
      share much information with other PCI components. In other words, EEH
      should be able to work independently.

      The patch introduces struct eeh_dev so that the EEH core components no
      longer need to be built on struct pci_dn in the future. In addition,
      struct pci_dn and struct eeh_dev instances are created dynamically, and
      the binding between EEH device, OF node, and PCI device is implemented
      as well.

      The EEH devices are created after the PHBs are detected and initialized,
      but before PCI enumeration has started. In addition, a PHB might be
      created dynamically through the DLPAR component, in which case the EEH
      devices should be created as well. Another case is an OF node created
      dynamically by DR (Dynamic Reconfiguration), as defined by PAPR. For OF
      nodes created by DR, EEH devices should also be created accordingly. The
      binding between EEH device and OF node is done when the EEH device is
      initially created.

      The binding between EEH device and PCI device should be done after PCI
      enumeration is complete. PCI hotplug also needs the binding so that
      the EEH devices can be traced from newly added PCI buses or PCI
      devices.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      eb740b5f
    • G
      powerpc/eeh: Cleanup function names in EEH aux components · def9d83d
      Gavin Shan committed
      The patch cleans up the function names of the EEH aux components.
      Currently, only a couple of function names from eeh_cache have been
      adjusted so that:

              * The function names have the prefix "eeh_addr_cache".
              * pci_addr_cache_build() is moved in the header file
                to reflect the function call sequence.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      def9d83d
    • G
      powerpc/pseries: Cleanup comments in EEH aux components · 29f8bf1b
      Gavin Shan committed
      There are several EEH aux components, and the patch cleans up their
      comments so that they look cleaner.

              * Duplicated comments have been removed from the header file.
              * Comments have been reorganized so that they look cleaner.
              * The leading comments of functions have been adjusted slightly
                so that the output of "make pdfdocs" is more uniform.
              * Function calls of the form "xxx ()" have been replaced by "xxx()".
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      29f8bf1b
    • G
      powerpc/eeh: pseries platform EEH configure bridge · 1823fbf1
      Gavin Shan committed
      To enable a particular PCI device that is included in the parent PE,
      any PCI bridges involved must be enabled explicitly. On the pSeries
      platform, dedicated RTAS calls serve this purpose.

      The patch implements the configuration of PCI bridges through
      the dedicated RTAS calls. The function is also abstracted
      by struct eeh_ops::configure_bridge so that the EEH core components
      can support multiple platforms in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1823fbf1
    • G
      powerpc/eeh: pseries platform EEH error log retrieval · 8d633291
      Gavin Shan committed
      On RTAS-compliant pSeries platforms, a dedicated RTAS call has
      been introduced to retrieve the EEH temporary or permanent error log.

      The patch implements retrieval of the EEH error log through that
      RTAS call. It is also abstracted by struct eeh_ops::get_log
      so that EEH core components can support multiple platforms in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8d633291
    • G
      powerpc/eeh: pseries platform EEH reset PE · 2652481f
      Gavin Shan committed
      On RTAS-compliant pSeries platforms, there is a dedicated RTAS call
      (ibm,set-slot-reset) to reset the specified PE. Two types of reset
      are supported: hot and fundamental. The type of reset to use
      depends on the requirements of the PCI devices included in the PE.

      The patch implements resetting a PE on the pSeries platform through the
      RTAS call. It is also abstracted through struct eeh_ops::reset
      so that EEH core components can support multiple platforms in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2652481f
    • G
      powerpc/eeh: pseries platform EEH wait PE state · b0e5f742
      Gavin Shan committed
      On the pSeries platform, the PE state might be temporarily unavailable.
      In that case, the firmware returns the corresponding wait time, and
      the kernel has to wait for that time before it can get the PE state.

      The patch implements that. The function is also abstracted through
      struct eeh_ops::wait_state so that EEH core components can support
      multiple platforms in the future.
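A wait loop of this kind might look roughly as follows. This is a sketch under stated assumptions: `PE_STATE_UNAVAILABLE`, the callback types, and the fake firmware are all hypothetical and only stand in for the RTAS query and the kernel sleep primitive.

```c
#include <assert.h>

#define PE_STATE_UNAVAILABLE (-1)  /* hypothetical "try again later" code */

/* Hypothetical firmware query: returns the PE state, or
 * PE_STATE_UNAVAILABLE plus a suggested delay in *wait_ms. */
typedef int (*get_state_fn)(void *ctx, int *wait_ms);
typedef void (*sleep_fn)(void *ctx, int ms);

/* Poll until the state is available or the time budget runs out. */
static int wait_pe_state(get_state_fn get_state, sleep_fn sleep_ms,
                         void *ctx, int max_wait_ms)
{
    assert(get_state && sleep_ms);
    for (;;) {
        int wait_ms = 0;
        int state = get_state(ctx, &wait_ms);

        if (state != PE_STATE_UNAVAILABLE)
            return state;
        if (max_wait_ms <= 0)
            return PE_STATE_UNAVAILABLE;  /* budget exhausted */
        if (wait_ms <= 0 || wait_ms > max_wait_ms)
            wait_ms = max_wait_ms;        /* clamp a missing or oversized hint */
        max_wait_ms -= wait_ms;
        sleep_ms(ctx, wait_ms);
    }
}

/* Fake firmware for demonstration: unavailable for the first two queries. */
struct fake_fw {
    int calls;
    int slept_ms;
};

static int fake_get_state(void *ctx, int *wait_ms)
{
    struct fake_fw *fw = ctx;

    if (++fw->calls <= 2) {
        *wait_ms = 100;
        return PE_STATE_UNAVAILABLE;
    }
    return 5;  /* arbitrary "state is now known" value */
}

static void fake_sleep(void *ctx, int ms)
{
    ((struct fake_fw *)ctx)->slept_ms += ms;  /* record instead of sleeping */
}
```

Passing the sleep function as a callback keeps the loop testable without real delays; kernel code would call its own sleep primitive directly.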
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      b0e5f742
    • G
      powerpc/eeh: pseries platform PE state retrieval · eb594a47
      Gavin Shan committed
      On the pSeries platform, two dedicated RTAS calls have been introduced to
      retrieve the state of the corresponding PE: ibm,read-slot-reset-state and
      ibm,read-slot-reset-state2.

      The patch implements retrieval of the PE's state for the
      given PE address. The implementation is also abstracted by
      struct eeh_ops::get_state so that EEH core components can support
      multiple platforms in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      eb594a47
    • G
      powerpc/eeh: pseries platform EEH operations · 8fb8f709
      Gavin Shan committed
      Four EEH operations are covered by the dedicated RTAS
      call <ibm,set-eeh-option>: enable EEH, disable EEH, enable MMIO, and
      enable DMA. Early in system boot, the kernel tries to enable EEH on
      the device nodes related to PCI devices. MMIO and DMA for a
      particular PE should be enabled when recovering from EEH errors
      so that the PE can function properly again.

      The patch implements this and abstracts it through struct
      eeh_ops::set_eeh. This should help EEH support multiple
      platforms in the future.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      8fb8f709
    • G
      powerpc/eeh: Platform dependent EEH operations · aa1e6374
      Gavin Shan committed
      EEH has been implemented on the RTAS-compliant pSeries platform,
      which means that EEH operations are ultimately carried out through RTAS
      calls. That limits how far EEH can be extended.
      To support EEH on multiple platforms such as pseries and powernv
      simultaneously, the platform-dependent EEH operations have to be split
      out of the current implementation.

      The patch addresses supporting EEH on multiple platforms. The pseries
      platform-dependent EEH operations will be abstracted by struct eeh_ops.
      EEH core components will be built on the registered EEH operations.
      With this mechanism, all an individual platform needs to do is implement
      its platform-dependent EEH operations.

      For now, only the pseries platform is covered by the mechanism, so
      support for EEH on other platforms such as powernv still has to be
      considered. In addition, this patch only provides the framework; the
      implementation for the pseries platform follows later.
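The registration mechanism can be pictured as a table of function pointers that a platform fills in and the core calls through. The sketch below is illustrative only: `struct pe_ops`, its two callbacks, `pe_ops_register()`, and the demo backend are hypothetical and much smaller than the real struct eeh_ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down operations table; field names are illustrative. */
struct pe_ops {
    const char *name;
    int (*get_state)(int pe_addr);
    int (*reset)(int pe_addr);
};

static const struct pe_ops *registered_ops;  /* one platform at a time */

/* Register a platform backend; reject incomplete or duplicate tables. */
static int pe_ops_register(const struct pe_ops *ops)
{
    if (!ops || !ops->name || !ops->get_state || !ops->reset)
        return -1;
    if (registered_ops)
        return -1;
    registered_ops = ops;
    return 0;
}

/* Core code calls through the table and never names a platform. */
static int core_recover(int pe_addr)
{
    assert(registered_ops);
    if (registered_ops->get_state(pe_addr) == 0)
        return 0;  /* already healthy */
    return registered_ops->reset(pe_addr);
}

/* Example backend standing in for a real platform. */
static int demo_state[4] = { 0, 1, 0, 1 };  /* 0 = healthy, 1 = frozen */

static int demo_get_state(int pe) { return demo_state[pe]; }
static int demo_reset(int pe) { demo_state[pe] = 0; return 0; }

static const struct pe_ops demo_ops = {
    .name = "demo",
    .get_state = demo_get_state,
    .reset = demo_reset,
};
```

Rejecting tables with missing callbacks at registration time means the core can call every operation without a NULL check at each call site.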
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      aa1e6374
    • G
      powerpc/eeh: Cleanup function names in the EEH core · cce4b2d2
      Gavin Shan committed
      EEH has been implemented on the pSeries platform, but the original
      code is somewhat messy. The patch cleans up the
      current EEH implementation so that it looks cleaner.

              * The prefix "eeh" is added to function names where possible.
              * Some function names have been adjusted so that they are
                shorter and more meaningful.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      cce4b2d2
    • G
      powerpc/eeh: Cleanup comments in the EEH core · cb3bc9d0
      Gavin Shan committed
      EEH has been implemented on the pSeries platform, but the original
      code is somewhat messy. The patch cleans up the
      current EEH implementation so that it looks cleaner.

              * Duplicated comments have been removed from the corresponding
                header files.
              * Comments have been reorganized so that they look cleaner.
              * The leading comments of functions have been adjusted slightly
                so that the output of "make pdfdocs" is more uniform.
              * Function definitions and calls have a unified format, "xxx()".
                That means the format "xxx ()" has been replaced by "xxx()".
              * There are multiple functions for resetting a PE. Those
                functions have been moved so that they are adjacent to
                each other, which reflects their relationship.
      Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      cb3bc9d0
    • B
      powerpc: Replace mfmsr instructions with load from PACA kernel_msr field · d9ada91a
      Benjamin Herrenschmidt committed
      On 64-bit, the mfmsr instruction can be quite slow, slower
      than loading a field from the cache-hot PACA, which happens
      to already contain the value we want in most cases.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      d9ada91a
    • B
      powerpc: Fix register clobbering when accumulating stolen time · 990118c8
      Benjamin Herrenschmidt committed
      When running under a hypervisor that supports stolen time accounting,
      we may call C code from the macro EXCEPTION_PROLOG_COMMON in the
      exception entry path, which clobbers CR0.

      However, the FPU and vector traps rely on CR0 indicating whether we
      are coming from userspace or kernel to decide what to do.

      So we need to restore that value after the C call.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      990118c8
    • B
      powerpc: Call do_page_fault() with interrupts off · a546498f
      Benjamin Herrenschmidt committed
      We currently turn interrupts back to their previous state before
      calling do_page_fault(). This can be annoying when debugging as
      a bad fault will potentially have lost some processor state before
      getting into the debugger.

      We also end up calling some generic code, such as
      notify_page_fault(), with interrupts enabled, which could
      be unexpected.

      This changes our code to behave more like other architectures,
      and makes the assembly entry code call into do_page_fault() with
      interrupts disabled. They are conditionally re-enabled from
      within do_page_fault() in the same spot x86 does it.

      While there, add the might_sleep() test in the case of a successful
      trylock of the mmap semaphore, again like x86.

      Also fix a bug in the existing assembly where r12 (_MSR) could get
      clobbered by C calls (the DTL accounting in the exception common
      macro and DISABLE_INTS) in some cases.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ---
      
      v2. Add the r12 clobber fix
      a546498f
    • B
      powerpc: Improve behaviour of irq tracing on 64-bit exception entry · 1b701179
      Benjamin Herrenschmidt committed
      Some exceptions would unconditionally disable interrupts on entry,
      which is fine, but calling lockdep every time not only adds more
      overhead than strictly needed, but also means we get quite a few
      "redundant" disables logged, which makes it hard to spot the really
      bad ones.

      So instead, split the macro used by the exception code into a
      normal one and a separate one used when CONFIG_TRACE_IRQFLAGS is
      enabled, and make the latter skip the tracing if interrupts were
      already disabled.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      1b701179
    • B
      powerpc: Rework runlatch code · fe1952fc
      Benjamin Herrenschmidt committed
      This moves the inlines into system.h and changes the runlatch
      code to use the thread local flags (non-atomic) rather than
      the TIF flags (atomic) to keep track of the latch state.

      The code to turn it back on in an asynchronous interrupt is
      now simplified and partially inlined.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      fe1952fc
    • B
      powerpc: Use the same interrupt prolog for perfmon as other interrupts · 7450f6f0
      Benjamin Herrenschmidt committed
      The perfmon interrupt is the sole user of a special variant of the
      interrupt prolog which differs from the one used by external and timer
      interrupts in that it saves the non-volatile GPRs and doesn't turn the
      runlatch on.

      The former is unnecessary and the latter is arguably incorrect, so
      let's clean that up by using the same prolog. While at it we rename
      that prolog to use the _ASYNC prefix.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      7450f6f0
    • B
      powerpc: Remove legacy iSeries bits from assembly files · 4f8cf36f
      Benjamin Herrenschmidt committed
      This removes the various bits of assembly in the kernel entry,
      exception handling and SLB management code that were specific
      to running under the legacy iSeries hypervisor, which is no
      longer supported.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      4f8cf36f
    • S
      powerpc: clean up vio.c · b0787660
      Stephen Rothwell committed
      This cleans up vio.c after the removal of the legacy iSeries platform.
      It also removes some no longer referenced include files.
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      b0787660
  4. 07 Mar, 2012 1 commit
  5. 23 Feb, 2012 8 commits
    • M
      fadump: Remove the phyp assisted dump code. · 12d92992
      Mahesh Salgaonkar committed
      Remove the phyp assisted dump implementation, which is not in use.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      12d92992
    • M
      fadump: Invalidate registration and release reserved memory for general use. · b500afff
      Mahesh Salgaonkar committed
      This patch introduces a sysfs interface '/sys/kernel/fadump_release_mem' to
      invalidate the last fadump registration, invalidate '/proc/vmcore', release
      the reserved memory for general use and re-register for future kernel dump.
      Once the dump is copied to the disk, unlike phyp dump, the userspace tool
      can release all the memory reserved for dump with one single operation of
      echo 1 to '/sys/kernel/fadump_release_mem'.

      Release the reserved memory region excluding the size of the memory required
      for future kernel dump registration. Therefore, unlike kdump, fadump
      doesn't need a second reboot to get the system back to the production
      configuration.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      b500afff
    • M
      fadump: Convert firmware-assisted cpu state dump data into elf notes. · ebaeb5ae
      Mahesh Salgaonkar committed
      When registered for firmware assisted dump on powerpc, firmware preserves
      the registers for the active CPUs during a system crash. This patch reads
      the cpu register data stored in Firmware-assisted dump format (except for
      crashing cpu) and converts it into elf notes and updates the PT_NOTE program
      header accordingly. The exact register state for crashing cpu is saved to
      fadump crash info structure in scratch area during crash_fadump() and read
      during second kernel boot.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      ebaeb5ae
    • M
      fadump: Initialize elfcore header and add PT_LOAD program headers. · 2df173d9
      Mahesh Salgaonkar committed
      Build the crash memory range list by traversing through system memory during
      the first kernel before we register for firmware-assisted dump. After the
      successful dump registration, initialize the elfcore header and populate
      PT_LOAD program headers with crash memory ranges. The elfcore header is
      saved in the scratch area within the reserved memory. The scratch area starts
      at the end of the memory reserved for saving RMR region contents. The
      scratch area contains the fadump crash info structure, which holds a magic
      number for fadump validation and the physical address where the elfcore
      header can be found. This structure will also be used to pass some important
      crash info data to the second kernel, which will help the second kernel
      populate the ELF core header with correct data before it gets exported
      through /proc/vmcore. Since the firmware preserves the entire partition
      memory at the time of crash, the contents of the scratch area will be
      preserved until the second kernel boots.
      
      Since the memory dump exported through /proc/vmcore is in ELF format similar
      to kdump, it will help us to reuse the kdump infrastructure for dump capture
      and filtering. Unlike phyp dump, the userspace tool does not need to refer
      to any sysfs interface while reading /proc/vmcore.
      
      NOTE: The current design implementation does not address a possibility of
      introducing additional fields (in future) to this structure without affecting
      compatibility. It's on TODO list to come up with better approach to
      address this.
      
      Reserved dump area start => +-------------------------------------+
                                  |  CPU state dump data                |
                                  +-------------------------------------+
                                  |  HPTE region data                   |
                                  +-------------------------------------+
                                  |  RMR region data                    |
      Scratch area start       => +-------------------------------------+
                                  |  fadump crash info structure {      |
                                  |     magic number                    |
                           +------|---- elfcorehdr_addr                 |
                           |      |  }                                  |
                           +----> +-------------------------------------+
                                  |  ELF core header                    |
      Reserved dump area end   => +-------------------------------------+
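
The scratch-area handshake in the diagram can be sketched in a few lines of C. This is an assumption-laden illustration: `struct crash_info`, the `CRASH_INFO_MAGIC` value, and both helper functions are hypothetical, and the real structure carries additional crash data.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CRASH_INFO_MAGIC 0x464144554d50ULL  /* hypothetical magic value */

struct crash_info {
    uint64_t magic_number;     /* validates the scratch area contents */
    uint64_t elfcorehdr_addr;  /* physical address of the ELF core header */
};

/* First kernel: write the info structure at the start of the scratch
 * area; the ELF core header is placed right after it. */
static uint64_t scratch_init(uint8_t *scratch, uint64_t scratch_phys)
{
    struct crash_info info = {
        .magic_number = CRASH_INFO_MAGIC,
        .elfcorehdr_addr = scratch_phys + sizeof(info),
    };

    assert(scratch != NULL);
    memcpy(scratch, &info, sizeof(info));
    return info.elfcorehdr_addr;
}

/* Second kernel: return the header address, or 0 if the magic number
 * does not match and the scratch area cannot be trusted. */
static uint64_t scratch_find_elfcorehdr(const uint8_t *scratch)
{
    struct crash_info info;

    assert(scratch != NULL);
    memcpy(&info, scratch, sizeof(info));
    return info.magic_number == CRASH_INFO_MAGIC ? info.elfcorehdr_addr : 0;
}
```

Checking the magic number first lets the second kernel distinguish a preserved scratch area from uninitialized memory before it follows `elfcorehdr_addr`.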
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2df173d9
    • M
      fadump: Register for firmware assisted dump. · 3ccc00a7
      Mahesh Salgaonkar committed
      On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote:
      > On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote:
      >
      > If I have read the code correctly, we are going to get this printk on
      > non-pSeries machines or on older pSeries machines, even if the user
      > has not put the fadump=on option on the kernel command line.  The
      > printk will be annoying since there is no actual error condition.  It
      > seems to me that the condition for the printk should include
      > fw_dump.fadump_enabled.  In other words you should probably add
      >
      > 	if (!fw_dump.fadump_enabled)
      > 		return 0;
      >
      > at the beginning of the function.
      
      Hi Paul,
      
      Thanks for pointing it out. Please find the updated patch below.
      
      The existing patches above this (4/10 through 10/10) apply cleanly
      on top of this update.
      
      Thanks,
      -Mahesh.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      3ccc00a7
    • M
      fadump: Reserve the memory for firmware assisted dump. · eb39c880
      Mahesh Salgaonkar committed
      Reserve the memory during early boot to preserve CPU state data, HPTE region
      and RMA (real mode area) region data in case of kernel crash. At the time of
      crash, powerpc firmware will store CPU state data, HPTE region data and move
      RMA region data to the reserved memory area.

      If the firmware-assisted dump fails to reserve the memory, then fall back
      to existing kexec-based kdump.

      Most of the code implementation to reserve memory has been
      adapted from the phyp assisted dump implementation written by Linas Vepstas
      and Manish Ahuja.

      This patch also introduces a config option CONFIG_FA_DUMP for the firmware
      assisted dump feature on the PowerPC (ppc64) architecture.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      eb39c880
    • K
      powerpc/mpic: Remove duplicate MPIC_WANTS_RESET flag · e55d7f73
      Kyle Moffett committed
      There are two separate flags controlling whether or not the MPIC is
      reset during initialization, which is completely unnecessary, and only
      one of them can be specified in the device tree.
      
      Also, most platforms in-tree right now do actually want to reset the
      MPIC during initialization anyways, which means lots of duplicate code
      passing the MPIC_WANTS_RESET flag.
      
      Fix all of the callers which currently do not pass the MPIC_WANTS_RESET
      flag to pass the MPIC_NO_RESET flag, then remove the MPIC_WANTS_RESET
      flag and make the code reset the MPIC by default.
      Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      e55d7f73
    • K
      powerpc/mpic: Remove MPIC_BROKEN_FRR_NIRQS and duplicate irq_count · 5019609f
      Kyle Moffett committed
      The mpic->irq_count variable is only used as a software error-checking
      limit to determine whether or not an IRQ number is valid.  In board code
      which does not manually specify an IRQ count to mpic_alloc(), i.e. 0, it
      is automatically detected from the number of ISUs and the ISU size.
      
      In practice, all hardware ends up with irq_count == num_sources, so all
      of the runtime checks on mpic->irq_count should just check the value of
      mpic->num_sources instead.
      
      When platform hardware does not correctly report the number of IRQs,
      which only happens on the MPC85xx/MPC86xx, the MPIC_BROKEN_FRR_NIRQS
      flag is used to override the detected value of num_sources with the
      manual irq_count parameter.  Since there's no need to manually specify
      the number of IRQs except in this case, the extra flag can be eliminated
      and the test changed to "irq_count != 0".
      Signed-off-by: Kyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      5019609f
  6. 14 Feb, 2012 2 commits
  7. 18 Jan, 2012 1 commit
    • E
      Audit: push audit success and retcode into arch ptrace.h · d7e7528b
      Eric Paris committed
      The audit system previously expected arches calling audit_syscall_exit to
      supply as arguments whether the syscall was a success and what the return code was.
      Audit also provides a helper AUDITSC_RESULT which was supposed to simplify things
      by converting from negative retcodes to an audit internal magic value stating
      success or failure.  This helper was wrong and could indicate that a valid
      pointer returned to userspace was a failed syscall.  The fix is to fix the
      layering foolishness.  We now pass audit_syscall_exit a struct pt_regs and it
      in turn calls back into arch code to collect the return value and to
      determine if the syscall was a success or failure.  We also define a generic
      is_syscall_success() macro which determines success/failure based on whether
      the value falls in the error range from -MAX_ERRNO to -1.  This works for
      arches like x86 which do not use a separate mechanism to indicate syscall failure.
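
The generic check can be illustrated with a small user-space sketch. It is not the kernel code: `struct fake_regs` is a hypothetical stand-in for an arch's pt_regs, and only the `MAX_ERRNO` bound of 4095 is taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ERRNO 4095  /* same bound the Linux kernel uses */

/* Hypothetical stand-in for an arch's register frame. */
struct fake_regs {
    long ret;  /* raw syscall return register */
};

static inline long regs_return_value(const void *regs)
{
    /* The audit side only holds a void *, so cast back here. */
    return ((const struct fake_regs *)regs)->ret;
}

static inline bool is_syscall_success(const void *regs)
{
    unsigned long v = (unsigned long)regs_return_value(regs);

    /* Failure means the value lies in [-MAX_ERRNO, -1]; anything else,
     * including a pointer with the top bit set, counts as success. */
    return v < (unsigned long)-MAX_ERRNO;
}
```

Comparing as unsigned turns the error range into the top 4095 values of the address space, so a large but valid pointer return is not mistaken for a failure.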
      
      We make both the is_syscall_success() and regs_return_value() static inlines
      instead of macros.  The reason is because the audit function must take a void*
      for the regs.  (uml calls theirs struct uml_pt_regs instead of just struct
      pt_regs so audit_syscall_exit can't take a struct pt_regs).  Since the audit
      function takes a void* we need to use static inlines to cast it back to the
      arch correct structure to dereference it.
      
      The other major change is that on some arches, like ia64, MIPS and ppc, we
      change regs_return_value() to give us the negative value on syscall failure.
      The only other user of this macro, kretprobe_example.c, won't notice and it
      makes the value signed consistently for the audit functions across all archs.
      
      In arch/sh/kernel/ptrace_64.c I see that we were using regs[9] in the old
      audit code as the return value.  But the ptrace_64.h code defined the macro
      regs_return_value() as regs[3].  I have no idea which one is correct, but this
      patch now uses the regs_return_value() function, so it now uses regs[3].
      
      For powerpc we previously used regs->result but now use the
      regs_return_value() function which uses regs->gprs[3].  regs->gprs[3] is
      always positive so the regs_return_value(), much like ia64 makes it negative
      before calling the audit code when appropriate.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Acked-by: H. Peter Anvin <hpa@zytor.com> [for x86 portion]
      Acked-by: Tony Luck <tony.luck@intel.com> [for ia64]
      Acked-by: Richard Weinberger <richard@nod.at> [for uml]
      Acked-by: David S. Miller <davem@davemloft.net> [for sparc]
      Acked-by: Ralf Baechle <ralf@linux-mips.org> [for mips]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [for ppc]
      d7e7528b