1. 09 3月, 2012 3 次提交
  2. 07 3月, 2012 7 次提交
  3. 27 2月, 2012 4 次提交
  4. 23 2月, 2012 15 次提交
    • M
      powerpc/perf: Move perf core & PMU code into a subdirectory · f2699491
      Michael Ellerman 提交于
      The perf code has grown a lot since it started, and is big enough to
      warrant its own subdirectory. For reference it's ~60% bigger than the
      oprofile code. It declutters the kernel directory, makes it simpler to
      grep for "just perf stuff", and allows us to shorten some filenames.
      
      While we're at it, make it more obvious that we have two implementations
      of the core perf logic. One for (roughly) Book3S CPUs, which was the
      original implementation, and the other for Freescale embedded CPUs.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f2699491
    • M
      fadump: Remove the phyp assisted dump code. · 12d92992
      Mahesh Salgaonkar 提交于
      Remove the phyp assisted dump implementation which is not is use.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      12d92992
    • M
      fadump: Invalidate the fadump registration during machine shutdown. · 67b43b9d
      Mahesh Salgaonkar 提交于
      If dump is active during system reboot, shutdown or halt then invalidate
      the fadump registration as it does not get invalidated automatically.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      67b43b9d
    • M
      fadump: Invalidate registration and release reserved memory for general use. · b500afff
      Mahesh Salgaonkar 提交于
      This patch introduces an sysfs interface '/sys/kernel/fadump_release_mem' to
      invalidate the last fadump registration, invalidate '/proc/vmcore', release
      the reserved memory for general use and re-register for future kernel dump.
      Once the dump is copied to the disk, unlike phyp dump, the userspace tool
      can release all the memory reserved for dump with one single operation of
      echo 1 to '/sys/kernel/fadump_release_mem'.
      
      Release the reserved memory region excluding the size of the memory required
      for future kernel dump registration. And therefore, unlike kdump, Fadump
      doesn't need a 2nd reboot to get back the system to the production
      configuration.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      b500afff
    • M
      fadump: Add PT_NOTE program header for vmcoreinfo · d34c5f26
      Mahesh Salgaonkar 提交于
      Introduce a PT_NOTE program header that points to physical address of
      vmcoreinfo_note buffer declared in kernel/kexec.c. The vmcoreinfo
      note buffer is populated during crash_fadump() at the time of system
      crash.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d34c5f26
    • M
      fadump: Convert firmware-assisted cpu state dump data into elf notes. · ebaeb5ae
      Mahesh Salgaonkar 提交于
      When registered for firmware assisted dump on powerpc, firmware preserves
      the registers for the active CPUs during a system crash. This patch reads
      the cpu register data stored in Firmware-assisted dump format (except for
      crashing cpu) and converts it into elf notes and updates the PT_NOTE program
      header accordingly. The exact register state for crashing cpu is saved to
      fadump crash info structure in scratch area during crash_fadump() and read
      during second kernel boot.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ebaeb5ae
    • M
      fadump: Initialize elfcore header and add PT_LOAD program headers. · 2df173d9
      Mahesh Salgaonkar 提交于
      Build the crash memory range list by traversing through system memory during
      the first kernel before we register for firmware-assisted dump. After the
      successful dump registration, initialize the elfcore header and populate
      PT_LOAD program headers with crash memory ranges. The elfcore header is
      saved in the scratch area within the reserved memory. The scratch area starts
      at the end of the memory reserved for saving RMR region contents. The
      scratch area contains fadump crash info structure that contains magic number
      for fadump validation and physical address where the eflcore header can be
      found. This structure will also be used to pass some important crash info
      data to the second kernel which will help second kernel to populate ELF core
      header with correct data before it gets exported through /proc/vmcore. Since
      the firmware preserves the entire partition memory at the time of crash the
      contents of the scratch area will be preserved till second kernel boot.
      
      Since the memory dump exported through /proc/vmcore is in ELF format similar
      to kdump, it will help us to reuse the kdump infrastructure for dump capture
      and filtering. Unlike phyp dump, userspace tool does not need to refer any
      sysfs interface while reading /proc/vmcore.
      
      NOTE: The current design implementation does not address a possibility of
      introducing additional fields (in future) to this structure without affecting
      compatibility. It's on TODO list to come up with better approach to
      address this.
      
      Reserved dump area start => +-------------------------------------+
                                  |  CPU state dump data                |
                                  +-------------------------------------+
                                  |  HPTE region data                   |
                                  +-------------------------------------+
                                  |  RMR region data                    |
      Scratch area start       => +-------------------------------------+
                                  |  fadump crash info structure {      |
                                  |     magic nummber                   |
                           +------|---- elfcorehdr_addr                 |
                           |      |  }                                  |
                           +----> +-------------------------------------+
                                  |  ELF core header                    |
      Reserved dump area end   => +-------------------------------------+
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      2df173d9
    • M
      fadump: Register for firmware assisted dump. · 3ccc00a7
      Mahesh Salgaonkar 提交于
      On 2012-02-20 11:02:51 Mon, Paul Mackerras wrote:
      > On Thu, Feb 16, 2012 at 04:44:30PM +0530, Mahesh J Salgaonkar wrote:
      >
      > If I have read the code correctly, we are going to get this printk on
      > non-pSeries machines or on older pSeries machines, even if the user
      > has not put the fadump=on option on the kernel command line.  The
      > printk will be annoying since there is no actual error condition.  It
      > seems to me that the condition for the printk should include
      > fw_dump.fadump_enabled.  In other words you should probably add
      >
      > 	if (!fw_dump.fadump_enabled)
      > 		return 0;
      >
      > at the beginning of the function.
      
      Hi Paul,
      
      Thanks for pointing it out. Please find the updated patch below.
      
      The existing patches above this (4/10 through 10/10) cleanly applies
      on this update.
      
      Thanks,
      -Mahesh.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3ccc00a7
    • M
      fadump: Reserve the memory for firmware assisted dump. · eb39c880
      Mahesh Salgaonkar 提交于
      Reserve the memory during early boot to preserve CPU state data, HPTE region
      and RMA (real mode area) region data in case of kernel crash. At the time of
      crash, powerpc firmware will store CPU state data, HPTE region data and move
      RMA region data to the reserved memory area.
      
      If the firmware-assisted dump fails to reserve the memory, then fallback
      to existing kexec-based kdump.
      
      Most of the code implementation to reserve memory has been
      adapted from phyp assisted dump implementation written by Linas Vepstas
      and Manish Ahuja
      
      This patch also introduces a config option CONFIG_FA_DUMP for firmware
      assisted dump feature on Powerpc (ppc64) architecture.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      eb39c880
    • K
      powerpc/mpic: Remove duplicate MPIC_WANTS_RESET flag · e55d7f73
      Kyle Moffett 提交于
      There are two separate flags controlling whether or not the MPIC is
      reset during initialization, which is completely unnecessary, and only
      one of them can be specified in the device tree.
      
      Also, most platforms in-tree right now do actually want to reset the
      MPIC during initialization anyways, which means lots of duplicate code
      passing the MPIC_WANTS_RESET flag.
      
      Fix all of the callers which currently do not pass the MPIC_WANTS_RESET
      flag to pass the MPIC_NO_RESET flag, then remove the MPIC_WANTS_RESET
      flag and make the code reset the MPIC by default.
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e55d7f73
    • K
      powerpc/mpic: Add "last-interrupt-source" property to override hardware · c1b8d45d
      Kyle Moffett 提交于
      The FreeScale PowerQUICC-III-compatible (mpc85xx/mpc86xx) MPICs do not
      correctly report the number of hardware interrupt sources, so software
      needs to override the detected value with "256".
      
      To avoid needing to write custom board-specific code to detect that
      scenario, allow it to be easily overridden in the device-tree.
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c1b8d45d
    • K
      powerpc/mpic: Remove MPIC_BROKEN_FRR_NIRQS and duplicate irq_count · 5019609f
      Kyle Moffett 提交于
      The mpic->irq_count variable is only used as a software error-checking
      limit to determine whether or not an IRQ number is valid.  In board code
      which does not manually specify an IRQ count to mpic_alloc(), i.e. 0, it
      is automatically detected from the number of ISUs and the ISU size.
      
      In practice, all hardware ends up with irq_count == num_sources, so all
      of the runtime checks on mpic->irq_count should just check the value of
      mpic->num_sources instead.
      
      When platform hardware does not correctly report the number of IRQs,
      which only happens on the MPC85xx/MPC86xx, the MPIC_BROKEN_FRR_NIRQS
      flag is used to override the detected value of num_sources with the
      manual irq_count parameter.  Since there's no need to manually specify
      the number of IRQs except in this case, the extra flag can be eliminated
      and the test changed to "irq_count != 0".
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      5019609f
    • K
      fsl/mpic: Create and document the "single-cpu-affinity" device-tree flag · 9ca163c8
      Kyle Moffett 提交于
      The Freescale MPIC (and perhaps others in the future) is incapable of
      routing non-IPI interrupts to more than once CPU at a time.  Currently
      all of the Freescale boards msut pass the MPIC_SINGLE_DEST_CPU flag to
      mpic_alloc(), but that information should really be present in the
      device-tree.
      
      Older board code can't rely on the device-tree having the property set,
      but newer platforms won't need it manually specified in the code.
      
      [BenH: Remove unrelated changes, folded in a different patch]
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9ca163c8
    • K
      fsl/mpic: Document and use the "big-endian" device-tree flag · 98cca250
      Kyle Moffett 提交于
      The MPIC code checks for a "big-endian" property and sets the flag
      MPIC_BIG_ENDIAN if one is present, although prior to the "mpic->flags"
      fixup that would never have worked anways.
      
      Unfortunately, even now that it works properly, the Freescale mpic
      device-node (the "PowerQUICC-III"-compatible one) does not specify it,
      so all of the board ports need to manually pass it to mpic_alloc().
      
      Document the flag and add it to the pq3 device tree.  Existing code will
      still need to pass the MPIC_BIG_ENDIAN flag because their dtb may not
      have this property, but new platforms shouldn't need to do so.
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      98cca250
    • K
      powerpc/mpic: Fix use of "flags" variable in mpic_alloc() · 3a7a7176
      Kyle Moffett 提交于
      The mpic_alloc() function takes a "flags" parameter and assigns it into
      the mpic->flags variable fairly early on, but several later pieces of
      code detect various device-tree properties and save them into the
      "mpic->flags" variable (EG: "big-endian" => MPIC_BIG_ENDIAN).
      
      Unfortunately, a number of codepaths (including several which test the
      flag MPIC_BIG_ENDIAN!) test "flags" instead of "mpic->flags", and get
      wrong answers as a result.
      
      Consolidate the device-tree flag tests early in mpic_alloc() and change
      all of the checks after "mpic->flags" is init'ed to use "mpic->flags".
      
      [BenH: Fixed up use of mpic->node before it's initialized]
      Signed-off-by: NKyle Moffett <Kyle.D.Moffett@boeing.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3a7a7176
  5. 22 2月, 2012 3 次提交
  6. 16 2月, 2012 5 次提交
    • A
      powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events · 9a45a940
      Anton Blanchard 提交于
      perf on POWER stopped working after commit e050e3f0 (perf: Fix
      broken interrupt rate throttling). That patch exposed a bug in
      the POWER perf_events code.
      
      Since the PMCs count upwards and take an exception when the top bit
      is set, we want to write 0x80000000 - left in power_pmu_start. We were
      instead programming in left which effectively disables the counter
      until we eventually hit 0x80000000. This could take seconds or longer.
      
      With the patch applied I get the expected number of samples:
      
                SAMPLE events:       9948
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: <stable@kernel.org>
      9a45a940
    • B
      powerpc: Disable interrupts early in Program Check · 54321242
      Benjamin Herrenschmidt 提交于
      Program Check exceptions are the result of WARNs, BUGs, some
      type of breakpoints, kprobe, and other illegal instructions.
      
      We want interrupts (and thus preemption) to remain disabled
      while doing the initial stage of testing the reason and
      branching off to a debugger or kprobe, so we are still on
      the original CPU which makes debugging easier in various cases.
      
      This is how the code was intended, hence the local_irq_enable()
      right in the middle of program_check_exception().
      
      However, the assembly exception prologue for that exception was
      incorrectly marked as enabling interrupts, which defeats that
      (and records a redundant enable with lockdep).
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      54321242
    • S
      powerpc: Remove legacy iSeries from ppc64_defconfig · a1a1d1bf
      Stephen Rothwell 提交于
      Since we are heading towards removing the Legacy iSeries platform, start
      by no longer building it for ppc64_defconfig.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a1a1d1bf
    • B
      powerpc/fsl/pci: Fix PCIe fixup regression · 13635dfd
      Benjamin Herrenschmidt 提交于
      Upstream changes to the way PHB resources are registered
      broke the resource fixup for FSL boards.
      
      We can no longer rely on the resource pointer array for the PHB's
      pci_bus structure, so let's leave it alone and go straight for
      the PHB resources instead. This also makes the code generally
      more readable.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      13635dfd
    • I
      powerpc: Fix kernel log of oops/panic instruction dump · 40c8cefa
      Ira Snyder 提交于
      A kernel oops/panic prints an instruction dump showing several
      instructions before and after the instruction which caused the
      oops/panic.
      
      The code intended that the faulting instruction be enclosed in angle
      brackets, however a bug caused the faulting instruction to be
      interpreted by printk() as the message log level.
      
      To fix this, the KERN_CONT log level is added before the actual text of
      the printed message.
      
      === Before the patch ===
      
      [ 1081.587266] Instruction dump:
      [ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
      [ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
      [ 1081.602500]  4e800020 3803ffd0 2b800009
      
      <4>[ 1081.587266] Instruction dump:
      <4>[ 1081.590236] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
      <4>[ 1081.598034] 3d20c03a 9009a114 7c0004ac 39200000
      <98090000>[ 1081.602500]  4e800020 3803ffd0 2b800009
      
      === After the patch ===
      
      [   51.385216] Instruction dump:
      [   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
      [   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009
      
      <4>[   51.385216] Instruction dump:
      <4>[   51.388186] 7c000110 7c0000f8 5400077c 552907f6 7d290378 992b0003 4e800020 38000001
      <4>[   51.395986] 3d20c03a 9009a114 7c0004ac 39200000 <98090000> 4e800020 3803ffd0 2b800009
      Signed-off-by: NIra W. Snyder <iws@ovro.caltech.edu>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      40c8cefa
  7. 14 2月, 2012 3 次提交
    • T
      powerpc/pseries/eeh: Fix crash when error happens during device probe · 778a785f
      Thadeu Lima de Souza Cascardo 提交于
      EEH may happen during a PCI driver probe. If the driver is trying to
      access some register in a loop, the EEH code will try to print the
      driver name. But the driver pointer in struct pci_dev is not set until
      probe returns successfully.
      
      Use a function to test if the device and the driver pointer is NULL
      before accessing the driver's name.
      Signed-off-by: NThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      778a785f
    • B
      powerpc/pseries: Fix partition migration hang in stop_topology_update · 444080d1
      Brian King 提交于
      This fixes a hang that was observed during live partition migration.
      Since stop_topology_update must not be called from an interrupt
      context, call it earlier in the migration process. The hang observed
      can be seen below:
      
      WARNING: at kernel/timer.c:1011
      Modules linked in: ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables ipv6 fuse loop ibmveth sg ext3 jbd mbcache raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid10 raid1 raid0 scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc dm_round_robin dm_multipath scsi_dh sd_mod crc_t10dif ibmvfc scsi_transport_fc scsi_tgt scsi_mod dm_snapshot dm_mod
      NIP: c0000000000c52d8 LR: c00000000004be28 CTR: 0000000000000000
      REGS: c00000005ffd77d0 TRAP: 0700   Not tainted  (3.2.0-git-00001-g07d106d0)
      MSR: 8000000000021032 <ME,CE,IR,DR>  CR: 48000084  XER: 00000001
      CFAR: c00000000004be20
      TASK = c00000005ec78860[0] 'swapper/3' THREAD: c00000005ec98000 CPU: 3
      GPR00: 0000000000000001 c00000005ffd7a50 c000000000fbbc98 c000000000ec8340
      GPR04: 00000000282a0020 0000000000000000 0000000000004000 0000000000000101
      GPR08: 0000000000000012 c00000005ffd4000 0000000000000020 c000000000f3ba88
      GPR12: 0000000000000000 c000000007f40900 0000000000000001 0000000000000004
      GPR16: 0000000000000001 0000000000000000 0000000000000000 c000000001022310
      GPR20: 0000000000000001 0000000000000000 0000000000200200 c000000001029e14
      GPR24: 0000000000000000 0000000000000001 0000000000000040 c00000003f74bc80
      GPR28: c00000003f74bc84 c000000000f38038 c000000000f16b58 c000000000ec8340
      NIP [c0000000000c52d8] .del_timer_sync+0x28/0x60
      LR [c00000000004be28] .stop_topology_update+0x20/0x38
      Call Trace:
      [c00000005ffd7a50] [c00000005ec78860] 0xc00000005ec78860 (unreliable)
      [c00000005ffd7ad0] [c00000000004be28] .stop_topology_update+0x20/0x38
      [c00000005ffd7b40] [c000000000028378] .__rtas_suspend_last_cpu+0x58/0x260
      [c00000005ffd7bf0] [c0000000000fa230] .generic_smp_call_function_interrupt+0x160/0x358
      [c00000005ffd7cf0] [c000000000036ec8] .smp_ipi_demux+0x88/0x100
      [c00000005ffd7d80] [c00000000005c154] .icp_hv_ipi_action+0x5c/0x80
      [c00000005ffd7e00] [c00000000012a088] .handle_irq_event_percpu+0x100/0x318
      [c00000005ffd7f00] [c00000000012e774] .handle_percpu_irq+0x84/0xd0
      [c00000005ffd7f90] [c000000000022ba8] .call_handle_irq+0x1c/0x2c
      [c00000005ec9ba20] [c00000000001157c] .do_IRQ+0x22c/0x2a8
      [c00000005ec9bae0] [c0000000000054bc] hardware_interrupt_entry+0x18/0x1c
      Exception: 501 at .cpu_idle+0x194/0x2f8
          LR = .cpu_idle+0x194/0x2f8
      [c00000005ec9bdd0] [c000000000017e58] .cpu_idle+0x188/0x2f8 (unreliable)
      [c00000005ec9be90] [c00000000067ec18] .start_secondary+0x3e4/0x524
      [c00000005ec9bf90] [c0000000000093e8] .start_secondary_prolog+0x10/0x14
      Instruction dump:
      ebe1fff8 4e800020 fbe1fff8 7c0802a6 f8010010 7c7f1b78 f821ff81 78290464
      80090014 5400019e 7c0000d0 78000fe0 <0b000000> 4800000c 7c210b78 7c421378
      Signed-off-by: NBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      444080d1
    • M
      powerpc/powernv: Disable interrupts while taking phb->lock · f1c853b5
      Michael Ellerman 提交于
      We need to disable interrupts when taking the phb->lock. Otherwise
      we could deadlock with pci_lock taken from an interrupt.
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      f1c853b5