1. 20 3月, 2014 18 次提交
  2. 07 3月, 2014 17 次提交
    • S
      powerpc/powernv Platform dump interface · c7e64b9c
      Stewart Smith 提交于
      This enables support for userspace to fetch and initiate FSP and
      Platform dumps from the service processor (via firmware) through sysfs.
      
      Based on original patch from Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      
      Flow:
        - We register for OPAL notification events.
        - OPAL sends new dump available notification.
        - We make information on dump available via sysfs
        - Userspace requests dump contents
        - We retrieve the dump via OPAL interface
        - User copies the dump data
        - userspace sends ack for dump
        - We send ACK to OPAL.
      
      sysfs files:
        - We add the /sys/firmware/opal/dump directory
        - echoing 1 (well, anything, but in future we may support
          different dump types) to /sys/firmware/opal/dump/initiate_dump
          will initiate a dump.
        - Each dump that we've been notified of gets a directory
          in /sys/firmware/opal/dump/ with a name of the dump type and ID (in hex,
          as this is what's used elsewhere to identify the dump).
        - Each dump has files: id, type, dump and acknowledge
          dump is binary and is the dump itself.
          echoing 'ack' to acknowledge (currently any string will do) will
          acknowledge the dump and it will soon after disappear from sysfs.
      
      OPAL APIs:
        - opal_dump_init()
        - opal_dump_info()
        - opal_dump_read()
        - opal_dump_ack()
        - opal_dump_resend_notification()
      
      Currently we are only ever notified for one dump at a time (until
      the user explicitly acks the current dump, then we get a notification
      of the next dump), but this kernel code should "just work" when OPAL
      starts notifying us of all the dumps present.
      Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c7e64b9c
    • S
      powerpc/powernv: Read OPAL error log and export it through sysfs · 774fea1a
      Stewart Smith 提交于
      Based on a patch by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      
      This patch adds support to read error logs from OPAL and export
      them to userspace through a sysfs interface.
      
      We export each log entry as a directory in /sys/firmware/opal/elog/
      
      Currently, OPAL will buffer up to 128 error log records, we don't
      need to have any knowledge of this limit on the Linux side as that
      is actually largely transparent to us.
      
      Each error log entry has the following files: id, type, acknowledge, raw.
      Currently we just export the raw binary error log in the 'raw' attribute.
      In a future patch, we may parse more of the error log to make it a bit
      easier for userspace (e.g. to be able to display a brief summary in
      petitboot without having to have a full parser).
      
      If we have >128 logs from OPAL, we'll only be notified of 128 until
      userspace starts acknowledging them. This limitation may be lifted in
      the future and with this patch, that should "just work" from the linux side.
      
      A userspace daemon should:
      - wait for error log entries using normal mechanisms (we announce creation)
      - read error log entry
      - save error log entry safely to disk
      - acknowledge the error log entry
      - rinse, repeat.
      
      On the Linux side, we read the error log when we're notified of it. This
      possibly isn't ideal as it would be better to only read them on-demand.
      However, this doesn't really work with current OPAL interface, so we
      read the error log immediately when notified at the moment.
      
      I've tested this pretty extensively and am rather confident that the
      linux side of things works rather well. There is currently an issue with
      the service processor side of things for >128 error logs though.
      Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      774fea1a
    • B
      powerpc/book3e: Fix check for linear mapping in TLB miss handler · 60b96223
      Benjamin Krill 提交于
      The previous code added wrong TLBs and causes machine check errors if
      a driver accessed passed the end of the linear mapping instead of
      a clean page fault.
      Signed-off-by: NRalph E. Bellofatto <ralphbel@us.ibm.com>
      Signed-off-by: NBenjamin Krill <ben@codiert.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      60b96223
    • S
      powerpc: Add "force config cmd line" Kconfig option · eb3b80f6
      Sebastian Siewior 提交于
      powerpc uses early_init_dt_scan_chosen() from common fdt code. By
      enabling this option, the common code can take the built in
      command line over the one that is comming from bootloader / DT.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      eb3b80f6
    • T
      powerpc/pseries: Expose in kernel device tree update to drmgr · 9da34892
      Tyrel Datwyler 提交于
      Traditionally it has been drmgr's responsibilty to update the device tree
      through the /proc/ppc64/ofdt interface after a suspend/resume operation.
      This patchset however has modified suspend/resume ops to preform an update
      entirely in the kernel during the resume. Therefore, a mechanism is required
      to expose that information to drmgr.
      
      This patch adds a show function to the "hibernate" attribute that returns 1
      if the kernel performs a device tree update after the resume and 0 otherwise.
      This allows newer versions of drmgr to avoid doing a second unnecessary
      device tree update.
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9da34892
    • H
      powerpc/pseries: Update dynamic cache nodes for suspend/resume operation · 6b36ba84
      Haren Myneni 提交于
      pHyp can change cache nodes for suspend/resume operation. Currently the
      device tree is updated by drmgr in userspace after all non boot CPUs are
      enabled. Hence, we do not modify the cache list based on the latest cache
      nodes. Also we do not remove cache entries for the primary CPU.
      
      This patch removes the cache list for the boot CPU, updates the device tree
      before enabling nonboot CPUs and adds cache list for the boot cpu.
      
      This patch also has the side effect that older versions of drmgr will
      perform a second device tree update from userspace. While this is a
      redundant waste of a couple cycles it is harmless since firmware returns the
      same data for the subsequent update-nodes/properties rtas calls.
      Signed-off-by: NHaren Myneni <hbabu@us.ibm.com>
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      6b36ba84
    • H
      powerpc/pseries: Device tree should only be updated once after suspend/migrate · 39a33b59
      Haren Myneni 提交于
      The current code makes rtas calls for update-nodes, activate-firmware and then
      update-nodes again. The FW provides the same data for both update-nodes calls.
      As a result a proc entry exists error is reported for the second update while
      adding device nodes.
      
      This patch makes a single rtas call for update-nodes after activating the FW.
      It also add rtas_busy delay for the activate-firmware rtas call.
      Signed-off-by: NHaren Myneni <hbabu@us.ibm.com>
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      39a33b59
    • P
      powerpc: Delete old PrPMC 280/2800 support · 3c8464a9
      Paul Gortmaker 提交于
      This processor/memory module was mostly used on ATCA blades and
      before that, on cPCI blades.  It wasn't really user friendly, with
      custom non u-boot bootloaders (powerboot/motload) and no real way
      to recover corrupted boot flash (which was a common problem).
      
      As such, it had its day back before the big ppc --> powerpc move
      to device trees, and that was largely through commercial BSPs that
      started to dry up around 2007.
      
      Systems using one were largely in a "deploy and sustain" mode,
      so interest in upgrading to new kernels in the field was nil.
      Also, requiring 50A, 48V power supplies and a 2'x2'x2' ATCA
      chassis largely rules out any hobbyist/enthusiast interest.
      
      The point of all this, is that we might as well delete the in
      kernel files relating to this platform.  No point in continuing
      to build it via walking the defconfigs or via linux-next testing.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      3c8464a9
    • N
      powerpc/pseries: Use remove_memory() to remove memory · 9ac8cde9
      Nathan Fontenot 提交于
      The memory remove code for powerpc/pseries should call remove_memory()
      so that we are holding the hotplug_memory lock during memory remove
      operations.
      
      This patch updates the memory node remove handler to call remove_memory()
      and adds a ppc_md.remove_memory() entry to handle pseries specific work
      that is called from arch_remove_memory().
      
      During memory remove in pseries_remove_memblock() we have to stay with
      removing memory one section at a time. This is needed because of how memory
      resources are handled. During memory add for pseries (via the probe file in
      sysfs) we add memory one section at a time which gives us a memory resource
      for each section. Future patches will aim to address this so will not have
      to remove memory one section at a time.
      Signed-off-by: NNathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      9ac8cde9
    • M
      selftests/powerpc: Import Anton's memcpy / copy_tofrom_user tests · 22d651dc
      Michael Ellerman 提交于
      Turn Anton's memcpy / copy_tofrom_user test into something that can
      live in tools/testing/selftests.
      
      It requires one turd in arch/powerpc/lib/memcpy_64.S, but it's pretty
      harmless IMHO.
      
      We are sailing very close to the wind with the feature macros. We define
      them to nothing, which currently means we get a few extra nops and
      include the unaligned calls.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      22d651dc
    • M
      powerpc/book3s: Recover from MC in sapphire on SCOM read via MMIO. · 55672ecf
      Mahesh Salgaonkar 提交于
      Detect and recover from machine check when inside opal on a special
      scom load instructions. On specific SCOM read via MMIO we may get a machine
      check exception with SRR0 pointing inside opal. To recover from MC
      in this scenario, get a recovery instruction address and return to it from
      MC.
      
      OPAL will export the machine check recoverable ranges through
      device tree node mcheck-recoverable-ranges under ibm,opal:
      
      # hexdump /proc/device-tree/ibm,opal/mcheck-recoverable-ranges
      0000000 0000 0000 3000 2804 0000 000c 0000 0000
      0000010 3000 2814 0000 0000 3000 27f0 0000 000c
      0000020 0000 0000 3000 2814 xxxx xxxx xxxx xxxx
      0000030 llll llll yyyy yyyy yyyy yyyy
      ...
      ...
      #
      
      where:
      	xxxx xxxx xxxx xxxx = Starting instruction address
      	llll llll           = Length of the address range.
      	yyyy yyyy yyyy yyyy = recovery address
      
      Each recoverable address range entry is (start address, len,
      recovery address), 2 cells each for start and recovery address, 1 cell for
      len, totalling 5 cells per entry. During kernel boot time, build up the
      recovery table with the list of recovery ranges from device-tree node which
      will be used during machine check exception to recover from MMIO SCOM UE.
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      55672ecf
    • B
      powerpc/pseries: Don't try to register pseries cpu hotplug on non-pseries · d2a36071
      Benjamin Herrenschmidt 提交于
      This results in oddball messages at boot on other platforms telling us
      that CPU hotplug isn't supported even when it is.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      d2a36071
    • P
      powerpc: Fix xmon disassembler for little-endian · 72eceef6
      Philippe Bergheaud 提交于
      This patch fixes the disassembler of the powerpc kernel debugger xmon,
      for little-endian.
      Signed-off-by: NPhilippe Bergheaud <felix@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      72eceef6
    • L
      powerpc: Revert c6102609 and replace it with the correct fix for vio dma mask setting · 10862a0c
      Li Zhong 提交于
      This patch reverts my previous "fix", and replace it with the correct
      fix from Russell.
      
      And as Russell pointed out -- dma_set_mask_and_coherent() (and the other
      dma_set_mask() functions) are really supposed to be used by drivers
      only.
      Signed-off-by: NLi Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      10862a0c
    • powerpc: : Kill CONFIG_MTD_PARTITIONS · 84744377
      송은봉 提交于
      This patch removes CONFIG_MTD_PARTITIONS in config files for powerpc.
       Because CONFIG_MTD_PARTITIONS was removed by commit 6a8a98b2.
      Signed-off-by: NEunbong Song <eunb.song@samsung.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      84744377
    • A
      powerpc: Align p_dyn, p_rela and p_st symbols · a5b2cf5b
      Anton Blanchard 提交于
      The 64bit relocation code places a few symbols in the text segment.
      These symbols are only 4 byte aligned where they need to be 8 byte
      aligned. Add an explicit alignment.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Tested-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a5b2cf5b
    • M
      powerpc/tm: Fix crash when forking inside a transaction · 621b5060
      Michael Neuling 提交于
      When we fork/clone we currently don't copy any of the TM state to the new
      thread.  This results in a TM bad thing (program check) when the new process is
      switched in as the kernel does a tmrechkpt with TEXASR FS not set.  Also, since
      R1 is from userspace, we trigger the bad kernel stack pointer detection.  So we
      end up with something like this:
      
         Bad kernel stack pointer 0 at c0000000000404fc
         cpu 0x2: Vector: 700 (Program Check) at [c00000003ffefd40]
             pc: c0000000000404fc: restore_gprs+0xc0/0x148
             lr: 0000000000000000
             sp: 0
            msr: 9000000100201030
           current = 0xc000001dd1417c30
           paca    = 0xc00000000fe00800   softe: 0        irq_happened: 0x01
             pid   = 0, comm = swapper/2
         WARNING: exception is not recoverable, can't continue
      
      The below fixes this by flushing the TM state before we copy the task_struct to
      the clone.  To do this we go through the tmreclaim patch, which removes the
      checkpointed registers from the CPU and transitions the CPU out of TM suspend
      mode.  Hence we need to call tmrechkpt after to restore the checkpointed state
      and the TM mode for the current task.
      
      To make this fail from userspace is simply:
      	tbegin
      	li	r0, 2
      	sc
      	<boom>
      
      Kudos to Adhemerval Zanella Neto for finding this.
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      cc: Adhemerval Zanella Neto <azanella@br.ibm.com>
      cc: stable@vger.kernel.org
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      621b5060
  3. 28 2月, 2014 5 次提交
    • B
      powerpc/powernv: Fix indirect XSCOM unmangling · e0cf9576
      Benjamin Herrenschmidt 提交于
      We need to unmangle the full address, not just the register
      number, and we also need to support the real indirect bit
      being set for in-kernel uses.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.13]
      e0cf9576
    • B
      powerpc/powernv: Fix opal_xscom_{read,write} prototype · 2f3f38e4
      Benjamin Herrenschmidt 提交于
      The OPAL firmware functions opal_xscom_read and opal_xscom_write
      take a 64-bit argument for the XSCOM (PCB) address in order to
      support the indirect mode on P8.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: <stable@vger.kernel.org> [v3.13]
      2f3f38e4
    • G
      powerpc/powernv: Refactor PHB diag-data dump · af87d2fe
      Gavin Shan 提交于
      As Ben suggested, the patch prints PHB diag-data with multiple
      fields in one line and omits the line if the fields of that
      line are all zero.
      
      With the patch applied, the PHB3 diag-data dump looks like:
      
      PHB3 PHB#3 Diag-data (Version: 1)
      
        brdgCtl:     00000002
        RootSts:     0000000f 00400000 b0830008 00100147 00002000
        nFir:        0000000000000000 0030006e00000000 0000000000000000
        PhbSts:      0000001c00000000 0000000000000000
        Lem:         0000000000100000 42498e327f502eae 0000000000000000
        InAErr:      8000000000000000 8000000000000000 0402030000000000 0000000000000000
        PE[  8] A/B: 8480002b00000000 8000000000000000
      
      [ The current diag data is so big that it overflows the printk
        buffer pretty quickly in cases when we get a handful of errors
        at once which can happen. --BenH
      ]
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      af87d2fe
    • G
      powerpc/powernv: Dump PHB diag-data immediately · 94716604
      Gavin Shan 提交于
      The PHB diag-data is important to help locating the root cause for
      EEH errors such as frozen PE or fenced PHB. However, the EEH core
      enables IO path by clearing part of HW registers before collecting
      this data causing it to be corrupted.
      
      This patch fixes this by dumping the PHB diag-data immediately when
      frozen/fenced state on PE or PHB is detected for the first time in
      eeh_ops::get_state() or next_error() backend.
      Signed-off-by: NGavin Shan <shangw@linux.vnet.ibm.com>
      CC: <stable@vger.kernel.org>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      94716604
    • P
      powerpc: Increase stack redzone for 64-bit userspace to 512 bytes · 573ebfa6
      Paul Mackerras 提交于
      The new ELFv2 little-endian ABI increases the stack redzone -- the
      area below the stack pointer that can be used for storing data --
      from 288 bytes to 512 bytes.  This means that we need to allow more
      space on the user stack when delivering a signal to a 64-bit process.
      
      To make the code a bit clearer, we define new USER_REDZONE_SIZE and
      KERNEL_REDZONE_SIZE symbols in ptrace.h.  For now, we leave the
      kernel redzone size at 288 bytes, since increasing it to 512 bytes
      would increase the size of interrupt stack frames correspondingly.
      
      Gcc currently only makes use of 288 bytes of redzone even when
      compiling for the new little-endian ABI, and the kernel cannot
      currently be compiled with the new ABI anyway.
      
      In the future, hopefully gcc will provide an option to control the
      amount of redzone used, and then we could reduce it even more.
      
      This also changes the code in arch_compat_alloc_user_space() to
      preserve the expanded redzone.  It is not clear why this function would
      ever be used on a 64-bit process, though.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      CC: <stable@vger.kernel.org> [v3.13]
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      573ebfa6