1. 28 Apr, 2014 1 commit
  2. 09 Apr, 2014 3 commits
  3. 07 Apr, 2014 2 commits
  4. 24 Mar, 2014 1 commit
  5. 07 Mar, 2014 3 commits
    • S
      powerpc/powernv Platform dump interface · c7e64b9c
      Stewart Smith committed
      This enables support for userspace to fetch and initiate FSP and
      Platform dumps from the service processor (via firmware) through sysfs.
      
      Based on original patch from Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      
      Flow:
        - We register for OPAL notification events.
        - OPAL sends a "new dump available" notification.
        - We make information on the dump available via sysfs.
        - Userspace requests the dump contents.
        - We retrieve the dump via the OPAL interface.
        - Userspace copies the dump data.
        - Userspace sends an ack for the dump.
        - We send an ACK to OPAL.
      
      sysfs files:
        - We add the /sys/firmware/opal/dump directory
        - echoing 1 (well, anything, but in future we may support
          different dump types) to /sys/firmware/opal/dump/initiate_dump
          will initiate a dump.
        - Each dump that we've been notified of gets a directory
          in /sys/firmware/opal/dump/ with a name of the dump type and ID (in hex,
          as this is what's used elsewhere to identify the dump).
        - Each dump has files: id, type, dump and acknowledge.
          The dump file is binary and holds the dump itself.
          Echoing 'ack' to acknowledge (currently any string will do) will
          acknowledge the dump, and it will disappear from sysfs soon after.
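The two write-only controls described above can be driven from userspace with ordinary file writes. The following is a minimal sketch, not code from the patch: the helper names `user_dump_initiate` and `user_dump_acknowledge` are hypothetical, and the caller is assumed to pass the directory paths (the commit message gives `/sys/firmware/opal/dump` for the root, but does not spell out the exact per-dump directory name format, so that is left to the caller).

```c
#include <assert.h>
#include <stdio.h>

/* Write a short string to <dir>/<attr>. Returns 0 on success, -1 on error. */
static int sysfs_write_attr(const char *dir, const char *attr, const char *value)
{
    char path[512];
    FILE *f;
    int n, ok;

    assert(dir && attr && value);
    n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                      /* path would be truncated */

    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)                 /* write errors may surface here */
        ok = 0;
    return ok ? 0 : -1;
}

/* Ask firmware to start a dump. Per the commit message any value works today. */
int user_dump_initiate(const char *dump_root)
{
    return sysfs_write_attr(dump_root, "initiate_dump", "1");
}

/* Acknowledge one dump; dump_dir is that dump's own sysfs directory. */
int user_dump_acknowledge(const char *dump_dir)
{
    return sysfs_write_attr(dump_dir, "acknowledge", "ack");
}
```

On a real system the first call would be `user_dump_initiate("/sys/firmware/opal/dump")`, and the acknowledge call should only happen after the `dump` file has been copied somewhere safe, since the directory disappears shortly afterwards.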
      
      OPAL APIs:
        - opal_dump_init()
        - opal_dump_info()
        - opal_dump_read()
        - opal_dump_ack()
        - opal_dump_resend_notification()
      
      Currently we are only ever notified for one dump at a time (until
      the user explicitly acks the current dump, then we get a notification
      of the next dump), but this kernel code should "just work" when OPAL
      starts notifying us of all the dumps present.
      Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • S
      powerpc/powernv: Read OPAL error log and export it through sysfs · 774fea1a
      Stewart Smith committed
      Based on a patch by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      
      This patch adds support to read error logs from OPAL and export
      them to userspace through a sysfs interface.
      
      We export each log entry as a directory in /sys/firmware/opal/elog/
      
      Currently, OPAL will buffer up to 128 error log records. We don't
      need to have any knowledge of this limit on the Linux side, as it
      is largely transparent to us.
      
      Each error log entry has the following files: id, type, acknowledge, raw.
      Currently we just export the raw binary error log in the 'raw' attribute.
      In a future patch, we may parse more of the error log to make it a bit
      easier for userspace (e.g. to be able to display a brief summary in
      petitboot without having to have a full parser).
      
      If we have >128 logs from OPAL, we'll only be notified of 128 until
      userspace starts acknowledging them. This limitation may be lifted in
      the future and with this patch, that should "just work" from the linux side.
      
      A userspace daemon should:
      - wait for error log entries using normal mechanisms (we announce creation)
      - read error log entry
      - save error log entry safely to disk
      - acknowledge the error log entry
      - rinse, repeat.
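The read, save and acknowledge steps of that daemon loop could look roughly like the sketch below. It is an illustration under assumptions, not part of the patch: the function names are hypothetical, the entry directory is passed in by the caller, and writing the string "ack" to `acknowledge` is borrowed from the dump interface because this commit message does not say what value the elog attribute expects.

```c
#include <assert.h>
#include <stdio.h>

/* Copy <entry_dir>/raw to dest_path. Returns bytes copied, or -1 on error. */
static long elog_save_raw(const char *entry_dir, const char *dest_path)
{
    char path[512], buf[4096];
    FILE *in, *out;
    long total = 0;
    size_t n;
    int len = snprintf(path, sizeof(path), "%s/raw", entry_dir);

    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    in = fopen(path, "rb");
    if (!in)
        return -1;
    out = fopen(dest_path, "wb");
    if (!out) {
        fclose(in);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            total = -1;
            break;
        }
        total += (long)n;
    }
    if (ferror(in))
        total = -1;
    fclose(in);
    if (fclose(out) != 0)               /* flushes stdio buffers only */
        total = -1;
    return total;
}

/* Write to <entry_dir>/acknowledge ("ack" is an assumed value). */
static int elog_acknowledge(const char *entry_dir)
{
    char path[512];
    FILE *f;
    int ok, len = snprintf(path, sizeof(path), "%s/acknowledge", entry_dir);

    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs("ack", f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Save first, acknowledge second: a failed save never acks the entry. */
int elog_process_entry(const char *entry_dir, const char *dest_path)
{
    assert(entry_dir && dest_path);
    if (elog_save_raw(entry_dir, dest_path) < 0)
        return -1;
    return elog_acknowledge(entry_dir);
}
```

The ordering is the point of the sketch: the entry is acknowledged only once the copy has succeeded. A production daemon would additionally sync the saved file to stable storage before acknowledging, which is omitted here to keep the example to standard C.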
      
      On the Linux side, we read the error log when we're notified of it. This
      possibly isn't ideal as it would be better to only read them on-demand.
      However, this doesn't really work with current OPAL interface, so we
      read the error log immediately when notified at the moment.
      
      I've tested this pretty extensively and am rather confident that the
      linux side of things works rather well. There is currently an issue with
      the service processor side of things for >128 error logs though.
      Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • M
      powerpc/book3s: Recover from MC in sapphire on SCOM read via MMIO. · 55672ecf
      Mahesh Salgaonkar committed
      Detect and recover from a machine check taken inside OPAL on a special
      SCOM load instruction. On a specific SCOM read via MMIO we may get a machine
      check exception with SRR0 pointing inside OPAL. To recover from the MC
      in this scenario, get a recovery instruction address and return to it from
      the MC.
      
      OPAL will export the machine check recoverable ranges through
      device tree node mcheck-recoverable-ranges under ibm,opal:
      
      # hexdump /proc/device-tree/ibm,opal/mcheck-recoverable-ranges
      0000000 0000 0000 3000 2804 0000 000c 0000 0000
      0000010 3000 2814 0000 0000 3000 27f0 0000 000c
      0000020 0000 0000 3000 2814 xxxx xxxx xxxx xxxx
      0000030 llll llll yyyy yyyy yyyy yyyy
      ...
      ...
      #
      
      where:
      	xxxx xxxx xxxx xxxx = Starting instruction address
      	llll llll           = Length of the address range.
      	yyyy yyyy yyyy yyyy = recovery address
      
      Each recoverable address range entry is (start address, len,
      recovery address), 2 cells each for start and recovery address, 1 cell for
      len, totalling 5 cells per entry. During kernel boot time, build up the
      recovery table with the list of recovery ranges from device-tree node which
      will be used during machine check exception to recover from MMIO SCOM UE.
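The 5-cell layout above can be decoded with a few lines of C. This is a userspace sketch rather than the kernel code: the names `mc_range`, `mc_parse_ranges` and `mc_find_recovery` are hypothetical, cells are taken to be 32-bit big-endian values as is usual for device-tree properties, and a range is assumed to cover the half-open interval from start to start + len.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CELLS_PER_RANGE 5   /* 2 (start) + 1 (len) + 2 (recovery) */

struct mc_range {
    uint64_t start;     /* first instruction address in the range */
    uint32_t len;       /* length of the range in bytes */
    uint64_t recovery;  /* address to return to after the machine check */
};

/* Read one big-endian 32-bit cell. */
static uint32_t be32_at(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read two consecutive cells as one 64-bit value. */
static uint64_t be64_at(const uint8_t *p)
{
    return ((uint64_t)be32_at(p) << 32) | be32_at(p + 4);
}

/* Decode the raw property into out[]; returns the number of entries stored. */
size_t mc_parse_ranges(const uint8_t *prop, size_t prop_len,
                       struct mc_range *out, size_t max)
{
    const size_t entry_bytes = CELLS_PER_RANGE * 4;
    size_t count = prop_len / entry_bytes;   /* ignore a trailing partial entry */
    size_t i;

    assert(prop && out);
    if (count > max)
        count = max;
    for (i = 0; i < count; i++) {
        const uint8_t *p = prop + i * entry_bytes;

        out[i].start = be64_at(p);
        out[i].len = be32_at(p + 8);
        out[i].recovery = be64_at(p + 12);
    }
    return count;
}

/* Return the recovery address for addr, or 0 if no range contains it. */
uint64_t mc_find_recovery(const struct mc_range *table, size_t count,
                          uint64_t addr)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (addr >= table[i].start && addr < table[i].start + table[i].len)
            return table[i].recovery;
    }
    return 0;
}
```

Fed the first 40 bytes of the hexdump shown earlier, the parser yields two entries, starting at 0x30002804 and 0x300027f0, each 0xc bytes long and both recovering to 0x30002814.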
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  6. 15 Jan, 2014 1 commit
  7. 05 Dec, 2013 5 commits
    • M
      powerpc/powernv: Infrastructure to read opal messages in generic format. · 24366360
      Mahesh Salgaonkar committed
      OPAL now has a new messaging infrastructure that pushes messages to
      Linux in a generic format, covering different message types with only
      one event bit. The format of the OPAL message is as below:
      
      struct opal_msg {
              uint32_t msg_type;
              uint32_t reserved;
              uint64_t params[8];
      };
      
      This patch allows clients to subscribe to notifications for a specific
      message type. It is up to the subscriber that registered interest in a
      message type to decipher the messages it receives.
      
      The interface to subscribe for notification is:
      
      	int opal_message_notifier_register(enum OpalMessageType msg_type,
                                              struct notifier_block *nb)
      
      The notifier will fetch the OPAL message when available and notify the
      subscriber with the message type and the OPAL message. It is the
      subscriber's responsibility to copy the message data before returning
      from the notifier callback.
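The subscribe, dispatch and copy-before-return contract can be modelled outside the kernel as follows. This is a simplified sketch and not the kernel's notifier chain: only `struct opal_msg` comes from the commit message, while every `sketch_` name and the limit of four message types are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct opal_msg {
    uint32_t msg_type;
    uint32_t reserved;
    uint64_t params[8];
};

#define SKETCH_MSG_TYPE_MAX 4   /* hypothetical number of message types */

typedef void (*sketch_msg_handler)(uint32_t msg_type,
                                   const struct opal_msg *msg, void *ctx);

struct sketch_subscriber {
    sketch_msg_handler fn;
    void *ctx;
    struct sketch_subscriber *next;
};

/* One singly linked subscriber list per message type. */
static struct sketch_subscriber *sketch_heads[SKETCH_MSG_TYPE_MAX];

/* Subscribe to one message type. Returns 0, or -1 for an unknown type. */
int sketch_notifier_register(uint32_t msg_type, struct sketch_subscriber *sub)
{
    assert(sub && sub->fn);
    if (msg_type >= SKETCH_MSG_TYPE_MAX)
        return -1;
    sub->next = sketch_heads[msg_type];
    sketch_heads[msg_type] = sub;
    return 0;
}

/* Deliver msg to subscribers of its type; returns how many were notified. */
int sketch_dispatch(const struct opal_msg *msg)
{
    const struct sketch_subscriber *s;
    int notified = 0;

    assert(msg);
    if (msg->msg_type >= SKETCH_MSG_TYPE_MAX)
        return 0;
    for (s = sketch_heads[msg->msg_type]; s; s = s->next) {
        s->fn(msg->msg_type, msg, s->ctx);
        notified++;
    }
    return notified;
}

/* Example handler: copies the message, since msg may be reused afterwards. */
void sketch_copy_handler(uint32_t msg_type, const struct opal_msg *msg,
                         void *ctx)
{
    (void)msg_type;
    *(struct opal_msg *)ctx = *msg;
}
```

The example handler takes a full copy of the message because the dispatcher's buffer is only guaranteed valid for the duration of the callback, which is the responsibility the commit message places on subscribers.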
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • M
      powerpc/powernv: Machine check exception handling. · b63a0ffe
      Mahesh Salgaonkar committed
      Add basic error handling in machine check exception handler.
      
      - If MSR_RI isn't set, we cannot recover.
      - Check if the disposition is set to OpalMCE_DISPOSITION_RECOVERED.
      - Check if the address at fault is inside kernel address space; if not, send
        SIGBUS to the process if we hit the exception while in userspace.
      - If the address at fault is not provided and we get a synchronous machine
        check while in userspace, kill the task.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • M
      powerpc/powernv: Remove machine check handling in OPAL. · 28446de2
      Mahesh Salgaonkar committed
      Now that we are ready to handle machine checks directly in Linux, do not
      register with firmware to handle the machine check exception.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • M
      powerpc/book3s: Queue up and process delayed MCE events. · b5ff4211
      Mahesh Salgaonkar committed
      When the machine check real mode handler cannot continue into the host
      kernel in V mode, it returns from the interrupt and we lose the MCE event,
      which never gets logged. In such a situation, queue up the MCE event so that
      we can log it later when we get back into the host kernel with r1 pointing
      to the kernel stack, e.g. during syscall exit.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    • M
      powerpc/book3s: Decode and save machine check event. · 36df96f8
      Mahesh Salgaonkar committed
      Now that we handle machine checks in Linux, the MCE decoding should also
      take place in the Linux host. This info is crucial to log before we go down
      in case we cannot handle the machine check errors. This patch decodes
      and populates a machine check event which contains high-level, meaningful
      MCE information.
      
      We do this in real mode C code with ME bit on. The MCE information is still
      available on emergency stack (in pt_regs structure format). Even if we take
      another exception at this point the MCE early handler will allocate a new
      stack frame on top of current one. So when we return back here we still have
      our MCE information safe on current stack.
      
      We use per cpu buffer to save high level MCE information. Each per cpu buffer
      is an array of machine check event structure indexed by per cpu counter
      mce_nest_count. The mce_nest_count is incremented every time we enter
      machine check early handler in real mode to get the current free slot
      (index = mce_nest_count - 1). The mce_nest_count is decremented once the
      MCE info is consumed by virtual mode machine exception handler.
      
      This patch provides save_mce_event(), get_mce_event() and release_mce_event()
      generic routines that can be used by machine check handlers to populate and
      retrieve the event. The routine release_mce_event() will free the event slot so
      that it can be reused. Caller can invoke get_mce_event() with a release flag
      either to release the event slot immediately OR keep it so that it can be
      fetched again. The event slot can be also released anytime by invoking
      release_mce_event().
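The slot bookkeeping described above (index = nest count - 1, with an optional release on read) can be illustrated with a single-CPU model. This is a sketch of the idea only: the `sketch_` names, the event fields and the depth limit of 4 are hypothetical, and the real routines operate on per-CPU data in real mode rather than on plain globals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_MCE_DEPTH 4   /* hypothetical nesting limit */

struct sketch_mce_event {
    bool in_use;
    uint64_t srr0;        /* interrupted instruction address */
    uint64_t fault_addr;  /* effective address at fault, if known */
};

/* Stand-ins for the per-CPU event array and per-CPU nest counter. */
static struct sketch_mce_event sketch_events[SKETCH_MAX_MCE_DEPTH];
static int sketch_nest_count;

/* Early handler entry: bump the counter and fill slot (count - 1). */
void sketch_save_mce_event(uint64_t srr0, uint64_t fault_addr)
{
    int index = sketch_nest_count++;

    if (index >= SKETCH_MAX_MCE_DEPTH)
        return;                 /* nested too deeply: counted, not stored */
    sketch_events[index].in_use = true;
    sketch_events[index].srr0 = srr0;
    sketch_events[index].fault_addr = fault_addr;
}

/*
 * Fetch the most recent event. With release == false the slot is kept so the
 * event can be fetched again; with release == true the slot is freed.
 */
bool sketch_get_mce_event(struct sketch_mce_event *out, bool release)
{
    int index = sketch_nest_count - 1;
    bool found = false;

    assert(sketch_nest_count >= 0);
    if (index < 0)
        return false;           /* nothing pending */
    if (index < SKETCH_MAX_MCE_DEPTH && sketch_events[index].in_use) {
        if (out)
            *out = sketch_events[index];
        if (release)
            sketch_events[index].in_use = false;
        found = true;
    }
    if (release)
        sketch_nest_count--;
    return found;
}

/* Free the current slot without reading it. */
void sketch_release_mce_event(void)
{
    sketch_get_mce_event(NULL, true);
}
```

Reading with the release flag cleared leaves the event in place for a later consumer, which is the behaviour a caller needs when it may still have to hand the event on to another handler.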
      
      This patch also updates kvm code to invoke get_mce_event to retrieve generic
      mce event rather than paca->opal_mce_evt.
      
      The KVM code always calls get_mce_event() with the release flag set to
      false so that the event is available for the Linux host machine.
      
      If a machine check occurs while we are in the guest, KVM tries to handle the
      error. If KVM is able to handle the MC error successfully, it enters the guest
      and delivers the machine check to the guest. If KVM is not able to handle the
      MC error, it exits the guest and passes control to the Linux host machine
      check handler, which then logs the MC event and decides how to handle it in
      the Linux host. In the failure case, KVM needs to make sure that the MC event
      is available for the Linux host to consume. Hence KVM always calls
      get_mce_event() with the release flag set to false, and later it invokes
      release_mce_event() only if it succeeds in handling the error.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  8. 30 Oct, 2013 2 commits
  9. 11 Oct, 2013 4 commits
  10. 10 Oct, 2013 1 commit
    • R
      powerpc: add explicit OF includes · 26a2056e
      Rob Herring committed
      Once of.h no longer includes prom.h, several OF headers will no longer
      be implicitly included. Add explicit includes of of_*.h as needed.
      Signed-off-by: Rob Herring <rob.herring@calxeda.com>
      Acked-by: Grant Likely <grant.likely@linaro.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: Olof Johansson <olof@lixom.net>
      Cc: linuxppc-dev@lists.ozlabs.org
  11. 14 Aug, 2013 2 commits
  12. 21 Jun, 2013 1 commit
  13. 14 May, 2013 1 commit
  14. 10 May, 2013 1 commit
  15. 08 May, 2013 1 commit
    • B
      powerpc/powernv: Properly drop characters if console is closed · 1de1455f
      Benjamin Herrenschmidt committed
      If the firmware returns an error such as "closed" (or hardware
      error), we should drop characters.
      
      Currently we only do that when a firmware compatible with OPAL v2
      APIs is detected, in the code that calls opal_console_write_buffer_space(),
      which didn't exist with OPAL v1 (or didn't work).
      
      However, when enabling early debug consoles, the flag indicating
      that v2 is supported isn't set yet, causing us, in case of errors
      or closed console, to spin forever.
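The behaviour the fix aims for, dropping characters on a hard error instead of retrying forever, can be sketched independently of the firmware interface. Everything below is hypothetical scaffolding for illustration: the status enum, the callback type, the two stub "firmware" functions and the retry-on-busy branch are assumptions and do not reflect the actual OPAL call signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical firmware write results. */
enum sketch_fw_status {
    SKETCH_FW_OK,
    SKETCH_FW_BUSY,       /* transient: worth retrying */
    SKETCH_FW_CLOSED,     /* console closed: retrying cannot help */
    SKETCH_FW_HW_ERROR    /* hardware error: retrying cannot help */
};

/* Writes up to *len bytes; on success *len is set to the bytes accepted. */
typedef enum sketch_fw_status (*sketch_fw_write)(const char *buf, size_t *len);

/*
 * Push total bytes to the console. Returns the number of bytes consumed
 * (always total, so the caller never spins) and reports via *dropped how
 * many of them were discarded because of a hard error.
 */
size_t sketch_console_put(sketch_fw_write fw_write, const char *buf,
                          size_t total, size_t *dropped)
{
    size_t written = 0;

    assert(fw_write && buf && dropped);
    *dropped = 0;
    while (written < total) {
        size_t len = total - written;
        enum sketch_fw_status rc = fw_write(buf + written, &len);

        if (rc == SKETCH_FW_BUSY)
            continue;                    /* transient, try again */
        if (rc != SKETCH_FW_OK) {
            *dropped = total - written;  /* hard error: drop the rest */
            return total;
        }
        written += len;
    }
    return total;
}

/* Stub firmware that reports a closed console. */
enum sketch_fw_status sketch_fw_always_closed(const char *buf, size_t *len)
{
    (void)buf;
    *len = 0;
    return SKETCH_FW_CLOSED;
}

/* Stub firmware that accepts everything it is given. */
enum sketch_fw_status sketch_fw_accept_all(const char *buf, size_t *len)
{
    (void)buf;
    (void)len;            /* *len already holds the full request */
    return SKETCH_FW_OK;
}
```

The key design choice is that the drop decision depends only on the status returned by the write itself, not on a separately detected capability flag, so it also holds for an early debug console that runs before any such flag is set.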
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
  16. 06 May, 2013 1 commit
  17. 20 Sep, 2011 4 commits