1. 22 Aug, 2016 1 commit
  2. 09 Aug, 2016 1 commit
  3. 01 Aug, 2016 1 commit
  4. 17 Jul, 2016 2 commits
  5. 15 Jul, 2016 2 commits
  6. 23 Jun, 2016 1 commit
    • M
      powerpc: Fix faults caused by radix patching of SLB miss handler · 6e914ee6
      Committed by Michael Ellerman
      As part of the Radix MMU support we added some feature sections in the
      SLB miss handler. These are intended to catch the case that we
      incorrectly take an SLB miss when Radix is enabled, and instead of
      crashing weirdly they bail out to a well defined exit path and trigger
      an oops.
      
      However the way they were written meant the bailout case was enabled by
      default until we did CPU feature patching.
      
      On powermacs the early debug prints in setup_system() can cause an SLB
      miss, which happens before code patching, and so the SLB miss handler
      would incorrectly bail out and crash during boot.
      
      Fix it by inverting the sense of the feature section, so that the code
      which is in place at boot is correct for the hash case. Once we
      determine we are using Radix - which will never happen on a powermac -
      only then do we patch in the bailout case which unconditionally jumps.
      
      Fixes: caca285e ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code")
      Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
      Tested-by: Denis Kirjanov <kda@linux-powerpc.org>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6e914ee6
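The inversion described in this commit can be illustrated with a toy model. The sketch below is not kernel code: the slot contents (`"nop"`, `"branch_to_bailout"`) and the helper names are hypothetical stand-ins for a patched feature section. It only assumes, as the message states, that an SLB miss can arrive before CPU feature patching has run.

```python
# Toy model of a boot-time-patched feature section (all names hypothetical).
# A "slot" holds the instruction currently in place inside the handler.

def slb_miss(slot):
    """Return what the toy SLB miss handler does with the given slot contents."""
    return "oops" if slot == "branch_to_bailout" else "handled"

def boot(default_slot, patched_slot, mmu, early_miss):
    """Simulate boot and return the outcome of each SLB miss, in order."""
    outcomes = []
    slot = default_slot                 # code as linked, before patching
    if early_miss:                      # e.g. early debug prints on a powermac
        outcomes.append(slb_miss(slot))
    slot = patched_slot[mmu]            # CPU feature patching runs here
    outcomes.append(slb_miss(slot))
    return outcomes

# Before the fix: the bailout is in place by default and patched out for hash.
OLD = dict(default_slot="branch_to_bailout",
           patched_slot={"hash": "nop", "radix": "branch_to_bailout"})
# After the fix: hash-correct code by default, bailout patched in for radix only.
NEW = dict(default_slot="nop",
           patched_slot={"hash": "nop", "radix": "branch_to_bailout"})

if __name__ == "__main__":
    print(boot(mmu="hash", early_miss=True, **OLD))
    print(boot(mmu="hash", early_miss=True, **NEW))
```

In this model the old layout turns an early miss on a hash MMU into an oops, while the new layout handles it and still bails out once radix has been patched in.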
  7. 20 Jun, 2016 1 commit
    • M
      KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt · fd7bacbc
      Committed by Mahesh Salgaonkar
      When a guest is assigned to a core it converts the host Timebase (TB)
      into guest TB by adding guest timebase offset before entering into
      guest. During guest exit it restores the guest TB to host TB. This means
      under certain conditions (Guest migration) host TB and guest TB can differ.
      
      When we get an HMI for TB related issues the opal HMI handler would
      try fixing errors and restore the correct host TB value. With no guest
      running, we don't have any issues. But with guest running on the core
      we run into TB corruption issues.
      
      If we get an HMI while in the guest, the current HMI handler invokes opal
      hmi handler before forcing guest to exit. The guest exit path subtracts
      the guest TB offset from the current TB value which may have already
      been restored with host value by opal hmi handler. This leads to incorrect
      host and guest TB values.
      
      With split-core, things become more complex. With split-core, TB also gets
      split and each subcore gets its own TB register. When an HMI handler fixes
      a TB error and restores the TB value, it affects all the TB values of
      sibling subcores on the same core. On TB errors all the threads in the core
      get an HMI. With existing code, the individual threads call the opal hmi
      handler independently, which can easily throw TB out of sync if we have
      guests running on subcores. Hence we need to coordinate with all the
      threads before making the opal hmi handler call followed by TB resync.
      
      This patch introduces a sibling subcore state structure (shared by all
      threads in the core) in paca which holds information about whether sibling
      subcores are in Guest mode or host mode. An array in_guest[] of size
      MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
      The subcore id is used as index into in_guest[] array. Only primary
      thread entering/exiting the guest is responsible to set/unset its
      designated array element.
      
      On TB error, we get an HMI interrupt on every thread on the core. Upon HMI,
      this patch will now force the guest to vacate the core/subcore. The primary
      thread from each subcore will then clear its respective in_guest[] entry
      during the guest exit path just after the
      guest->host partition switch is complete.
      
      All other threads that have just exited the guest OR were already in host
      will wait until all other subcores clear their respective entries.
      Once all the subcores have cleared their entries, all threads
      will make the call to the opal hmi handler.
      
      The opal hmi handler does not necessarily resync the TB value for
      every HMI interrupt. It does so only for HMIs caused by
      TB errors. For the rest, it does not touch the TB value. Hence, to make things
      simpler, the primary thread calls TB resync explicitly once for each
      core immediately after the opal hmi handler instead of subtracting the guest
      offset from TB. The TB resync call restores the TB with the host value.
      Thus we can be sure about the TB state.
      
      One of the primary threads exiting the guest will take up the
      responsibility of calling TB resync. It will use one of the top bits
      (bit 63) from the subcore state flags bitmap to make the decision. The first
      primary thread (among the subcores) that is able to set the bit will
      have to call the TB resync. All other threads will wait until TB
      resync is complete. Once TB resync is complete all threads will then
      proceed.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      fd7bacbc
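The TB arithmetic and the resync election described in this commit can be sketched as a small model. This is a single-threaded illustration under stated assumptions, not the kernel implementation: the class and function names are hypothetical, the timebase values are made up, and a real implementation would need an atomic test-and-set where this sketch uses a plain read-modify-write.

```python
MAX_SUBCORE_PER_CORE = 4
TB_RESYNC_BIT = 1 << 63      # top bit of the flags word, as in the message

class SiblingSubcoreState:
    """Toy, single-threaded stand-in for the state shared by a core's threads."""

    def __init__(self):
        self.flags = 0
        self.in_guest = [0] * MAX_SUBCORE_PER_CORE   # indexed by subcore id

    def all_subcores_in_host(self):
        """Threads wait on this before calling the opal hmi handler."""
        return not any(self.in_guest)

    def try_claim_resync(self):
        """First caller wins and must do the TB resync (assumed atomic in reality)."""
        if self.flags & TB_RESYNC_BIT:
            return False
        self.flags |= TB_RESYNC_BIT
        return True

def exit_tb_by_subtraction(tb_now, guest_offset):
    """Old exit path: correct only if tb_now still holds the guest TB."""
    return tb_now - guest_offset

def exit_tb_by_resync(host_tb):
    """New exit path: reload the host value regardless of the current TB."""
    return host_tb

if __name__ == "__main__":
    host_tb, offset = 1000, 500
    # The HMI handler has already restored the host value, so subtracting
    # the guest offset again produces a wrong host TB.
    print(exit_tb_by_subtraction(host_tb, offset), exit_tb_by_resync(host_tb))
```

With a host TB of 1000 and a guest offset of 500, subtracting after the handler has already restored the host value yields 500, which is the corruption the message describes; the explicit resync avoids depending on what the handler did.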
  8. 11 May, 2016 2 commits
  9. 21 Apr, 2016 2 commits
    • H
      powerpc/book3s64: Remove __end_handlers marker · 057b6d7e
      Committed by Hari Bathini
      The __end_handlers marker was intended to mark the point up to which code
      gets called from exception prologs. But it hasn't kept pace with code
      changes. Case in point: slb_miss_realmode is called from exception
      prolog code but isn't below the __end_handlers marker. So the
      __end_handlers marker is as good as a comment, but can be misleading if
      it isn't in sync with the code, as is the case now. Let us avoid this
      confusion by having a better comment and removing the __end_handlers
      marker altogether.
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      057b6d7e
    • H
      powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel · 8ed8ab40
      Committed by Hari Bathini
      Some of the interrupt vectors on 64-bit POWER server processors are only
      32 bytes long (8 instructions), which is not enough for the full
      first-level interrupt handler. For these we need to branch to an
      out-of-line (OOL) handler. But when we are running a relocatable kernel,
      interrupt vectors till __end_interrupts marker are copied down to real
      address 0x100. So, branching to labels (ie. OOL handlers) outside this
      section must be handled differently (see LOAD_HANDLER()), considering
      relocatable kernel, which would need at least 4 instructions.
      
      However, branching from interrupt vector means that we corrupt the
      CFAR (come-from address register) on POWER7 and later processors as
      mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
      that contains the part up to the point where the CFAR is saved in the
      PACA should be part of the short interrupt vectors before we branch out
      to OOL handlers.
      
      But as mentioned already, there are interrupt vectors on 64-bit POWER
      server processors that are only 32 bytes long (like vectors 0x4f00,
      0x4f20, etc.), which cannot accommodate the above two cases at the same
      time owing to space constraints. Currently, in these interrupt vectors,
      we simply branch out to OOL handlers, without using LOAD_HANDLER(),
      which leaves us vulnerable when running a relocatable kernel (e.g. kdump
      case). While this has been the case for some time now and kdump is used
      widely, we were fortunate not to see any problems so far, for three
      reasons:
      
        1. In almost all cases, production kernel (relocatable) is used for
           kdump as well, which would mean that crashed kernel's OOL handler
           would be at the same place where we end up branching to, from short
           interrupt vector of kdump kernel.
        2. Also, OOL handler was unlikely the reason for crash in almost all
           the kdump scenarios, which meant we had a sane OOL handler from
           crashed kernel that we branched to.
        3. On most 64-bit POWER server processors, page size is large enough
           that marking interrupt vector code as executable (see commit
           429d2e83) leads to marking OOL handler code from crashed kernel,
           that sits right below interrupt vector code from kdump kernel, as
           executable as well.
      
      Let us fix this by moving the __end_interrupts marker down past OOL
      handlers to make sure that we also copy OOL handlers to real address
      0x100 when running a relocatable kernel.
      
      This fix has been tested successfully in kdump scenario, on an LPAR with
      4K page size by using different default/production kernel and kdump
      kernel.
      
      Also tested by manually corrupting the OOL handlers in the first kernel
      and then kdump'ing, and then causing the OOL handlers to fire - mpe.
      
      Fixes: c1fb6816 ("powerpc: Add relocation on exception vector handlers")
      Cc: stable@vger.kernel.org
      Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      8ed8ab40
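The space constraint in this commit is simple arithmetic, sketched below. The instruction counts (6 for EXCEPTION_PROLOG_0, at least 4 for a LOAD_HANDLER()-style branch, 32-byte vectors) come from the message; the function names and the example offsets are hypothetical and chosen only for illustration.

```python
INSN_BYTES = 4  # every POWER instruction is 4 bytes long

def insns_that_fit(vector_bytes):
    """Number of instructions that fit in an interrupt vector slot."""
    return vector_bytes // INSN_BYTES

def fits_in_short_vector(prolog_insns, branch_insns, vector_bytes=32):
    """True if the prolog plus the branch sequence fit in the slot."""
    return prolog_insns + branch_insns <= insns_that_fit(vector_bytes)

def relative_branch_lands_in_copy(handler_off, end_interrupts_off):
    """A relative branch from a copied vector keeps its offset, so it lands at
    handler_off in low memory. That is the intended code only if the handler
    lies before the end-of-copy marker and was therefore copied down too."""
    return handler_off < end_interrupts_off

if __name__ == "__main__":
    print(insns_that_fit(32))                 # slots available in a short vector
    print(fits_in_short_vector(6, 4))         # prolog + LOAD_HANDLER-style branch
    print(fits_in_short_vector(6, 1))         # prolog + plain relative branch
```

Under these counts the prolog plus a four-instruction indirect branch needs 10 slots where only 8 exist, while the prolog plus a single relative branch fits. Moving the marker past the OOL handlers is what makes that single relative branch land on copied code.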
  10. 11 Apr, 2016 1 commit
    • M
      powerpc/mm: Remove long disabled SLB code · 1f4c66e8
      Committed by Michael Ellerman
      We have a bunch of SLB related code in the tree which is there to handle
      dynamic VSIDs - but currently it's all disabled at compile time. The
      comments say "Keep that around for when we re-implement dynamic VSIDs".
      
      But that was over 10 years ago (commit 3c726f8d ("[PATCH] ppc64:
      support 64k pages")). The chance that it would still work unchanged is
      minimal, and in the meantime it's confusing to folks browsing/grepping
      the code. If we ever want to re-instate it, it's in the git history.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      1f4c66e8
  11. 17 Dec, 2015 2 commits
    • M
      powerpc/kernel: Combine vec/loc for STD_EXCEPTION_PSERIES · 2613265c
      Committed by Michael Ellerman
      The STD_EXCEPTION_PSERIES macro takes both a vector number, and a
      location (memory address). However both are always identical, so combine
      them to save repeating ourselves.
      
      This does mean an exception handler must always exist at the location in
      memory that matches its vector number. But that's OK because this is the
      "STD" macro (standard), which does exactly that. We have other macros
      for the other cases, eg. STD_EXCEPTION_PSERIES_OOL (out of line).
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2613265c
    • M
      powerpc/kernel: Drop HMT_MEDIUM_PPR_DISCARD · d6265aea
      Committed by Michael Ellerman
      HMT_MEDIUM_PPR_DISCARD is a macro which is present at the start of most
      of our first level exception handlers. It conditionally executes a
      HMT_MEDIUM instruction, which sets the processor priority to medium.
      
      On modern systems, i.e. Power7 and later, it is nop'ed out at boot.
      All it does is make the exception vectors more cramped, and consume 4
      bytes of icache.
      
      On old systems it has the effect of boosting the processor priority at
      the start of exception processing. If we were previously in the idle
      loop for example, we may be at low or very low priority. This is
      desirable as we want to process the exception as fast as possible.
      
      However looking closely at the generated code, we see that in all cases
      we execute another HMT_MEDIUM just four instructions later. With code
      patching applied, the final code on an old (Power6) system will look
      like, eg:
      
        c000000000000300 <data_access_pSeries>:
        c000000000000300:	7c 42 13 78	mr	r2,r2		<-
        c000000000000304:	7d b2 43 a6	mtsprg	2,r13
        c000000000000308:	7d b1 42 a6	mfsprg	r13,1
        c00000000000030c:	f9 2d 00 80	std	r9,128(r13)
        c000000000000310:	60 00 00 00	nop
        c000000000000314:	7c 42 13 78	mr	r2,r2		<-
      
      So I suggest that the added code complexity of HMT_MEDIUM_PPR_DISCARD is
      not justified by the benefit of boosting the processor priority for the
      duration of four instructions, and therefore we drop it.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d6265aea
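The two marked words in the listing above (`7c 42 13 78`, shown by the disassembler as `mr r2,r2`) are the `or 2,2,2` priority hint that HMT_MEDIUM expands to. A minimal sketch of decoding them follows; the helper names are hypothetical, and the field layout is the standard X-form (primary opcode 31, extended opcode 444 for `or`).

```python
def word_from_hex(text):
    """Convert a big-endian byte dump such as '7c 42 13 78' to an integer."""
    return int.from_bytes(bytes.fromhex(text), "big")

def decode_x_form(word):
    """Split a 32-bit instruction word into X-form fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,
        "rc": word & 1,
    }

def is_hmt_medium(word):
    """True for 'or 2,2,2', the medium-priority hint."""
    f = decode_x_form(word)
    return (f["opcode"], f["xo"], f["rs"], f["ra"], f["rb"], f["rc"]) == (31, 444, 2, 2, 2, 0)

def insns_between(addr_a, addr_b):
    """Count the instructions strictly between two instruction addresses."""
    return (addr_b - addr_a) // 4 - 1

if __name__ == "__main__":
    print(is_hmt_medium(word_from_hex("7c 42 13 78")))
    print(insns_between(0x300, 0x314))
```

Applied to the listing, both marked words decode as the hint, and there are four instructions between offsets 0x300 and 0x314, which matches the message's point that the first boost covers only a handful of instructions.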
  12. 14 Dec, 2015 1 commit
  13. 01 Dec, 2015 1 commit
    • P
      powerpc/64: Include KVM guest test in all interrupt vectors · 31a40e2b
      Committed by Paul Mackerras
      Currently, if HV KVM is configured but PR KVM isn't, we don't include
      a test to see whether we were interrupted in KVM guest context for the
      set of interrupts which get delivered directly to the guest by hardware
      if they occur in the guest.  This includes things like program
      interrupts.
      
      However, the recent bug where userspace could set the MSR for a VCPU
      to have an illegal value in the TS field, and thus cause a TM Bad Thing
      type of program interrupt on the hrfid that enters the guest, showed that
      we can never be completely sure that these interrupts can never occur
      in the guest entry/exit code.  If one of these interrupts does happen
      and we have HV KVM configured but not PR KVM, then we end up trying to
      run the handler in the host with the MMU set to the guest MMU context,
      which generally ends badly.
      
      Thus, for robustness it is better to have the test in every interrupt
      vector, so that if some way is found to trigger some interrupt in the
      guest entry/exit path, we can handle it without immediately crashing
      the host.
      
      This means that the distinction between KVMTEST and KVMTEST_PR goes
      away.  Thus we delete KVMTEST_PR and associated macros and use KVMTEST
      everywhere that we previously used either KVMTEST_PR or KVMTEST.  It
      also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR,
      so we delete SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      31a40e2b
  14. 02 Jun, 2015 2 commits
  15. 23 Mar, 2015 1 commit
  16. 15 Dec, 2014 2 commits
    • S
      powernv/powerpc: Add winkle support for offline cpus · 77b54e9f
      Committed by Shreyas B. Prabhu
      Winkle is a deep idle state supported in power8 chips. A core enters
      winkle when all the threads of the core enter winkle. In this state
      power supply to the entire chiplet i.e core, private L2 and private L3
      is turned off. As a result it gives higher powersavings compared to
      sleep.
      
      But entering winkle results in a total hypervisor state loss. Hence the
      hypervisor context has to be preserved before entering winkle and
      restored upon wake up.
      
      Power-on Reset Engine (PORE) is a dedicated engine which is responsible
      for powering on the chiplet during wake up. It can be programmed to
      restore the register contents of a few specific registers. This patch
      uses PORE to restore register state wherever possible and uses the stack
      to save and restore the rest of the necessary registers.
      
      With hypervisor state restore, things fall under three categories:
      per-core state, per-subcore state and per-thread state. To manage this,
      extend the infrastructure introduced for sleep. Mainly we add a paca
      variable subcore_sibling_mask. Using this and the core_idle_state we can
      distinguish the first thread in core and subcore.
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      77b54e9f
    • S
      powernv/cpuidle: Redesign idle states management · 7cba160a
      Committed by Shreyas B. Prabhu
      Deep idle states like sleep and winkle are per core idle states. A core
      enters these states only when all the threads enter either the
      particular idle state or a deeper one. There are tasks like fastsleep
      hardware bug workaround and hypervisor core state save which have to be
      done only by the last thread of the core entering deep idle state and
      similarly tasks like timebase resync and hypervisor core register restore
      that have to be done only by the first thread waking up from these
      states.
      
      The current idle state management does not have a way to distinguish the
      first/last thread of the core waking/entering idle states. Tasks like
      timebase resync are done for all the threads. This is not only
      suboptimal, but can cause functionality issues when subcores and kvm are
      involved.
      
      This patch adds the necessary infrastructure to track idle states of
      threads in a per-core structure. It uses this info to perform tasks like
      fastsleep workaround and timebase resync only once per core.
      Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
      Originally-by: Preeti U. Murthy <preeti@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: linux-pm@vger.kernel.org
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7cba160a
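The first/last-thread tracking described in this commit can be modelled with a per-core bitmask. The sketch below is a hypothetical, single-threaded illustration: the class name, the bit convention (a set bit means the thread is not idle) and the thread count are assumptions, and it omits the locking that real per-core state shared between threads would need.

```python
class CoreIdleState:
    """Toy per-core tracker: bit n is set while thread n is not idle."""

    def __init__(self, threads_per_core=8):
        self.running = (1 << threads_per_core) - 1

    def enter_idle(self, thread):
        """Mark a thread idle. Returns True if it was the last one running,
        i.e. the thread that should do last-thread work before the core
        enters the deep idle state."""
        self.running &= ~(1 << thread)
        return self.running == 0

    def exit_idle(self, thread):
        """Mark a thread running. Returns True if it is the first to wake,
        i.e. the thread that should do the once-per-core restore work
        such as timebase resync."""
        first = self.running == 0
        self.running |= 1 << thread
        return first

if __name__ == "__main__":
    core = CoreIdleState(threads_per_core=4)
    print([core.enter_idle(t) for t in range(4)])
    print(core.exit_idle(2), core.exit_idle(0))
```

Only the last thread to enter and the first thread to wake see `True`, so per-core work runs once per core instead of once per thread.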
  17. 08 Dec, 2014 1 commit
    • P
      powerpc/powernv: Return to cpu offline loop when finished in KVM guest · 56548fc0
      Committed by Paul Mackerras
      When a secondary hardware thread has finished running a KVM guest, we
      currently put that thread into nap mode using a nap instruction in
      the KVM code.  This changes the code so that instead of doing a nap
      instruction directly, we instead cause the call to power7_nap() that
      put the thread into nap mode to return.  The reason for doing this is
      to avoid the KVM code having to know what low-power mode to
      put the thread into.
      
      In the case of a secondary thread used to run a KVM guest, the thread
      will be offline from the point of view of the host kernel, and the
      relevant power7_nap() call is the one in pnv_smp_cpu_disable().
      In this case we don't want to clear pending IPIs in the offline loop
      in that function, since that might cause us to miss the wakeup for
      the next time the thread needs to run a guest.  To tell whether or
      not to clear the interrupt, we use the SRR1 value returned from
      power7_nap(), and check if it indicates an external interrupt.  We
      arrange that the return from power7_nap() when we have finished running
      a guest returns 0, so pending interrupts don't get flushed in that
      case.
      
      Note that it is important a secondary thread that has finished
      executing in the guest, or that didn't have a guest to run, should
      not return to power7_nap's caller while the kvm_hstate.hwthread_req
      flag in the PACA is non-zero, because the return from power7_nap
      will reenable the MMU, and the MMU might still be in guest context.
      In this situation we spin at low priority in real mode waiting for
      hwthread_req to become zero.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      56548fc0
  18. 05 Dec, 2014 1 commit
    • A
      powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault · aefa5688
      Committed by Aneesh Kumar K.V
      updatepp can get called for a nohpte fault when we find from the linux
      page table that the translation was hashed before. In that case
      we are sure that there is no existing translation, hence we could
      avoid doing tlbie.
      
      We could possibly race with a parallel fault filling the TLB. But
      that should be ok because updatepp is only ever relaxing permissions.
      We also look at linux pte permission bits when filling hash pte
      permission bits. We also hold the linux pte busy bits while
      inserting/updating a hashpte entry, hence a parallel update of
      linux pte is not possible. On the other hand mprotect involves
      ptep_modify_prot_start which causes a hpte invalidate and not updatepp.
      
      Performance numbers:
      We use random_access_bench written by Anton.
      
      Kernel with THP disabled and smaller hash page table size.
      
          86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
           2.10%  random_access_b  random_access_bench              [.] doit
           1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
           1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
           1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
           0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
           0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
      
      With Fix:
      
          27.54%  random_access_b  random_access_bench              [.] doit
          22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
           5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
           5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
           5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
           4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
           3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
           1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      aefa5688
  19. 02 Dec, 2014 1 commit
  20. 12 Nov, 2014 1 commit
    • S
      powerpc: Save/restore PPR for KVM hypercalls · 8b91a255
      Committed by Suresh E. Warrier
      The system call FLIH (first-level interrupt handler) at 0xc00
      unconditionally sets hardware priority to medium. For hypercalls, this
      means we lose guest OS priority. The front end (do_kvm_0x**) to the
      KVM interrupt handler always assumes that PPR priority is saved in
      PACA exception save area, so it copies this to the kvm_hstate
      structure. For hypercalls, this would be the saved priority from any
      previous exception. Eventually, the guest gets resumed with an
      incorrect priority.
      
      The fix is to save the PPR priority in PACA exception save area before
      switching HMT priorities in the FLIH so that existing code described above
      in the KVM interrupt handler can copy it from there into the VCPU's saved
      context.
      Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      [mpe: Dropped HMT_MEDIUM_PPR_DISCARD and reworded comment]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      8b91a255
  21. 10 Oct, 2014 1 commit
  22. 13 Aug, 2014 1 commit
    • G
      powerpc: Fix "attempt to move .org backwards" error · 11d54904
      Committed by Guenter Roeck
      Once again, we see
      
      arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
      arch/powerpc/kernel/exceptions-64s.S:865: Error: attempt to move .org backwards
      arch/powerpc/kernel/exceptions-64s.S:866: Error: attempt to move .org backwards
      arch/powerpc/kernel/exceptions-64s.S:890: Error: attempt to move .org backwards
      
      when compiling ppc:allmodconfig.
      
      This time the problem has been caused by commit 0869b6fd
      ("powerpc/book3s: Add basic infrastructure to handle HMI in Linux"),
      which adds functions hmi_exception_early and hmi_exception_after_realmode
      into a critical (size-limited) code area, even though that does not appear
      to be necessary.
      
      Move those functions to a non-critical area of the file.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      11d54904
  23. 05 Aug, 2014 1 commit
  24. 28 Jul, 2014 3 commits
  25. 12 Jun, 2014 1 commit
  26. 11 Jun, 2014 2 commits
    • M
      powerpc/book3s: Add stack overflow check in machine check handler. · e75ad93a
      Committed by Mahesh Salgaonkar
      Currently the machine check handler does not check for stack overflow for
      nested machine checks. If we repeatedly hit another MCE from the same
      address while inside the machine check handler, we risk a stack
      overflow, which can cause huge memory corruption. This patch limits the
      nested MCE level to 4 and panics when we cross level 4.
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      e75ad93a
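The nesting limit in this commit amounts to a depth counter checked on entry. The sketch below is a hypothetical model of that idea only: the class and exception names are invented, and the limit of 4 is the figure given in the message.

```python
MAX_MCE_NESTING = 4   # limit stated in the commit message

class MachineCheckPanic(Exception):
    """Stands in for the kernel panicking; purely illustrative."""

class MceNesting:
    """Toy depth counter for nested machine check handling."""

    def __init__(self):
        self.depth = 0

    def enter(self):
        """Called on handler entry; panics once nesting crosses the limit."""
        self.depth += 1
        if self.depth > MAX_MCE_NESTING:
            raise MachineCheckPanic(f"nested MCE depth {self.depth} exceeds limit")

    def leave(self):
        """Called on handler exit."""
        self.depth -= 1

if __name__ == "__main__":
    state = MceNesting()
    try:
        while True:
            state.enter()      # simulate an MCE recurring inside the handler
    except MachineCheckPanic as exc:
        print(exc)
```

Bounding the depth trades a runaway stack for a deliberate panic: four nested entries are tolerated and the fifth stops the system before the stack can overflow.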
    • M
      powerpc/book3s: Fix machine check handling for unhandled errors · 2749a2f2
      Committed by Mahesh Salgaonkar
      Current code does not check for unhandled/unrecovered errors and returns from
      the interrupt if it is a recoverable exception, which in turn triggers the same
      machine check exception in a loop, causing the hypervisor to be unresponsive.
      
      This patch fixes this situation and forces hypervisor to panic for
      unhandled/unrecovered errors.
      
      This patch also fixes another issue where unrecoverable_exception routine
      was called in real mode in case of unrecoverable exception (MSR_RI = 0).
      This causes another exception vector 0x300 (data access) during system crash
      leading to confusion while debugging cause of the system crash.
      
      Also turn ME bit off while going down, so that when another MCE is hit during
      panic path, system will checkstop and hypervisor will get restarted cleanly
      by SP.
      
      With the above fixes we now throw correct console messages (see below) while
      crashing the system in case of unhandled/unrecoverable machine checks.
      
      --------------
      Severe Machine check interrupt [Not recovered]
        Initiator: CPU
        Error type: UE [Instruction fetch]
          Effective address: 0000000030002864
      Oops: Machine check, sig: 7 [#1]
      SMP NR_CPUS=2048 NUMA PowerNV
      Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork]
      CPU: 36 PID: 55162 Comm: bash Tainted: G           O 3.14.0mce #1
      task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000
      NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c
      REGS: c000000007ec3d80 TRAP: 0200   Tainted: G           O  (3.14.0mce)
      MSR: 9000000000041002 <SF,HV,ME,RI>  CR: 28222848  XER: 20000000
      CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1
      GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864
      GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9
      GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728
      GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000
      GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc
      GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000
      GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250
      GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000
      NIP [0000000030002864] 0x30002864
      LR [00000000300151a4] 0x300151a4
      Call Trace:
      Instruction dump:
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
      ---[ end trace 7285f0beac1e29d3 ]---
      
      Sending IPI to other CPUs
      IPI complete
      OPAL V3 detected !
      --------------
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      2749a2f2
  27. 23 Apr, 2014 4 commits