1. 28 8月, 2013 1 次提交
    • P
      KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls · 8b23de29
      Paul Mackerras 提交于
      It turns out that if we exit the guest due to a hcall instruction (sc 1),
      and the loading of the instruction in the guest exit path fails for any
      reason, the call to kvmppc_ld() in kvmppc_get_last_inst() fetches the
      instruction after the hcall instruction rather than the hcall itself.
      This in turn means that the instruction doesn't get recognized as an
      hcall in kvmppc_handle_exit_pr() but gets passed to the guest kernel
      as a sc instruction.  That usually results in the guest kernel getting
      a return code of 38 (ENOSYS) from an hcall, which often triggers a
      BUG_ON() or other failure.
      
      This fixes the problem by adding a new variant of kvmppc_get_last_inst()
      called kvmppc_get_last_sc(), which fetches the instruction if necessary
      from pc - 4 rather than pc.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      8b23de29
  2. 11 7月, 2013 1 次提交
  3. 10 7月, 2013 1 次提交
    • P
      KVM: PPC: Book3S HV: Correct tlbie usage · 54480501
      Paul Mackerras 提交于
      This corrects the usage of the tlbie (TLB invalidate entry) instruction
      in HV KVM.  The tlbie instruction changed between PPC970 and POWER7.
      On the PPC970, the bit to select large vs. small page is in the instruction,
      not in the RB register value.  This changes the code to use the correct
      form on PPC970.
      
      On POWER7 we were calculating the AVAL (Abbreviated Virtual Address, Lower)
      field of the RB value incorrectly for 64k pages.  This fixes it.
      
      Since we now have several cases to handle for the tlbie instruction, this
      factors out the code to do a sequence of tlbies into a new function,
      do_tlbies(), and calls that from the various places where the code was
      doing tlbie instructions inline.  It also makes kvmppc_h_bulk_remove()
      use the same global_invalidates() function for determining whether to do
      local or global TLB invalidations as is used in other places, for
      consistency, and also to make sure that kvm->arch.need_tlb_flush gets
      updated properly.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      54480501
  4. 08 7月, 2013 2 次提交
  5. 30 6月, 2013 1 次提交
    • P
      KVM: PPC: Book3S PR: Allow guest to use 1TB segments · 0f296829
      Paul Mackerras 提交于
      With this, the guest can use 1TB segments as well as 256MB segments.
      Since we now have the situation where a single emulated guest segment
      could correspond to multiple shadow segments (as the shadow segments
      are still 256MB segments), this adds a new kvmppc_mmu_flush_segment()
      to scan for all shadow segments that need to be removed.
      
      This restructures the guest HPT (hashed page table) lookup code to
      use the correct hashing and matching functions for HPTEs within a
      1TB segment.  We use the standard hpt_hash() function instead of
      open-coding the hash calculation, and we use HPTE_V_COMPARE() with
      an AVPN value that has the B (segment size) field included.  The
      calculation of avpn is done a little earlier since it doesn't change
      in the loop starting at the do_second label.
      
      The computation in kvmppc_mmu_book3s_64_esid_to_vsid() changes so that
      it returns a 256MB VSID even if the guest SLB entry is a 1TB entry.
      This is because the users of this function are creating 256MB SLB
      entries.  We set a new VSID_1T flag so that entries created from 1T
      segments don't collide with entries from 256MB segments.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      0f296829
  6. 29 6月, 2013 1 次提交
  7. 26 6月, 2013 1 次提交
  8. 19 6月, 2013 1 次提交
  9. 15 6月, 2013 1 次提交
    • M
      powerpc: Fix stack overflow crash in resume_kernel when ftracing · 0e37739b
      Michael Ellerman 提交于
      It's possible for us to crash when running with ftrace enabled, eg:
      
        Bad kernel stack pointer bffffd12 at c00000000000a454
        cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
            pc: c00000000000a454: resume_kernel+0x34/0x60
            lr: c00000000000335c: performance_monitor_common+0x15c/0x180
            sp: bffffd12
           msr: 8000000000001032
           dar: bffffd12
         dsisr: 42000000
      
      If we look at current's stack (paca->__current->stack) we see it is
      equal to c0000002ecab0000. Our stack is 16K, and comparing to
      paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
      kernel stack. This leads to us writing over our struct thread_info, and
      in this case we have corrupted thread_info->flags and set
      _TIF_EMULATE_STACK_STORE.
      
      Dumping the stack we see:
      
        3:mon> t c0000002ecab0000
        [c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
        [c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
        --- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
        [c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
        [c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
        [c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
        [c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
        [c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
        [c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
        [c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
        --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
        [c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
        [c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
        [c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
        [c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
        [c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
        [c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
        --- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
      
        ... and so on
      
      __ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
      path. At that point the irq state is not consistent, ie. interrupts are
      hard disabled (by the exception entry), but the paca soft-enabled flag
      may be out of sync.
      
      This leads to the local_irq_restore() in trace_graph_entry() actually
      enabling interrupts, which we do not want. Because we have not yet
      reprogrammed the decrementer we immediately take another decrementer
      exception, and recurse.
      
      The fix is twofold. Firstly make sure we call DISABLE_INTS before
      calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
      the irq state in the paca with the hardware, making it safe again to
      call local_irq_save/restore().
      
      Although that should be sufficient to fix the bug, we also mark the
      runlatch routines as notrace. They are called very early in the
      exception entry and we are asking for trouble tracing them. They are
      also fairly uninteresting and tracing them just adds unnecessary
      overhead.
      
      [ This regression was introduced by fe1952fc
        "powerpc: Rework runlatch code" by myself --BenH
      ]
      
      CC: <stable@vger.kernel.org> [v3.4+]
      Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0e37739b
  10. 11 6月, 2013 1 次提交
  11. 10 6月, 2013 1 次提交
  12. 04 6月, 2013 1 次提交
  13. 01 6月, 2013 6 次提交
  14. 28 5月, 2013 1 次提交
  15. 24 5月, 2013 2 次提交
  16. 14 5月, 2013 6 次提交
  17. 10 5月, 2013 2 次提交
  18. 08 5月, 2013 1 次提交
  19. 07 5月, 2013 1 次提交
  20. 06 5月, 2013 5 次提交
  21. 02 5月, 2013 3 次提交