1. 25 9月, 2014 1 次提交
  2. 28 7月, 2014 1 次提交
  3. 22 7月, 2014 1 次提交
    • A
      powerpc: subpage_protect: Increase the array size to take care of 64TB · dad6f37c
      Aneesh Kumar K.V 提交于
      We now support TASK_SIZE of 16TB, hence the array should be 8.
      
      Fixes the below crash:
      
      Unable to handle kernel paging request for data at address 0x000100bd
      Faulting instruction address: 0xc00000000004f914
      cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90]
          pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0
          lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0
          sp: c000000fea75fd10
         msr: 9000000000009032
         dar: 100bd
       dsisr: 40000000
        current = 0xc000000fea6ae490
        paca    = 0xc00000000fb8ab00   softe: 0        irq_happened: 0x00
          pid   = 8237, comm = a.out
      enter ? for help
      [c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      dad6f37c
  4. 11 10月, 2013 1 次提交
  5. 25 6月, 2013 1 次提交
  6. 21 6月, 2013 1 次提交
  7. 30 4月, 2013 4 次提交
  8. 17 3月, 2013 3 次提交
  9. 06 12月, 2012 1 次提交
    • P
      KVM: PPC: Book3S HV: Handle guest-caused machine checks on POWER7 without panicking · b4072df4
      Paul Mackerras 提交于
      Currently, if a machine check interrupt happens while we are in the
      guest, we exit the guest and call the host's machine check handler,
      which tends to cause the host to panic.  Some machine checks can be
      triggered by the guest; for example, if the guest creates two entries
      in the SLB that map the same effective address, and then accesses that
      effective address, the CPU will take a machine check interrupt.
      
      To handle this better, when a machine check happens inside the guest,
      we call a new function, kvmppc_realmode_machine_check(), while still in
      real mode before exiting the guest.  On POWER7, it handles the cases
      that the guest can trigger, either by flushing and reloading the SLB,
      or by flushing the TLB, and then it delivers the machine check interrupt
      directly to the guest without going back to the host.  On POWER7, the
      OPAL firmware patches the machine check interrupt vector so that it
      gets control first, and it leaves behind its analysis of the situation
      in a structure pointed to by the opal_mc_evt field of the paca.  The
      kvmppc_realmode_machine_check() function looks at this, and if OPAL
      reports that there was no error, or that it has handled the error, we
      also go straight back to the guest with a machine check.  We have to
      deliver a machine check to the guest since the machine check interrupt
      might have trashed valid values in SRR0/1.
      
      If the machine check is one we can't handle in real mode, and one that
      OPAL hasn't already handled, or on PPC970, we exit the guest and call
      the host's machine check handler.  We do this by jumping to the
      machine_check_fwnmi label, rather than absolute address 0x200, because
      we don't want to re-execute OPAL's handler on POWER7.  On PPC970, the
      two are equivalent because address 0x200 just contains a branch.
      
      Then, if the host machine check handler decides that the system can
      continue executing, kvmppc_handle_exit() delivers a machine check
      interrupt to the guest -- once again to let the guest know that SRR0/1
      have been modified.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      [agraf: fix checkpatch warnings]
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      b4072df4
  10. 17 9月, 2012 5 次提交
  11. 28 3月, 2012 1 次提交
  12. 05 3月, 2012 1 次提交
    • P
      KVM: PPC: Implement MMIO emulation support for Book3S HV guests · 697d3899
      Paul Mackerras 提交于
      This provides the low-level support for MMIO emulation in Book3S HV
      guests.  When the guest tries to map a page which is not covered by
      any memslot, that page is taken to be an MMIO emulation page.  Instead
      of inserting a valid HPTE, we insert an HPTE that has the valid bit
      clear but another hypervisor software-use bit set, which we call
      HPTE_V_ABSENT, to indicate that this is an absent page.  An
      absent page is treated much like a valid page as far as guest hcalls
      (H_ENTER, H_REMOVE, H_READ etc.) are concerned, except of course that
      an absent HPTE doesn't need to be invalidated with tlbie since it
      was never valid as far as the hardware is concerned.
      
      When the guest accesses a page for which there is an absent HPTE, it
      will take a hypervisor data storage interrupt (HDSI) since we now set
      the VPM1 bit in the LPCR.  Our HDSI handler for HPTE-not-present faults
      looks up the hash table and if it finds an absent HPTE mapping the
      requested virtual address, will switch to kernel mode and handle the
      fault in kvmppc_book3s_hv_page_fault(), which at present just calls
      kvmppc_hv_emulate_mmio() to set up the MMIO emulation.
      
      This is based on an earlier patch by Benjamin Herrenschmidt, but since
      heavily reworked.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      Signed-off-by: NAvi Kivity <avi@redhat.com>
      697d3899
  13. 19 12月, 2011 1 次提交
  14. 20 9月, 2011 1 次提交
    • B
      powerpc: Hugetlb for BookE · 41151e77
      Becky Bruce 提交于
      Enable hugepages on Freescale BookE processors.  This allows the kernel to
      use huge TLB entries to map pages, which can greatly reduce the number of
      TLB misses and the amount of TLB thrashing experienced by applications with
      large memory footprints.  Care should be taken when using this on FSL
      processors, as the number of large TLB entries supported by the core is low
      (16-64) on current processors.
      
      The supported set of hugepage sizes include 4m, 16m, 64m, 256m, and 1g.
      Page sizes larger than the max zone size are called "gigantic" pages and
      must be allocated on the command line (and cannot be deallocated).
      
      This is currently only fully implemented for Freescale 32-bit BookE
      processors, but there is some infrastructure in the code for
      64-bit BooKE.
      Signed-off-by: NBecky Bruce <beckyb@kernel.crashing.org>
      Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      41151e77
  15. 12 7月, 2011 1 次提交
    • P
      KVM: PPC: Add support for Book3S processors in hypervisor mode · de56a948
      Paul Mackerras 提交于
      This adds support for KVM running on 64-bit Book 3S processors,
      specifically POWER7, in hypervisor mode.  Using hypervisor mode means
      that the guest can use the processor's supervisor mode.  That means
      that the guest can execute privileged instructions and access privileged
      registers itself without trapping to the host.  This gives excellent
      performance, but does mean that KVM cannot emulate a processor
      architecture other than the one that the hardware implements.
      
      This code assumes that the guest is running paravirtualized using the
      PAPR (Power Architecture Platform Requirements) interface, which is the
      interface that IBM's PowerVM hypervisor uses.  That means that existing
      Linux distributions that run on IBM pSeries machines will also run
      under KVM without modification.  In order to communicate the PAPR
      hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
      to include/linux/kvm.h.
      
      Currently the choice between book3s_hv support and book3s_pr support
      (i.e. the existing code, which runs the guest in user mode) has to be
      made at kernel configuration time, so a given kernel binary can only
      do one or the other.
      
      This new book3s_hv code doesn't support MMIO emulation at present.
      Since we are running paravirtualized guests, this isn't a serious
      restriction.
      
      With the guest running in supervisor mode, most exceptions go straight
      to the guest.  We will never get data or instruction storage or segment
      interrupts, alignment interrupts, decrementer interrupts, program
      interrupts, single-step interrupts, etc., coming to the hypervisor from
      the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
      exception entry path so that we don't have to do the KVM test on entry
      to those exception handlers.
      
      We do however get hypervisor decrementer, hypervisor data storage,
      hypervisor instruction storage, and hypervisor emulation assist
      interrupts, so we have to handle those.
      
      In hypervisor mode, real-mode accesses can access all of RAM, not just
      a limited amount.  Therefore we put all the guest state in the vcpu.arch
      and use the shadow_vcpu in the PACA only for temporary scratch space.
      We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
      anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
      We don't have a shared page with the guest, but we still need a
      kvm_vcpu_arch_shared struct to store the values of various registers,
      so we include one in the vcpu_arch struct.
      
      The POWER7 processor has a restriction that all threads in a core have
      to be in the same partition.  MMU-on kernel code counts as a partition
      (partition 0), so we have to do a partition switch on every entry to and
      exit from the guest.  At present we require the host and guest to run
      in single-thread mode because of this hardware restriction.
      
      This code allocates a hashed page table for the guest and initializes
      it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
      require that the guest memory is allocated using 16MB huge pages, in
      order to simplify the low-level memory management.  This also means that
      we can get away without tracking paging activity in the host for now,
      since huge pages can't be paged or swapped.
      
      This also adds a few new exports needed by the book3s_hv code.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      de56a948
  16. 04 5月, 2011 1 次提交
  17. 30 3月, 2011 1 次提交
  18. 24 8月, 2010 1 次提交
  19. 23 7月, 2010 1 次提交
  20. 08 12月, 2009 1 次提交
  21. 02 12月, 2009 1 次提交
  22. 27 11月, 2009 1 次提交
  23. 30 10月, 2009 2 次提交
    • D
      powerpc/mm: Bring hugepage PTE accessor functions back into sync with normal accessors · 0895ecda
      David Gibson 提交于
      The hugepage arch code provides a number of hook functions/macros
      which mirror the functionality of various normal page pte access
      functions.  Various changes in the normal page accessors (in
      particular BenH's recent changes to the handling of lazy icache
      flushing and PAGE_EXEC) have caused the hugepage versions to get out
      of sync with the originals.  In some cases, this is a bug, at least on
      some MMU types.
      
      One of the reasons that some hooks were not identical to the normal
      page versions, is that the fact we're dealing with a hugepage needed
      to be passed down do use the correct dcache-icache flush function.
      This patch makes the main flush_dcache_icache_page() function hugepage
      aware (by checking for the PageCompound flag).  That in turn means we
      can make set_huge_pte_at() just a call to set_pte_at() bringing it
      back into sync.  As a bonus, this lets us remove the
      hash_huge_page_do_lazy_icache() function, replacing it with a call to
      the hash_page_do_lazy_icache() function it was based on.
      
      Some other hugepage pte access hooks - huge_ptep_get_and_clear() and
      huge_ptep_clear_flush() - are not so easily unified, but this patch at
      least brings them back into sync with the current versions of the
      corresponding normal page functions.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      0895ecda
    • D
      powerpc/mm: Allow more flexible layouts for hugepage pagetables · a4fe3ce7
      David Gibson 提交于
      Currently each available hugepage size uses a slightly different
      pagetable layout: that is, the bottem level table of pointers to
      hugepages is a different size, and may branch off from the normal page
      tables at a different level.  Every hugepage aware path that needs to
      walk the pagetables must therefore look up the hugepage size from the
      slice info first, and work out the correct way to walk the pagetables
      accordingly.  Future hardware is likely to add more possible hugepage
      sizes, more layout options and more mess.
      
      This patch, therefore reworks the handling of hugepage pagetables to
      reduce this complexity.  In the new scheme, instead of having to
      consult the slice mask, pagetable walking code can check a flag in the
      PGD/PUD/PMD entries to see where to branch off to hugepage pagetables,
      and the entry also contains the information (eseentially hugepage
      shift) necessary to then interpret that table without recourse to the
      slice mask.  This scheme can be extended neatly to handle multiple
      levels of self-describing "special" hugepage pagetables, although for
      now we assume only one level exists.
      
      This approach means that only the pagetable allocation path needs to
      know how the pagetables should be set out.  All other (hugepage)
      pagetable walking paths can just interpret the structure as they go.
      
      There already was a flag bit in PGD/PUD/PMD entries for hugepage
      directory pointers, but it was only used for debug.  We alter that
      flag bit to instead be a 0 in the MSB to indicate a hugepage pagetable
      pointer (normally it would be 1 since the pointer lies in the linear
      mapping).  This means that asm pagetable walking can test for (and
      punt on) hugepage pointers with the same test that checks for
      unpopulated page directory entries (beq becomes bge), since hugepage
      pointers will always be positive, and normal pointers always negative.
      
      While we're at it, we get rid of the confusing (and grep defeating)
      #defining of hugepte_shift to be the same thing as mmu_huge_psizes.
      Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      a4fe3ce7
  24. 02 9月, 2009 1 次提交
    • B
      powerpc/pseries: Fix to handle slb resize across migration · 46db2f86
      Brian King 提交于
      The SLB can change sizes across a live migration, which was not
      being handled, resulting in possible machine crashes during
      migration if migrating to a machine which has a smaller max SLB
      size than the source machine. Fix this by first reducing the
      SLB size to the minimum possible value, which is 32, prior to
      migration. Then during the device tree update which occurs after
      migration, we make the call to ensure the SLB gets updated. Also
      add the slb_size to the lparcfg output so that the migration
      tools can check to make sure the kernel has this capability
      before allowing migration in scenarios where the SLB size will change.
      
      BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid
            breaking ppc32 build
      Signed-off-by: NBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      46db2f86
  25. 20 8月, 2009 1 次提交
  26. 24 3月, 2009 1 次提交
  27. 01 12月, 2008 1 次提交
  28. 16 9月, 2008 1 次提交
    • P
      powerpc: Make the 64-bit kernel as a position-independent executable · 549e8152
      Paul Mackerras 提交于
      This implements CONFIG_RELOCATABLE for 64-bit by making the kernel as
      a position-independent executable (PIE) when it is set.  This involves
      processing the dynamic relocations in the image in the early stages of
      booting, even if the kernel is being run at the address it is linked at,
      since the linker does not necessarily fill in words in the image for
      which there are dynamic relocations.  (In fact the linker does fill in
      such words for 64-bit executables, though not for 32-bit executables,
      so in principle we could avoid calling relocate() entirely when we're
      running a 64-bit kernel at the linked address.)
      
      The dynamic relocations are processed by a new function relocate(addr),
      where the addr parameter is the virtual address where the image will be
      run.  In fact we call it twice; once before calling prom_init, and again
      when starting the main kernel.  This means that reloc_offset() returns
      0 in prom_init (since it has been relocated to the address it is running
      at), which necessitated a few adjustments.
      
      This also changes __va and __pa to use an equivalent definition that is
      simpler.  With the relocatable kernel, PAGE_OFFSET and MEMORY_START are
      constants (for 64-bit) whereas PHYSICAL_START is a variable (and
      KERNELBASE ideally should be too, but isn't yet).
      
      With this, relocatable kernels still copy themselves down to physical
      address 0 and run there.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      549e8152
  29. 11 8月, 2008 1 次提交
    • B
      powerpc/mm: Fix attribute confusion with htab_bolt_mapping() · bc033b63
      Benjamin Herrenschmidt 提交于
      The function htab_bolt_mapping() is used to create permanent
      mappings in the MMU hash table, for example, in order to create
      the linear mapping of vmemmap.  It's also used by early boot
      ioremap (before mem_init_done).
      
      However, the way ioremap uses it is incorrect as it passes it the
      protection flags in the "linux PTE" form while htab_bolt_mapping()
      expects them in the hash table format.  This is made more confusing by
      the fact that some of those flags are actually in the same position in
      both cases.
      
      This fixes it all by making htab_bolt_mapping() take normal linux
      protection flags instead, and use a little helper to convert them to
      htab flags. Callers can now use the usual PAGE_* definitions safely.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      
       arch/powerpc/include/asm/mmu-hash64.h |    2 -
       arch/powerpc/mm/hash_utils_64.c       |   65 ++++++++++++++++++++--------------
       arch/powerpc/mm/init_64.c             |    9 +---
       3 files changed, 44 insertions(+), 32 deletions(-)
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      bc033b63
  30. 04 8月, 2008 1 次提交