1. 23 Feb, 2009 8 commits
  2. 13 Feb, 2009 3 commits
  3. 11 Feb, 2009 2 commits
    • powerpc/mm: Rework I$/D$ coherency (v3) · 8d30c14c
      Benjamin Herrenschmidt committed
      This patch reworks the way we do I and D cache coherency on PowerPC.
      
      The "old" way was split in 3 different parts depending on the processor type:
      
         - Hash with per-page exec support (64-bit and >= POWER4 only) does it
      at hashing time, by preventing exec on unclean pages and cleaning pages
      on exec faults.
      
         - Everything without per-page exec support (32-bit hash, 8xx, and
      64-bit < POWER4) does it for all pages going to user space in update_mmu_cache().
      
         - Embedded with per-page exec support does it from do_page_fault() on
      exec faults, in a way similar to what the hash code does.
      
      That leads to confusion and bugs. For example, the method using update_mmu_cache()
      is racy on SMP, where another processor can see the new PTE and hash it in before
      we have cleaned the cache, and then blow up trying to execute. This is hard to hit but
      I think it has bitten us in the past.
      
      Also, it's inefficient for embedded where we always end up having to do at least
      one more page fault.
      
      This reworks the whole thing by moving the cache sync into two main call sites,
      though we keep different behaviours depending on the HW capability. The call
      sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
      which joins the former in pgtable.c.
      
      The base idea for Embedded with per-page exec support is that we now do the
      flush at set_pte_at() time when coming from an exec fault, which allows us
      to avoid the double fault problem completely (we can even improve the situation
      more by implementing TLB preload in update_mmu_cache() but that's for later).
      
      If for some reason we didn't do it there and we try to execute, we'll hit
      the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
      to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed. We now make
      this path also perform the I/D cache sync for exec faults. This second path
      is the catch-all for things that weren't cleaned at set_pte_at() time.
      
      For CPUs without per-page exec support, we always do the sync at set_pte_at(),
      thus guaranteeing that when the PTE is visible to other processors, the cache
      is clean.
      
      For the 64-bit hash with per-page exec support case, we keep the old mechanism
      for now. I'll look into changing it later, once I've reworked a bit how we
      use _PAGE_EXEC.
      
      This is also a first step for adding _PAGE_EXEC support for embedded platforms.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
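The sync policy described in the commit above can be sketched as a small toy model. This is not the kernel code: the `toy_` names, the `mmu_kind` enum and the "already clean" flag are all assumptions made for illustration, and the model only records where a sync would happen for each of the three hardware cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical MMU classes, named after the three cases in the commit text. */
enum mmu_kind {
    MMU_NO_PER_PAGE_EXEC,  /* 32-bit hash, 8xx, 64-bit < POWER4 */
    MMU_EMBEDDED_EXEC,     /* embedded with per-page exec support */
    MMU_HASH64_EXEC        /* 64-bit hash, >= POWER4: old mechanism kept */
};

/* Toy page state: true once the I-cache is coherent with the D-cache. */
struct toy_page {
    bool icache_clean;
};

/* Stand-in for the real flush; it only records that the page is now clean. */
static void toy_sync_icache(struct toy_page *pg)
{
    assert(pg != NULL);
    pg->icache_clean = true;
}

/* Model of the set_pte_at() call site. Returns true if a sync was performed. */
bool toy_set_pte_at(enum mmu_kind kind, struct toy_page *pg, bool from_exec_fault)
{
    if (pg->icache_clean)          /* assumption: clean pages are skipped */
        return false;

    switch (kind) {
    case MMU_NO_PER_PAGE_EXEC:
        /* Sync before the PTE becomes visible to other processors. */
        toy_sync_icache(pg);
        return true;
    case MMU_EMBEDDED_EXEC:
        /* Sync only when we got here from an exec fault. */
        if (from_exec_fault) {
            toy_sync_icache(pg);
            return true;
        }
        return false;
    case MMU_HASH64_EXEC:
        /* Left to the hashing-time mechanism, which this toy does not model. */
        return false;
    }
    return false;
}

/* Model of the ptep_set_access_flags() catch-all for exec faults. */
bool toy_set_access_flags(enum mmu_kind kind, struct toy_page *pg, bool exec_fault)
{
    if (kind == MMU_EMBEDDED_EXEC && exec_fault && !pg->icache_clean) {
        toy_sync_icache(pg);
        return true;
    }
    return false;
}
```

In this model, a page mapped on an embedded core by a non-exec fault stays unclean after `toy_set_pte_at()` and is cleaned later by `toy_set_access_flags()` on the first exec fault, which matches the "second path" described above.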
  4. 29 Jan, 2009 1 commit
    • powerpc/fsl-booke: Cleanup init/exception setup to be runtime · 105c31df
      Kumar Gala committed
      We currently have a few variants of fsl-booke processors (e500v1, e500v2,
      e500mc, and e200).  They all have minor differences that we had previously
      been handling via ifdefs.
      
      To move towards handling these differences at runtime, the following changes have been made:
      
      * PID1, PID2 only exist on e500v1 & e500v2 and should not be accessed on
        e500mc or e200.  We use MMUCFG[NPIDS] to determine which case we are in,
        since we only touch PID1/2 in extremely early init code.
      
      * Not all IVORs exist on all the processors, so introduce cpu_setup
        functions for each variant to set up the proper IVORs that are either
        unique or exist but have some variations between the processors.
      Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
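The MMUCFG[NPIDS] check mentioned above could look roughly like the following sketch. The mask and shift are assumptions (a 4-bit field under 0x00007800, which should be verified against the core reference manual), the `toy_` names are hypothetical, and the real check is done in early assembly rather than in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the NPIDS field inside a 32-bit MMUCFG value. */
#define TOY_MMUCFG_NPIDS_MASK  0x00007800u
#define TOY_MMUCFG_NPIDS_SHIFT 11

/* Extract the number of PID registers the core reports. */
unsigned int toy_mmucfg_npids(uint32_t mmucfg)
{
    return (mmucfg & TOY_MMUCFG_NPIDS_MASK) >> TOY_MMUCFG_NPIDS_SHIFT;
}

/* PID1 and PID2 are only safe to touch if three PID registers exist. */
bool toy_has_pid1_pid2(uint32_t mmucfg)
{
    return toy_mmucfg_npids(mmucfg) >= 3;
}
```

Early init code would read MMUCFG once and skip the PID1/PID2 writes when `toy_has_pid1_pid2()` is false, instead of selecting the behaviour with an ifdef.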
  5. 16 Jan, 2009 1 commit
  6. 15 Jan, 2009 1 commit
  7. 14 Jan, 2009 1 commit
  8. 13 Jan, 2009 1 commit
  9. 08 Jan, 2009 6 commits
  10. 07 Jan, 2009 4 commits
  11. 01 Jan, 2009 1 commit
  12. 31 Dec, 2008 11 commits
    • KVM: ppc: mostly cosmetic updates to the exit timing accounting code · 7b701591
      Hollis Blanchard committed
      The only significant changes were to kvmppc_exit_timing_write() and
      kvmppc_exit_timing_show(), both of which were dramatically simplified.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: ppc: Implement in-kernel exit timing statistics · 73e75b41
      Hollis Blanchard committed
      Existing KVM statistics are either just counters (kvm_stat) reported for
      KVM generally or trace-based approaches like kvm_trace.
      For KVM on powerpc we needed to track the timings of the different exit
      types. While this could be achieved by parsing data created with a kvm_trace
      extension, that adds too much overhead (at least on embedded PowerPC), slowing
      down the workloads we wanted to measure.
      
      Therefore this patch adds an in-kernel exit timing statistic to the powerpc kvm
      code. This statistic is available per vm&vcpu under the kvm debugfs directory.
      As this statistic has low, but still nonzero, overhead, it can be enabled via a
      .config entry and should be off by default.
      
      Since this patch touched all powerpc kvm_stat code anyway this code is now
      merged and simplified together with the exit timing statistic code (still
      working with exit timing disabled in .config).
      Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
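A minimal sketch of what per-exit-type timing accounting can look like is given below. The exit types, field names and the choice to track count, sum, min and max are assumptions for illustration; the commit message does not spell out which values the real code records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exit types; the real list is longer. */
enum toy_exit_type {
    TOY_EXIT_MMIO,
    TOY_EXIT_DCR,
    TOY_EXIT_TLB_MISS,
    TOY_EXIT_TYPES
};

/* One accumulator set per vcpu, indexed by exit type. */
struct toy_exit_timing {
    uint64_t count[TOY_EXIT_TYPES];
    uint64_t sum[TOY_EXIT_TYPES];
    uint64_t min[TOY_EXIT_TYPES];
    uint64_t max[TOY_EXIT_TYPES];
};

void toy_timing_init(struct toy_exit_timing *t)
{
    assert(t != NULL);
    for (int i = 0; i < TOY_EXIT_TYPES; i++) {
        t->count[i] = 0;
        t->sum[i] = 0;
        t->min[i] = UINT64_MAX;  /* so the first sample always becomes the min */
        t->max[i] = 0;
    }
}

/* Account one exit: start_tb and end_tb are timebase readings around it. */
void toy_timing_account(struct toy_exit_timing *t, enum toy_exit_type type,
                        uint64_t start_tb, uint64_t end_tb)
{
    assert(t != NULL && type < TOY_EXIT_TYPES && end_tb >= start_tb);
    uint64_t delta = end_tb - start_tb;

    t->count[type]++;
    t->sum[type] += delta;
    if (delta < t->min[type])
        t->min[type] = delta;
    if (delta > t->max[type])
        t->max[type] = delta;
}

/* Average duration for one exit type, or 0 if it never occurred. */
uint64_t toy_timing_avg(const struct toy_exit_timing *t, enum toy_exit_type type)
{
    return t->count[type] ? t->sum[type] / t->count[type] : 0;
}
```

Keeping only a few integers per exit type is what makes this cheaper than emitting a trace record on every exit; a debugfs show function would then just print the table.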
    • KVM: ppc: save and restore guest mappings on context switch · c5fbdffb
      Hollis Blanchard committed
      Store shadow TLB entries in memory, but only use them on host context switch
      (instead of every guest entry). This improves performance for most workloads on
      440 by reducing the guest TLB miss rate.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: ppc: directly insert shadow mappings into the hardware TLB · 7924bd41
      Hollis Blanchard committed
      Formerly, we maintained a per-vcpu shadow TLB and on every entry to the
      guest loaded this array into the hardware TLB. This consumed 1280 bytes of
      memory (64 entries of 16 bytes plus a struct page pointer each), and also
      required some assembly to loop over the array on every entry.
      
      Instead of saving a copy in memory, we can just store shadow mappings directly
      into the hardware TLB, accepting that the host kernel will clobber these as
      part of the normal 440 TLB round robin. When we do that we need less than half
      the memory, and we have decreased the exit handling time for all guest exits,
      at the cost of an increased number of TLB misses because the host overwrites some
      guest entries.
      
      These savings will be increased on processors with larger TLBs or which
      implement intelligent flush instructions like tlbivax (which will avoid the
      need to walk arrays in software).
      
      In addition to that and to the code simplification, we have a greater chance of
      leaving other host userspace mappings in the TLB, instead of forcing all
      subsequent tasks to re-fault all their mappings.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
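The 1280-byte figure quoted in the commit above can be checked with a one-line calculation, assuming a 4-byte pointer on the 32-bit 440. The helper name is hypothetical.

```c
#include <stddef.h>

/* Bytes used by a shadow TLB array: each entry plus one pointer per entry. */
size_t toy_shadow_tlb_bytes(size_t entries, size_t entry_bytes, size_t ptr_bytes)
{
    return entries * (entry_bytes + ptr_bytes);
}
```

With 64 entries, 16 bytes per entry and a 4-byte struct page pointer, this gives 64 * 20 = 1280 bytes per vcpu.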
    • powerpc/44x: declare tlb_44x_index for use in C code · c0ca609c
      Hollis Blanchard committed
      KVM currently ignores the host's round robin TLB eviction selection, instead
      maintaining its own TLB state and its own round robin index. However, by
      participating in the normal 44x TLB selection, we can drop the alternate TLB
      processing in KVM. This results in a significant performance improvement,
      since that processing currently must be done on *every* guest exit.
      
      Accordingly, KVM needs to be able to access and increment tlb_44x_index.
      (KVM on 440 cannot be a module, so there is no need to export this symbol.)
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
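Sharing one round-robin index between the host and KVM, as the commit above describes, can be illustrated with a toy model. The struct and function names are hypothetical, and the real 44x code differs in detail (for example in where the index wraps); only the idea of a single shared victim counter is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy TLB replacement state shared by every user of the TLB. */
struct toy_tlb {
    unsigned int next;  /* plays the role of tlb_44x_index */
    unsigned int size;  /* number of replaceable entries */
};

/* Return the slot to overwrite and advance the shared index, wrapping at size. */
unsigned int toy_tlb_pick_victim(struct toy_tlb *tlb)
{
    assert(tlb != NULL && tlb->size > 0);
    unsigned int victim = tlb->next;

    tlb->next = (tlb->next + 1) % tlb->size;
    return victim;
}
```

Because host and guest insertions both call the same function, neither side needs a private copy of the TLB state to decide which entry to evict next.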
    • KVM: ppc: support large host pages · 89168618
      Hollis Blanchard committed
      KVM on 440 has always been able to handle large guest mappings with 4K host
      pages -- we must, since the guest kernel uses 256MB mappings.
      
      This patch makes KVM work when the host has large pages too (tested with 64K).
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: ppc: fix userspace mapping invalidation on context switch · fe4e771d
      Hollis Blanchard committed
      We used to defer invalidating userspace TLB entries until jumping out of the
      kernel. This was causing MMU weirdness most easily triggered by using a pipe in
      the guest, e.g. "dmesg | tail". I believe the problem was that after the guest
      kernel changed the PID (part of context switch), the old process's mappings
      were still present, and so copy_to_user() on the "return to new process" path
      ended up using stale mappings.
      
      Testing with large pages (64K) exposed the problem, probably because with 4K
      pages, pressure on the TLB faulted all process A's mappings out before the
      guest kernel could insert any for process B.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: ppc: optimize irq delivery path · d4cf3892
      Hollis Blanchard committed
      In kvmppc_deliver_interrupt there is just one case left in the switch, and it is a
      rare one (less than 8%) when looking at the exit numbers. Therefore we can
      at least drop the switch/case and use an if. I inserted an unlikely too, but
      that's open for discussion.
      
      In kvmppc_can_deliver_interrupt all frequent cases are in the default case.
      I know compilers are smart, but we can make it easier for them. Writing
      down all options and removing the default case, combined with the fact that
      the values are constants 0..15, should allow the compiler to generate an easy
      jump table.
      Modifying kvmppc_can_deliver_interrupt pointed me to the fact that gcc seems
      to be unable to reduce priority_exception[x] to a build time constant.
      Therefore I changed the usage of the translation arrays in the interrupt
      delivery path completely. It is now using priority without translation to irq
      on the full irq delivery path.
      To be able to do that ivpr regs are stored by their priority now.
      
      Additionally the decision made in kvmppc_can_deliver_interrupt is already
      sufficient to get the value of interrupt_msr_mask[x]. Therefore we can replace
      the 16x4byte array used here with a single 4byte variable (might still be one
      miss, but the chance to find this in cache should be better than the right
      entry of the whole array).
      Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
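The "list every case, drop the default" idea from the commit above can be sketched on a reduced example. The priorities 0..3 and the MSR bit values are hypothetical stand-ins for the real 0..15 range and the Book E MSR bits; whether a compiler actually emits a jump table depends on the compiler and options.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MSR_EE 0x1u  /* hypothetical: external interrupts enabled */
#define TOY_MSR_CE 0x2u  /* hypothetical: critical interrupts enabled */

/* Decide whether an interrupt of the given priority can be delivered now. */
bool toy_can_deliver(unsigned int priority, uint32_t msr)
{
    /* Dense constant cases and no default label. */
    switch (priority) {
    case 0:  /* not maskable in this toy model */
    case 1:
        return true;
    case 2:  /* gated by the critical-enable bit */
        return (msr & TOY_MSR_CE) != 0;
    case 3:  /* gated by the external-enable bit */
        return (msr & TOY_MSR_EE) != 0;
    }
    return false;  /* priorities outside the known range are never delivered */
}
```

Working directly on the priority value, as the commit does, also avoids the translation-array lookup that the compiler could not reduce to a build-time constant.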
    • KVM: ppc: adjust vcpu types to support 64-bit cores · 5cf8ca22
      Hollis Blanchard committed
      However, some of these fields could be split into separate per-core structures
      in the future.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
    • KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor · db93f574
      Hollis Blanchard committed
      This patch doesn't yet move all 44x-specific data into the new structure, but
      is the first step down that path. In the future we may also want to create a
      struct kvm_vcpu_booke.
      
      Based on patch from Liu Yu <yu.liu@freescale.com>.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>
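The container_of() accessor pattern named in the commit title above works as in the following userspace sketch. The `toy_` structures and the `shadow_state` field are hypothetical; only the embedding of a generic vcpu inside a core-specific structure is taken from the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic vcpu state shared by all cores. */
struct toy_vcpu {
    int vcpu_id;
};

/* Core-specific wrapper that embeds the generic vcpu. */
struct toy_vcpu_44x {
    unsigned int shadow_state;  /* hypothetical 44x-only field */
    struct toy_vcpu vcpu;
};

/* Recover the 44x structure from a pointer to its embedded generic vcpu. */
struct toy_vcpu_44x *toy_to_44x(struct toy_vcpu *vcpu)
{
    assert(vcpu != NULL);
    return toy_container_of(vcpu, struct toy_vcpu_44x, vcpu);
}
```

Generic code keeps passing `struct toy_vcpu *` around, while 44x-specific code calls `toy_to_44x()` to reach its private fields without a back pointer.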
    • KVM: ppc: Move the last bits of 44x code out of booke.c · 5cbb5106
      Hollis Blanchard committed
      Needed to port to other Book E processors.
      Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
      Signed-off-by: Avi Kivity <avi@redhat.com>