1. 08 Sep 2014, 6 commits
    • spapr_pci: map the MSI window in each PHB · 8c46f7ec
      Committed by Greg Kurz
      On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
      Commit cc943c36 modified MSI-X
      so that writes are made using the bus master address space and follow
      the IOMMU path.
      
      Unfortunately, the IOMMU address space does not have an
      MSI window: the notification is silently dropped in unassigned_mem_write
      instead of reaching the guest... The most visible effect is that all
      virtio devices have been non-functional on sPAPR since then. :(
      
      This patch does the following:
      1) map the MSI window into the IOMMU address space for each PHB
         - since each PHB instantiates its own IOMMU address space, we
           can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
         - no real need to keep the MSI window setup in a separate function,
           the spapr_pci_msi_init() code moves to spapr_phb_realize().
      
      2) kill the global MSI window as it is not needed in the end
      Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: Add helper to run KVM_CHECK_EXTENSION on vm fd · 7d0a07fa
      Committed by Alexander Graf
      We can now call KVM_CHECK_EXTENSION on either the kvm fd or the vm fd;
      the vm version is more accurate when it comes to PPC KVM.

      Add a helper that makes the vm version available, falling back to the
      non-vm variant when the vm one is not available, to stay compatible.
      Signed-off-by: Alexander Graf <agraf@suse.de>
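The fallback pattern this commit describes (try the vm-scoped ioctl, fall back to the global one) can be sketched as follows. This is an illustrative stand-in, not QEMU's actual kvm_vm_check_extension(): the stub check functions simulate a kernel that may or may not implement KVM_CHECK_EXTENSION on the vm fd.

```c
#include <assert.h>

/* Stand-in for the vm-fd ioctl: returns -1 when the kernel does not
 * implement KVM_CHECK_EXTENSION on the vm fd (older kernels). The
 * extension numbers and results here are made up for illustration. */
static int vm_check_extension_raw(int ext, int vm_supports_ioctl)
{
    if (!vm_supports_ioctl) {
        return -1;
    }
    return ext == 42 ? 1 : 0;
}

/* Stand-in for the global (kvm fd) variant, always available. */
static int kvm_check_extension(int ext)
{
    return ext == 42 ? 1 : 0;
}

/* The helper: prefer the vm-scoped answer, fall back to the global one
 * so the result stays compatible with older kernels. */
static int kvm_vm_check_extension(int ext, int vm_supports_ioctl)
{
    int ret = vm_check_extension_raw(ext, vm_supports_ioctl);
    if (ret < 0) {
        /* VM-wide version not implemented, use the global one instead. */
        ret = kvm_check_extension(ext);
    }
    return ret;
}
```

Callers simply use the helper and get the more accurate vm-scoped answer whenever the kernel provides one.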
    • spapr: Locate RTAS and device-tree based on real RMA · b7d1f77a
      Committed by Benjamin Herrenschmidt
      We currently calculate the final RTAS and FDT location based on
      the early estimate of the RMA size, cropped to 256M on KVM since
      we only know the real RMA size at reset time which happens much
      later in the boot process.
      
      This means the FDT and RTAS end up right below 256M while they
      could be much higher, using precious RMA space and limiting
      what the OS bootloader can put there, which has proved to be
      a problem with some OSes (for example when using very large initrds).
      
      Fortunately, we do the actual copy of the device tree into guest
      memory much later, during reset, late enough to be able to do it
      using the final RMA value. We just need to move the calculation
      to the right place.
      
      However, RTAS is still loaded too early, so we change the code to
      load the tiny blob into qemu memory early on, and then copy it into
      guest memory at reset time. It's small enough that the memory usage
      doesn't matter.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      [agraf: fix compilation on 32bit hosts]
      Signed-off-by: Alexander Graf <agraf@suse.de>
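The "load early, copy at reset" scheme for RTAS can be sketched like this. Everything here is illustrative (the buffer sizes, the placement policy, and RTAS_MAX_ADDR's value are stand-ins, not sPAPR's real layout); the point is only that the blob lives in a host buffer until reset, when the real RMA size is finally known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RTAS_MAX_ADDR 0x1000ul     /* illustrative cap on RTAS placement */

static uint8_t guest_ram[0x1000];  /* stand-in for guest memory */
static uint8_t rtas_blob[16];      /* loaded into host memory at init time */
static size_t rtas_size = sizeof(rtas_blob);

/* Pick the final RTAS address from the real RMA size, capped. Placing the
 * blob at the top of the usable RMA is just this sketch's policy. */
static unsigned long rtas_addr(unsigned long rma_size)
{
    unsigned long limit = rma_size < RTAS_MAX_ADDR ? rma_size : RTAS_MAX_ADDR;
    return limit - rtas_size;
}

/* Called at machine reset, when the real RMA size is known: the tiny
 * blob is small enough that keeping a host copy around doesn't matter. */
static void copy_rtas_at_reset(unsigned long rma_size)
{
    memcpy(&guest_ram[rtas_addr(rma_size)], rtas_blob, rtas_size);
}
```

Because the copy happens on every reset, the blob always lands at an address derived from the final RMA value rather than the early estimate.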
    • loader: Add load_image_size() to replace load_image() · ea87616d
      Committed by Benjamin Herrenschmidt
      A subsequent patch to ppc/spapr needs to load the RTAS blob into
      qemu memory rather than target memory (so it can later be copied
      into the right spot at machine reset time).
      
      I would use load_image(), but it is marked deprecated because it
      doesn't take a buffer size argument, so let's add load_image_size(),
      which does.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      [aik: fixed errors from checkpatch.pl]
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Alexander Graf <agraf@suse.de>
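A size-bounded loader of the kind this commit adds can be sketched as below. This is a simplified stand-in for illustration, not QEMU's hw/core/loader.c implementation; the essential difference from a deprecated unbounded load_image() is the size cap on the read.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Read at most 'size' bytes of 'filename' into 'addr'; return the number
 * of bytes actually read, or -1 on error. The caller's buffer can never
 * be overrun, which is the whole point of passing the size. */
static ssize_t load_image_size(const char *filename, void *addr, size_t size)
{
    int fd = open(filename, O_RDONLY);
    ssize_t actsize;

    if (fd < 0) {
        return -1;
    }
    actsize = read(fd, addr, size);  /* never writes past addr + size */
    close(fd);
    return actsize;
}
```

A caller loading a blob into a fixed host buffer simply passes the buffer length and checks the returned byte count.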
    • PPC: mac99: Move NVRAM to page boundary when necessary · 261265cc
      Committed by Alexander Graf
      When running KVM we have to adhere to host page boundaries for memory slots.
      Unfortunately the NVRAM on mac99 is a 4k RAM hole inside of an MMIO flash
      area.
      
      So if our host is configured with a 64k page size, we can't use the mac99
      target with KVM. This is a real shame, as the limitation is not
      fundamental: we can easily map NVRAM somewhere else, and at least Linux
      and Mac OS X still use it at its new location.

      So when the choice is between failing to run at all and moving NVRAM
      to a place it shouldn't be at, choose the latter.
      
      This patch enables -M mac99 with KVM on 64k page size hosts.
      Signed-off-by: Alexander Graf <agraf@suse.de>
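The page-boundary adjustment the commit needs amounts to rounding the NVRAM base up to the next host-page boundary, so the 4k RAM hole can form a memory slot of its own. A minimal sketch of that rounding (assuming, as KVM does, that the page size is a power of two):

```c
#include <assert.h>
#include <stdint.h>

/* Round 'addr' up to the next multiple of 'page_size'.
 * Requires page_size to be a power of two. */
static uint64_t align_up(uint64_t addr, uint64_t page_size)
{
    return (addr + page_size - 1) & ~(page_size - 1);
}
```

With a 64k host page size, an NVRAM offset that falls mid-page would be moved up to the following 64k boundary, while an already-aligned address is left untouched.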
    • ppc: spapr-rtas - implement os-term rtas call · 2e14072f
      Committed by Nikunj A Dadhania
      A PAPR-compliant guest calls this in the absence of kdump. The call
      finally reaches QEMU and can be handled according to the policies set
      by higher-level tools (such as taking a dump) for further analysis by
      tools like crash.

      The Linux kernel calls ibm,os-term when the extended property of
      os-term is set. This makes sure that a return to the Linux kernel is
      guaranteed.
      Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
      [agraf: reduce RTAS_TOKEN_MAX]
      Signed-off-by: Alexander Graf <agraf@suse.de>
  2. 02 Sep 2014, 1 commit
    • implementing victim TLB for QEMU system emulated TLB · 88e89a57
      Committed by Xin Tong
      QEMU system-mode page table walks are expensive. Measured by running
      qemu-system-x86_64 under Intel PIN, a TLB miss followed by a walk of
      the 4-level page tables in a guest Linux OS takes ~450 x86
      instructions on average.
      
      The QEMU system-mode TLB is implemented as a direct-mapped hash table.
      This structure suffers from conflict misses. Increasing the
      associativity of the TLB may not solve conflict misses, as
      all the ways may have to be searched serially.
      
      A victim TLB holds translations evicted from the
      primary TLB upon replacement. The victim TLB lies between the main TLB
      and its refill path, and has greater associativity (fully
      associative in this patch). It takes longer to look up the victim TLB,
      but it is likely cheaper than a full page table walk. The memory
      translation path is changed as follows:
      
      Before Victim TLB:
      1. Inline TLB lookup
      2. Exit code cache on TLB miss.
      3. Check for unaligned, IO accesses
      4. TLB refill.
      5. Do the memory access.
      6. Return to code cache.
      
      After Victim TLB:
      1. Inline TLB lookup
      2. Exit code cache on TLB miss.
      3. Check for unaligned, IO accesses
      4. Victim TLB lookup.
      5. If victim TLB misses, TLB refill
      6. Do the memory access.
      7. Return to code cache
      
      The advantage is that a victim TLB adds associativity to a
      direct-mapped TLB, and thus potentially avoids page table walks, while
      still keeping the time taken to flush within reasonable limits.
      However, placing a victim TLB before the refill path lengthens the
      refill path, as the victim TLB is consulted before the TLB refill. The
      performance results demonstrate that the pros outweigh the cons.
      
      Some performance results, taken on SPECINT2006 train
      datasets, a kernel boot, and the QEMU configure script on an
      Intel(R) Xeon(R) CPU E5620 @ 2.40GHz Linux machine, are shown in the
      Google Doc link below.
      
      https://docs.google.com/spreadsheets/d/1eiItzekZwNQOal_h-5iJmC4tMDi051m9qidi5_nwvH4/edit?usp=sharing
      
      In summary, the victim TLB improves the performance of
      qemu-system-x86_64 by 11% on average on SPECINT2006, kernel boot, and
      the QEMU configure script, with the highest improvement of 26% in
      456.hmmer. The victim TLB does not cause performance degradation in
      any of the measured benchmarks. Furthermore, the implementation is
      architecture-independent and is expected to benefit other
      architectures in QEMU as well.
      
      Although there are measurement fluctuations, the performance
      improvement is significant and by no means within the range of noise.
      Signed-off-by: Xin Tong <trent.tong@gmail.com>
      Message-id: 1407202523-23553-1-git-send-email-trent.tong@gmail.com
      Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
      Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
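The two-level lookup described above can be modeled with a toy direct-mapped primary TLB backed by a small fully associative victim buffer. This is a sketch of the technique only; the sizes, the FIFO victim replacement, and the entry layout are illustrative, not QEMU's CPUTLBEntry structures.

```c
#include <assert.h>
#include <stdint.h>

#define TLB_SIZE   8          /* direct-mapped primary TLB */
#define VTLB_SIZE  4          /* fully associative victim TLB */

typedef struct { uint64_t page; uint64_t phys; int valid; } TLBEntry;

static TLBEntry tlb[TLB_SIZE];
static TLBEntry vtlb[VTLB_SIZE];
static int vtlb_next;         /* FIFO replacement cursor for the victim TLB */

/* Insert a translation into the primary TLB; whatever it displaces
 * goes into the victim TLB instead of being thrown away. */
static void tlb_insert(uint64_t page, uint64_t phys)
{
    TLBEntry *e = &tlb[page % TLB_SIZE];
    if (e->valid) {
        vtlb[vtlb_next] = *e;               /* evict into victim TLB */
        vtlb_next = (vtlb_next + 1) % VTLB_SIZE;
    }
    e->page = page;
    e->phys = phys;
    e->valid = 1;
}

/* Lookup: primary TLB first, then the victim TLB. On a victim hit the
 * entries are swapped so the hot translation returns to the primary TLB.
 * Returns 0 on a miss, where the real code would do the page table walk
 * and TLB refill. */
static int tlb_lookup(uint64_t page, uint64_t *phys)
{
    TLBEntry *e = &tlb[page % TLB_SIZE];
    if (e->valid && e->page == page) {
        *phys = e->phys;                    /* fast path: primary hit */
        return 1;
    }
    for (int i = 0; i < VTLB_SIZE; i++) {
        if (vtlb[i].valid && vtlb[i].page == page) {
            TLBEntry tmp = *e;
            *e = vtlb[i];                   /* promote to primary TLB */
            vtlb[i] = tmp;                  /* demote the conflicting entry */
            *phys = e->phys;
            return 1;
        }
    }
    return 0;                               /* full page table walk needed */
}
```

Two pages that hash to the same primary slot now ping-pong between the primary and victim TLBs instead of forcing a page table walk on every alternation, which is exactly the conflict-miss case the commit targets.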
  3. 01 Sep 2014, 3 commits
  4. 29 Aug 2014, 12 commits
  5. 26 Aug 2014, 2 commits
  6. 25 Aug 2014, 4 commits
  7. 22 Aug 2014, 1 commit
  8. 20 Aug 2014, 4 commits
  9. 19 Aug 2014, 2 commits
  10. 18 Aug 2014, 4 commits
  11. 15 Aug 2014, 1 commit