  1. 02 Aug 2014 (2 commits)
  2. 01 Aug 2014 (2 commits)
    • arm64: add newline to I-cache policy string · ea171967
      Committed by Mark Rutland
      Due to a missing newline in the I-cache policy detection log output,
      it's possible to get some rather unfortunate output at boot time:
      
      CPU1: Booted secondary processor
      Detected VIPT I-cache on CPU1CPU2: Booted secondary processor
      Detected VIPT I-cache on CPU2CPU3: Booted secondary processor
      Detected VIPT I-cache on CPU3CPU4: Booted secondary processor
      Detected PIPT I-cache on CPU4CPU5: Booted secondary processor
      Detected PIPT I-cache on CPU5Brought up 6 CPUs
      SMP: Total of 6 processors activated.
      
      This patch adds the missing newline to the format string, cleaning up
      the output.
      
      Fixes: 59ccc0d4 ("arm64: cachetype: report weakest cache policy")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
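
      For illustration, a minimal standalone sketch of this class of bug;
      the function name and policy strings are placeholders, not the actual
      arm64 code:

          #include <stdio.h>

          /* Without the trailing '\n', consecutive boot messages run
           * together on one line, exactly as in the log above. */
          static void report_icache(const char *policy, int cpu)
          {
              /* buggy: printf("Detected %s I-cache on CPU%d", policy, cpu); */
              printf("Detected %s I-cache on CPU%d\n", policy, cpu);
          }

          int main(void)
          {
              report_icache("VIPT", 1);
              report_icache("PIPT", 4);
              return 0;
          }
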
    • arm64: KVM: fix 64bit CP15 VM access for 32bit guests · dedf97e8
      Committed by Marc Zyngier
      Commit f0a3eaff (ARM64: KVM: fix big endian issue in
      access_vm_reg for 32bit guest) changed the way we handle CP15
      VM accesses, so that all 64bit accesses are done via vcpu_sys_reg.
      
      This looks like a good idea as it solves endianness issues in an
      elegant way, except for one small detail: the register index
      doesn't refer to the same array! We end up corrupting some random
      data structure instead.
      data structure instead.
      
      Fix this by reverting to the original code, except for the introduction
      of a vcpu_cp15_64_high macro that deals with the endianness thing.
      
      Tested on Juno with 32bit SMP guests.
      
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
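
      For context, a standalone sketch of the endianness trap such a macro
      guards against: which of two 32-bit array slots holds the high half of
      a 64-bit value depends on host byte order. The names here (regs,
      reg64_high) are illustrative, not the kernel's vcpu_cp15 machinery:

          #include <stdint.h>
          #include <stdio.h>
          #include <string.h>

          /* Two consecutive 32-bit slots backing one 64-bit register. */
          static uint32_t regs[2];

          /* On little-endian hosts the high 32-bit half lives in the
           * higher-indexed slot; on big-endian it is the lower-indexed one. */
          #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
          #define reg64_high(r) regs[(r)]
          #else
          #define reg64_high(r) regs[(r) + 1]
          #endif

          int main(void)
          {
              uint64_t val = 0x1122334455667788ULL;

              memcpy(regs, &val, sizeof(val));
              /* Prints 0x11223344 on either byte order. */
              printf("high half: 0x%08x\n", reg64_high(0));
              return 0;
          }
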
  3. 31 Jul 2014 (8 commits)
  4. 30 Jul 2014 (8 commits)
  5. 29 Jul 2014 (8 commits)
  6. 28 Jul 2014 (12 commits)
    • KVM: PPC: Handle magic page in kvmppc_ld/st · c12fb43c
      Committed by Alexander Graf
      We use kvmppc_ld and kvmppc_st to emulate load/store instructions that
      may also access the magic page. Special-case it so that such accesses
      work properly.
      Signed-off-by: Alexander Graf <agraf@suse.de>
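
      A hedged sketch of what such a special case can look like; the
      structure, field, and function names below are invented for
      illustration and heavily simplified from the real kvmppc_ld/st paths:

          #include <stdint.h>
          #include <stdio.h>
          #include <string.h>

          #define PAGE_SIZE 4096UL
          #define PAGE_MASK (~(PAGE_SIZE - 1))

          /* Illustrative stand-ins for the relevant vcpu state. */
          struct vcpu {
              unsigned long magic_page_pa;     /* guest PA of magic page, 0 if unset */
              unsigned char shared[PAGE_SIZE]; /* in-kernel backing for that page */
          };

          /* Stub for the ordinary guest-memory path. */
          static int read_guest_ram(struct vcpu *v, unsigned long gpa,
                                    void *p, int n)
          {
              (void)v; (void)gpa; (void)p; (void)n;
              return -1;
          }

          /* If the translated address falls inside the magic page, serve the
           * access from the shared page instead of ordinary guest memory. */
          static int emulate_load(struct vcpu *v, unsigned long gpa,
                                  void *p, int n)
          {
              if (v->magic_page_pa && (gpa & PAGE_MASK) == v->magic_page_pa) {
                  memcpy(p, v->shared + (gpa & ~PAGE_MASK), n);
                  return 0;
              }
              return read_guest_ram(v, gpa, p, n);
          }

          int main(void)
          {
              struct vcpu v = { .magic_page_pa = 0x3000 };
              uint32_t val = 0;

              memcpy(v.shared + 0x10, "\x78\x56\x34\x12", 4);
              emulate_load(&v, 0x3010, &val, sizeof(val));
              printf("0x%08x\n", val); /* 0x12345678 on little-endian */
              return 0;
          }
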
    • KVM: PPC: Use kvm_read_guest in kvmppc_ld · c45c5514
      Committed by Alexander Graf
      We have a nice and handy helper to read from guest physical address space,
      so we should make use of it in kvmppc_ld as we already do for its counterpart
      in kvmppc_st.
      Signed-off-by: Alexander Graf <agraf@suse.de>
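
      The pattern, roughly (a sketch of the idiom rather than the exact
      hunk; pte.raddr as the already-translated guest physical address is
      an assumption):

          /* kvm_read_guest() reads from guest physical address space and
           * returns 0 on success or a negative error on failure. */
          if (kvm_read_guest(vcpu->kvm, pte.raddr, ptr, size))
              return EMULATE_DO_MMIO; /* let the slow path handle it */

          return EMULATE_DONE;
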
    • KVM: PPC: Remove kvmppc_bad_hva() · 9897e88a
      Committed by Alexander Graf
      We have a proper define for invalid HVA numbers. Use it instead of the
      ppc-specific kvmppc_bad_hva().
      Signed-off-by: Alexander Graf <agraf@suse.de>
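
      The generic define in question is KVM_HVA_ERR_BAD, with
      kvm_is_error_hva() as its test; a sketch of the resulting caller
      pattern (the surrounding names are illustrative):

          /* gfn_to_hva() already yields KVM_HVA_ERR_BAD on failure, so
           * callers only need the generic test; no ppc-local bad-HVA
           * value is required. */
          unsigned long hva = gfn_to_hva(vcpu->kvm, gfn);

          if (kvm_is_error_hva(hva))
              return -EFAULT;
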
    • KVM: PPC: Move kvmppc_ld/st to common code · 35c4a733
      Committed by Alexander Graf
      We have enough common infrastructure now to resolve GVA->GPA mappings
      at runtime. With this we can move our book3s-specific helpers for
      loads/stores in guest virtual address space to common code as well.
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • KVM: PPC: Implement kvmppc_xlate for all targets · 7d15c06f
      Committed by Alexander Graf
      We have a nice API to find the translated GPAs of a GVA including protection
      flags. So far we only use it on Book3S, but there's no reason the same shouldn't
      be used on BookE as well.
      
      Implement a kvmppc_xlate() version for BookE and clean it up to make it more
      readable in general.
      Signed-off-by: Alexander Graf <agraf@suse.de>
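
      In sketch form, the shape of such a translation API: given a guest
      virtual address and the access type, it fills in the guest physical
      address plus protection flags. The names follow the kernel's
      kvmppc_pte and this commit's enums as best recalled, so treat them
      as approximate:

          /* Result of a GVA->GPA translation, including protection bits. */
          struct kvmppc_pte {
              unsigned long eaddr;  /* guest effective (virtual) address */
              unsigned long raddr;  /* guest real (physical) address */
              bool may_read;
              bool may_write;
              bool may_execute;
          };

          enum xlate_instdata  { XLATE_INST, XLATE_DATA };
          enum xlate_readwrite { XLATE_READ, XLATE_WRITE };

          int kvmppc_xlate(struct kvm_vcpu *vcpu, unsigned long eaddr,
                           enum xlate_instdata xlid, enum xlate_readwrite xlrw,
                           struct kvmppc_pte *pte);
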
    • KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page · 63fff5c1
      Committed by Aneesh Kumar K.V
      When calculating the lower bits of the AVA field, use the shift
      count based on the base page size. Also add the missing segment
      size and remove a stale comment.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • crypto: arm-aes - fix encryption of unaligned data · f3c400ef
      Committed by Mikulas Patocka
      Fix the same alignment bug as in arm64 (see the sketch after the next
      commit): we need to pass the number of unprocessed residue bytes as
      the last argument to blkcipher_walk_done.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org	# 3.13+
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64-aes - fix encryption of unaligned data · f960d209
      Committed by Mikulas Patocka
      cryptsetup fails on arm64 when using kernel encryption via AF_ALG socket.
      See https://bugzilla.redhat.com/show_bug.cgi?id=1122937
      
      The bug is caused by incorrect handling of unaligned data in
      arch/arm64/crypto/aes-glue.c. Cryptsetup creates a buffer that is
      aligned to 8 bytes, but not to 16. It opens an AF_ALG socket and uses
      the socket to encrypt data in the buffer. The arm64 crypto accelerator
      then causes data corruption or crashes in scatterwalk_pagedone.
      
      This patch fixes the bug by passing the residue bytes that were not
      processed as the last parameter to blkcipher_walk_done.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
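
      Both AES fixes in this batch share one idiom: tell
      blkcipher_walk_done() how many tail bytes were left unprocessed
      instead of claiming everything was consumed, so the walk carries the
      unaligned tail over to the next chunk. A rough sketch of the loop
      (the cipher call is illustrative, not the exact arm64 glue code):

          unsigned int blocks;
          int err;

          /* Encrypt only whole AES blocks per iteration; hand the
           * unaligned tail back to blkcipher_walk_done() so it is
           * carried over rather than silently dropped. */
          while ((blocks = walk.nbytes / AES_BLOCK_SIZE)) {
              encrypt_blocks(walk.dst.virt.addr, walk.src.virt.addr, blocks);
              /* was: blkcipher_walk_done(desc, &walk, 0); */
              err = blkcipher_walk_done(desc, &walk,
                                        walk.nbytes % AES_BLOCK_SIZE);
          }
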
    • KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode · 7a58777a
      Committed by Alexander Graf
      With Book3S KVM we can create both PR and HV VMs in parallel on the
      same machine. That gives us new challenges for the CAPs we return -
      both modes have different capabilities.
      
      When we get asked about CAPs on the kvm fd, we can't know which mode
      the VM will run in. We can try to be smart and assume we're running HV
      if HV is available, PR otherwise. However, with the newly added VM
      CHECK_EXTENSION we can now ask for capabilities directly on a VM,
      which knows whether it's PR or HV.
      
      With this patch I can successfully expose KVM PVINFO data to user space
      in the PR case, fixing magic page mapping for PAPR guests.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
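
      From user space the difference is just which fd the ioctl is issued
      on; a minimal sketch, assuming a PPC host with this series (and
      VM-scoped CHECK_EXTENSION) applied, error handling omitted:

          #include <fcntl.h>
          #include <stdio.h>
          #include <sys/ioctl.h>
          #include <linux/kvm.h>

          int main(void)
          {
              int kvm = open("/dev/kvm", O_RDWR);
              int vm = ioctl(kvm, KVM_CREATE_VM, 0);

              /* Global query: cannot know whether a future VM is PR or HV. */
              printf("global PVINFO cap: %d\n",
                     ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_PPC_GET_PVINFO));

              /* VM-scoped query: answered by this VM, which knows its mode. */
              printf("per-VM PVINFO cap: %d\n",
                     ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_PPC_GET_PVINFO));
              return 0;
          }
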
    • KVM: Rename and add argument to check_extension · 784aa3d7
      Committed by Alexander Graf
      In preparation for making the check_extension function available at VM
      scope, we add a struct kvm * argument to the function header and rename
      the function accordingly. It will still be called from the /dev/kvm fd,
      but with a NULL argument for struct kvm *.
      Signed-off-by: Alexander Graf <agraf@suse.de>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
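
      In sketch form (the exact pre-rename identifier is quoted from
      memory, so treat it as approximate):

          /* before: reachable only from the /dev/kvm fd */
          int kvm_dev_ioctl_check_extension(long ext);

          /* after: also callable at VM scope; kvm is NULL when the
           * query still arrives via /dev/kvm */
          int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
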
    • Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 · 9678cdaa
      Committed by Stewart Smith
      The POWER8 processor has a Micro Partition Prefetch Engine, which is
      a fancy way of saying "it has a way to store and load the contents of
      the L2 cache, or of the L2 plus the MRU ways of the L3 cache". We
      initiate storing of the log (a list of addresses) using the logmpp
      instruction and start the restore by writing to an SPR.
      
      The logmpp instruction takes its parameters in a single 64-bit register:
      - starting address of the table to store log of L2/L2+L3 cache contents
        - 32kb for L2
        - 128kb for L2+L3
        - Aligned relative to maximum size of the table (32kb or 128kb)
      - Log control (no-op, L2 only, L2 and L3, abort logout)
      
      We should abort any ongoing logging before initiating one.
      
      To initiate restore, we write to the MPPR SPR. The format of what to write
      to the SPR is similar to the logmpp instruction parameter:
      - starting address of the table to read from (same alignment requirements)
      - table size (no data, until end of table)
      - prefetch rate (from fastest possible to slower: about every 8, 16,
        24 or 32 cycles)
      
      The idea behind loading and storing the contents of L2/L3 cache is to
      reduce memory latency in a system that is frequently swapping vcores on
      a physical CPU.
      
      The best-case scenario for doing this is when some vcores are doing
      very cache-heavy workloads. The worst case is when they get almost no
      cache hits, so we just generate needless memory operations.
      
      This implementation just does L2 store/load. In my benchmarks this proves
      to be useful.
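
      A rough sketch of the sequencing this describes, as recalled from the
      patch; the SPR and flag names should be treated as illustrative:

          /* Save path: abort any logout already in flight, then log the
           * L2 contents into the suitably aligned buffer at mpp_addr. */
          logmpp(mpp_addr | PPC_LOGMPP_LOG_ABORT);
          logmpp(mpp_addr | PPC_LOGMPP_LOG_L2);

          /* Restore path, on switching the vcore back in: prefetch the
           * logged cachelines by writing the MPPR SPR. */
          mtspr(SPRN_MPPR, mpp_addr | PPC_MPPR_FETCH_WHOLE_TABLE);
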
      
      Benchmark 1:
       - 16 core POWER8
       - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each
       - No split core/SMT
       - two guests running sysbench memory test.
         sysbench --test=memory --num-threads=8 run
       - one guest running apache bench (of default HTML page)
         ab -n 490000 -c 400 http://localhost/
      
      This benchmark aims to measure the performance of a real-world
      application (Apache) while the other guests are cache hot with their
      own workloads. The sysbench memory benchmark does pointer-sized writes
      to a (small) memory buffer in a loop.
      
      In this benchmark with this patch I can see an improvement both in requests
      per second (~5%) and in mean and median response times (again, about 5%).
      The spread of minimum and maximum response times was largely unchanged.
      
      Benchmark 2:
       - Same VM config as benchmark 1
       - all three guests running sysbench memory benchmark
      
      This benchmark aims to see whether this patch has a positive or
      negative effect on a cache-heavy workload. Due to the nature of the
      benchmark (stores), we may not see a difference in raw performance,
      but hopefully an improvement in consistency of performance (when a
      vcore is switched in, it doesn't have to wait as often for cachelines
      to be pulled in).
      
      The results of this benchmark are improvements in consistency of performance
      rather than performance itself. With this patch, the few outliers in duration
      go away and we get more consistent performance in each guest.
      
      Benchmark 3:
       - same 3 guests and CPU configuration as benchmark 1 and 2.
       - two idle guests
       - one guest running the STREAM benchmark
      
      This scenario also saw a performance improvement with this patch. On
      the Copy and Scale workloads from STREAM, I got a 5-6% improvement
      with this patch. For Add and Triad, it was around 10% (or more).
      
      Benchmark 4:
       - same 3 guests as previous benchmarks
       - two guests running the sysbench memory benchmark, a distinctly
         different cache-heavy workload
       - one guest running STREAM benchmark.
      
      Similar improvements to benchmark 3.
      
      Benchmark 5:
       - 1 guest, 8 VCPUs, Ubuntu 14.04
       - Host configured with split core (SMT8, subcores-per-core=4)
       - STREAM benchmark
      
      In this benchmark, we see a 10-20% performance improvement across the
      board in the STREAM benchmark results with this patch.
      
      Based on preliminary investigation and microbenchmarks
      by Prerna Saxena <prerna@linux.vnet.ibm.com>
      Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>
    • Split out struct kvmppc_vcore creation to separate function · de9bdd1a
      Committed by Stewart Smith
      No code changes, just split it out into a function so that the
      addition of micro partition prefetch buffer allocation (in a
      subsequent patch) looks neater and doesn't require excessive
      indentation.
      Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Alexander Graf <agraf@suse.de>