1. 15 12月, 2014 4 次提交
    • P
      KVM: PPC: Book3S HV: Fix KSM memory corruption · b4a83900
      Paul Mackerras 提交于
      Testing with KSM active in the host showed occasional corruption of
      guest memory.  Typically a page that should have contained zeroes
      would contain values that look like the contents of a user process
      stack (values such as 0x0000_3fff_xxxx_xxx).
      
      Code inspection in kvmppc_h_protect revealed that there was a race
      condition with the possibility of granting write access to a page
      which is read-only in the host page tables.  The code attempts to keep
      the host mapping read-only if the host userspace PTE is read-only, but
      if that PTE had been temporarily made invalid for any reason, the
      read-only check would not trigger and the host HPTE could end up
      read-write.  Examination of the guest HPT in the failure situation
      revealed that there were indeed shared pages which should have been
      read-only that were mapped read-write.
      
      To close this race, we don't let a page go from being read-only to
      being read-write, as far as the real HPTE mapping the page is
      concerned (the guest view can go to read-write, but the actual mapping
      stays read-only).  When the guest tries to write to the page, we take
      an HDSI and let kvmppc_book3s_hv_page_fault take care of providing a
      writable HPTE for the page.
      
      This eliminates the occasional corruption of shared pages
      that was previously seen with KSM active.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      b4a83900
    • M
      KVM: PPC: Book3S HV: Fix an issue where guest is paused on receiving HMI · dee6f24c
      Mahesh Salgaonkar 提交于
      When we get an HMI (hypervisor maintenance interrupt) while in a
      guest, we see that guest enters into paused state.  The reason is, in
      kvmppc_handle_exit_hv it falls through default path and returns to
      host instead of resuming guest.  This causes guest to enter into
      paused state.  HMI is a hypervisor only interrupt and it is safe to
      resume the guest since the host has handled it already.  This patch
      adds a switch case to resume the guest.
      
      Without this patch we see guest entering into paused state with following
      console messages:
      
      [ 3003.329351] Severe Hypervisor Maintenance interrupt [Recovered]
      [ 3003.329356]  Error detail: Timer facility experienced an error
      [ 3003.329359] 	HMER: 0840000000000000
      [ 3003.329360] 	TFMR: 4a12000980a84000
      [ 3003.329366] vcpu c0000007c35094c0 (40):
      [ 3003.329368] pc  = c0000000000c2ba0  msr = 8000000000009032  trap = e60
      [ 3003.329370] r 0 = c00000000021ddc0  r16 = 0000000000000046
      [ 3003.329372] r 1 = c00000007a02bbd0  r17 = 00003ffff27d5d98
      [ 3003.329375] r 2 = c0000000010980b8  r18 = 00001fffffc9a0b0
      [ 3003.329377] r 3 = c00000000142d6b8  r19 = c00000000142d6b8
      [ 3003.329379] r 4 = 0000000000000002  r20 = 0000000000000000
      [ 3003.329381] r 5 = c00000000524a110  r21 = 0000000000000000
      [ 3003.329383] r 6 = 0000000000000001  r22 = 0000000000000000
      [ 3003.329386] r 7 = 0000000000000000  r23 = c00000000524a110
      [ 3003.329388] r 8 = 0000000000000000  r24 = 0000000000000001
      [ 3003.329391] r 9 = 0000000000000001  r25 = c00000007c31da38
      [ 3003.329393] r10 = c0000000014280b8  r26 = 0000000000000002
      [ 3003.329395] r11 = 746f6f6c2f68656c  r27 = c00000000524a110
      [ 3003.329397] r12 = 0000000028004484  r28 = c00000007c31da38
      [ 3003.329399] r13 = c00000000fe01400  r29 = 0000000000000002
      [ 3003.329401] r14 = 0000000000000046  r30 = c000000003011e00
      [ 3003.329403] r15 = ffffffffffffffba  r31 = 0000000000000002
      [ 3003.329404] ctr = c00000000041a670  lr  = c000000000272520
      [ 3003.329405] srr0 = c00000000007e8d8 srr1 = 9000000000001002
      [ 3003.329406] sprg0 = 0000000000000000 sprg1 = c00000000fe01400
      [ 3003.329407] sprg2 = c00000000fe01400 sprg3 = 0000000000000005
      [ 3003.329408] cr = 48004482  xer = 2000000000000000  dsisr = 42000000
      [ 3003.329409] dar = 0000010015020048
      [ 3003.329410] fault dar = 0000010015020048 dsisr = 42000000
      [ 3003.329411] SLB (8 entries):
      [ 3003.329412]   ESID = c000000008000000 VSID = 40016e7779000510
      [ 3003.329413]   ESID = d000000008000001 VSID = 400142add1000510
      [ 3003.329414]   ESID = f000000008000004 VSID = 4000eb1a81000510
      [ 3003.329415]   ESID = 00001f000800000b VSID = 40004fda0a000d90
      [ 3003.329416]   ESID = 00003f000800000c VSID = 400039f536000d90
      [ 3003.329417]   ESID = 000000001800000d VSID = 0001251b35150d90
      [ 3003.329417]   ESID = 000001000800000e VSID = 4001e46090000d90
      [ 3003.329418]   ESID = d000080008000019 VSID = 40013d349c000400
      [ 3003.329419] lpcr = c048800001847001 sdr1 = 0000001b19000006 last_inst = ffffffff
      [ 3003.329421] trap=0xe60 | pc=0xc0000000000c2ba0 | msr=0x8000000000009032
      [ 3003.329524] Severe Hypervisor Maintenance interrupt [Recovered]
      [ 3003.329526]  Error detail: Timer facility experienced an error
      [ 3003.329527] 	HMER: 0840000000000000
      [ 3003.329527] 	TFMR: 4a12000980a94000
      [ 3006.359786] Severe Hypervisor Maintenance interrupt [Recovered]
      [ 3006.359792]  Error detail: Timer facility experienced an error
      [ 3006.359795] 	HMER: 0840000000000000
      [ 3006.359797] 	TFMR: 4a12000980a84000
      
       Id    Name                           State
      ----------------------------------------------------
       2     guest2                         running
       3     guest3                         paused
       4     guest4                         running
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      dee6f24c
    • A
      KVM: PPC: Book3S HV: Add missing HPTE unlock · f6fb9e84
      Aneesh Kumar K.V 提交于
      In kvm_test_clear_dirty(), if we find an invalid HPTE we move on to the
      next HPTE without unlocking the invalid one.  In fact we should never
      find an invalid and unlocked HPTE in the rmap chain, but for robustness
      we should unlock it.  This adds the missing unlock.
      Reported-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      f6fb9e84
    • A
      KVM: PPC: BookE: Improve irq inject tracepoint · b6b61257
      Alexander Graf 提交于
      When injecting an IRQ, we only document which IRQ priority (which translates
      to IRQ type) gets injected. However, when reading traces you don't necessarily
      have all the numbers in your head to know which IRQ really is meant.
      
      This patch converts the IRQ number field to a symbolic name that is in sync
      with the respective define. That way it's a lot easier for readers to figure
      out what interrupt gets injected.
      Signed-off-by: NAlexander Graf <agraf@suse.de>
      b6b61257
  2. 29 9月, 2014 1 次提交
  3. 24 9月, 2014 1 次提交
    • A
      kvm: Fix page ageing bugs · 57128468
      Andres Lagar-Cavilla 提交于
      1. We were calling clear_flush_young_notify in unmap_one, but we are
      within an mmu notifier invalidate range scope. The spte exists no more
      (due to range_start) and the accessed bit info has already been
      propagated (due to kvm_pfn_set_accessed). Simply call
      clear_flush_young.
      
      2. We clear_flush_young on a primary MMU PMD, but this may be mapped
      as a collection of PTEs by the secondary MMU (e.g. during log-dirty).
      This required expanding the interface of the clear_flush_young mmu
      notifier, so a lot of code has been trivially touched.
      
      3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate
      the access bit by blowing the spte. This requires proper synchronizing
      with MMU notifier consumers, like every other removal of spte's does.
      Signed-off-by: NAndres Lagar-Cavilla <andreslc@google.com>
      Acked-by: NRik van Riel <riel@redhat.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      57128468
  4. 22 9月, 2014 20 次提交
  5. 03 9月, 2014 1 次提交
    • L
      powerpc/kvm/cma: Fix panic introduces by signed shift operation · 02a68d05
      Laurent Dufour 提交于
      fc95ca72 introduces a memset in
      kvmppc_alloc_hpt since the general CMA doesn't clear the memory it
      allocates.
      
      However, the size argument passed to memset is computed from a signed value
      and its signed bit is extended by the cast the compiler is doing. This lead
      to extremely large size value when dealing with order value >= 31, and
      almost all the memory following the allocated space is cleaned. As a
      consequence, the system is panicing and may even fail spawning the kdump
      kernel.
      
      This fix makes use of an unsigned value for the memset's size argument to
      avoid sign extension. Among this fix, another shift operation which may
      lead to signed extended value too is also fixed.
      
      Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NLaurent Dufour <ldufour@linux.vnet.ibm.com>
      Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
      02a68d05
  6. 29 8月, 2014 2 次提交
  7. 27 8月, 2014 2 次提交
    • T
      Revert "powerpc: Replace __get_cpu_var uses" · 23f66e2d
      Tejun Heo 提交于
      This reverts commit 5828f666 due to
      build failure after merging with pending powerpc changes.
      
      Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.auSigned-off-by: NTejun Heo <tj@kernel.org>
      Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      23f66e2d
    • C
      powerpc: Replace __get_cpu_var uses · 5828f666
      Christoph Lameter 提交于
      __get_cpu_var() is used for multiple purposes in the kernel source. One of
      them is address calculation via the form &__get_cpu_var(x).  This calculates
      the address for the instance of the percpu variable of the current processor
      based on an offset.
      
      Other use cases are for storing and retrieving data from the current
      processors percpu area.  __get_cpu_var() can be used as an lvalue when
      writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store
      and retrieve operations could use a segment prefix (or global register on
      other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a
      percpu area and use optimized assembly code to read and write per cpu
      variables.
      
      This patch converts __get_cpu_var into either an explicit address
      calculation using this_cpu_ptr() or into a use of this_cpu operations that
      use the offset.  Thereby address calculations are avoided and less registers
      are used when code is generated.
      
      At the end of the patch set all uses of __get_cpu_var have been removed so
      the macro is removed too.
      
      The patch set includes passes over all arches as well. Once these operations
      are used throughout then specialized macros can be defined in non -x86
      arches as well in order to optimize per cpu access by f.e.  using a global
      register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processors instance of a per cpu
      variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y)
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y)
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	__this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++
      
         Converts to
      
      	__this_cpu_inc(y)
      
      tj: Folded a fix patch.
          http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      CC: Paul Mackerras <paulus@samba.org>
      Signed-off-by: NChristoph Lameter <cl@linux.com>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      5828f666
  8. 22 8月, 2014 1 次提交
  9. 19 8月, 2014 1 次提交
  10. 07 8月, 2014 2 次提交
  11. 05 8月, 2014 5 次提交