1. 20 6月, 2016 2 次提交
    • M
      s390/mm: add shadow gmap support · 4be130a0
      Martin Schwidefsky 提交于
      For a nested KVM guest the outer KVM host needs to create shadow
      page tables for the nested guest. This patch adds the basic support
      to the guest address space (gmap) code.
      
      For each guest address space the inner KVM host creates, the first
      outer KVM host needs to create shadow page tables. The address space
      is identified by the ASCE loaded into the control register 1 at the
      time the inner SIE instruction for the second nested KVM guest is
      executed. The outer KVM host creates the shadow tables starting with
      the table identified by the ASCE on a on-demand basis. The outer KVM
      host will get repeated faults for all the shadow tables needed to
      run the second KVM guest.
      
      While a shadow page table for the second KVM guest is active the access
      to the origin region, segment and page tables needs to be restricted
      for the first KVM guest. For region and segment and page tables the first
      KVM guest may read the memory, but write attempt has to lead to an
      unshadow.  This is done using the page invalid and read-only bits in the
      page table of the first KVM guest. If the first guest re-accesses one of
      the origin pages of a shadow, it gets a fault and the affected parts of
      the shadow page table hierarchy needs to be removed again.
      
      PGSTE tables don't have to be shadowed, as all interpretation assist can't
      deal with the invalid bits in the shadow pte being set differently than
      the original ones provided by the first KVM guest.
      
      Many bug fixes and improvements by David Hildenbrand.
      Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      4be130a0
    • M
      s390/mm: use RCU for gmap notifier list and the per-mm gmap list · 8ecb1a59
      Martin Schwidefsky 提交于
      The gmap notifier list and the gmap list in the mm_struct change rarely.
      Use RCU to optimize the reader of these lists.
      Reviewed-by: NDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      8ecb1a59
  2. 21 4月, 2016 1 次提交
    • G
      s390/mm: fix asce_bits handling with dynamic pagetable levels · 723cacbd
      Gerald Schaefer 提交于
      There is a race with multi-threaded applications between context switch and
      pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
      mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
      pagetable upgrade on another thread in crst_table_upgrade() could already
      have set new asce_bits, but not yet the new mm->pgd. This would result in a
      corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
      translation exception.
      
      Fix this by storing the complete asce instead of just the asce_bits, which
      can then be read atomically from switch_mm(), so that it either sees the
      old value or the new value, but no mixture. Both cases are OK. Having the
      old value would result in a page fault on access to the higher level memory,
      but the fault handler would see the new mm->pgd, if it was a valid access
      after the mmap on the other thread has completed. So as worst-case scenario
      we would have a page fault loop for the racing thread until the next time
      slice.
      
      Also remove dead code and simplify the upgrade/downgrade path, there are no
      upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
      There are also no concurrent upgrades, because the mmap_sem is held with
      down_write() in do_mmap, so the flush and table checks during upgrade can
      be removed.
      Reported-by: NMichael Munday <munday@ca.ibm.com>
      Reviewed-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NGerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      723cacbd
  3. 08 3月, 2016 1 次提交