1. 01 March 2015 · 1 commit
    • mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines · c07af4f1
      Authored by Kirill A. Shutemov
      Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these
      page table levels are folded. Usually these defines are provided by
      <asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
      
      But some architectures fold page table levels in a custom way. They
      need to define these macros themselves. This patch adds the missing
      defines.
      
      The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
      and __pud_alloc() on architectures without these page table levels.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c07af4f1
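      For reference, the fix itself is tiny; a minimal sketch of what such
      an architecture's pgtable.h ends up providing (placement and comment
      are illustrative, not the literal patch):

          /* Tell core mm that the pud and pmd levels are folded, so it
           * neither accounts nr_pmds nor calls __pmd_alloc()/__pud_alloc(). */
          #define __PAGETABLE_PUD_FOLDED
          #define __PAGETABLE_PMD_FOLDED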
  2. 12 February 2015 · 1 commit
  3. 11 February 2015 · 1 commit
  4. 22 January 2015 · 1 commit
  5. 27 October 2014 · 5 commits
    • s390/mm: pmdp_get_and_clear_full optimization · fcbe08d6
      Authored by Martin Schwidefsky
      Analogous to ptep_get_and_clear_full, define a variant of the
      pmdp_get_and_clear primitive which gets the "full" hint from the
      mmu_gather struct. This allows s390 to avoid a costly instruction
      when destroying an address space.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      fcbe08d6
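      The generic fallback for such a primitive typically just ignores the
      hint; a sketch along the asm-generic pattern (config guards omitted,
      era-appropriate signature assumed):

          #ifndef __HAVE_ARCH_PMDP_GET_AND_CLEAR_FULL
          /* Architectures that cannot exploit the "full" hint (the address
           * space is being torn down) fall back to pmdp_get_and_clear(). */
          static inline pmd_t pmdp_get_and_clear_full(struct mm_struct *mm,
                                                      unsigned long addr,
                                                      pmd_t *pmdp, int full)
          {
                  return pmdp_get_and_clear(mm, addr, pmdp);
          }
          #endif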
    • s390/ftrace,kprobes: allow to patch first instruction · c933146a
      Authored by Heiko Carstens
      If the function tracer is enabled, allow setting kprobes on the first
      instruction of a function (which is the function trace caller):
      
      If no kprobe is set, enabling or disabling function tracing of a
      function simply patches the first instruction. Either it is a nop
      (right now an unconditional branch, which skips the mcount block),
      or it is a branch to the ftrace_caller() function.
      
      If a kprobe is placed on a function trace caller instruction, we
      encode whether the displaced instruction was actually a nop or a
      branch in the remaining bytes after the breakpoint instruction (an
      illegal opcode). This is possible since the instruction used for the
      nop and the branch is six bytes long, while the breakpoint is only
      two bytes: the first two bytes contain the illegal opcode and the
      last four bytes contain either "0" for nop or "1" for branch. The
      kprobes code will then execute/simulate the correct instruction.
      
      Instruction patching for kprobes and the function tracer is always
      done with stop_machine(). Therefore we don't have any races where an
      instruction is patched concurrently on a different CPU.
      Besides that, the program check handler which executes the function
      trace caller instruction won't be executed concurrently with any
      stop_machine() execution either.
      
      This allows us to keep the full fault-based kprobes handling, which
      generates correct pt_regs contents automatically.
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      c933146a
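      A sketch of the byte layout described above (struct and constant are
      illustrative, not the patch's code; 0x0002 is the illegal opcode that
      s390 kprobes traditionally uses as its breakpoint):

          #include <linux/types.h>

          #define BREAKPOINT_INSN 0x0002  /* illegal opcode, 2 bytes */

          struct patched_slot {           /* the 6-byte patch site */
                  u16 opc;                /* breakpoint goes here */
                  s32 disp;               /* 0 = was nop, 1 = was branch */
          };

          /* Arm a kprobe on the function trace caller: the first two bytes
           * become the breakpoint, the remaining four remember whether the
           * displaced instruction was the nop or the branch to
           * ftrace_caller(). */
          static void arm_kprobe_on_trace_caller(struct patched_slot *slot,
                                                 bool was_branch)
          {
                  slot->opc  = BREAKPOINT_INSN;
                  slot->disp = was_branch ? 1 : 0;
          }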
    • s390/mm: disable KSM for storage key enabled pages · 3ac8e380
      Authored by Dominik Dingel
      When storage keys are enabled, unmerge already merged pages and
      prevent new pages from being merged.
      Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      3ac8e380
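      Unmerging typically means walking every VMA and asking KSM to split
      its merged pages while clearing VM_MERGEABLE; a minimal sketch built
      on ksm_madvise() (function name and locking simplified, error paths
      trimmed):

          #include <linux/ksm.h>
          #include <linux/mm.h>
          #include <linux/mman.h>

          /* Undo existing merges and forbid new ones for the whole mm. */
          static int unmerge_all_ksm_pages(struct mm_struct *mm)
          {
                  struct vm_area_struct *vma;
                  int rc = 0;

                  down_write(&mm->mmap_sem);
                  for (vma = mm->mmap; vma && !rc; vma = vma->vm_next)
                          rc = ksm_madvise(vma, vma->vm_start, vma->vm_end,
                                           MADV_UNMERGEABLE, &vma->vm_flags);
                  up_write(&mm->mmap_sem);
                  return rc;
          }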
    • s390/mm: prevent and break zero page mappings in case of storage keys · 2faee8ff
      Authored by Dominik Dingel
      As soon as storage keys are enabled we need to stop working on zero page
      mappings to prevent inconsistencies between storage keys and pgste.
      
      Otherwise the following data corruption could happen:
      1) guest enables storage key
      2) guest sets storage key for not mapped page X
         -> change goes to PGSTE
      3) guest reads from page X
         -> as X was not dirty before, the page will be zero page backed,
            storage key from PGSTE for X will go to storage key for zero page
      4) guest sets storage key for not mapped page Y (same logic as above)
      5) guest reads from page Y
         -> as Y was not dirty before, the page will be zero page backed,
            storage key from PGSTE for Y will go to storage key for zero page
            overwriting storage key for X
      
      While holding the mmap_sem, we are safe against changes to entries we
      have already fixed, as every fault would need to take the mmap_sem
      (read).
      
      Other vCPUs executing storage key instructions will get a one-time
      interception and also be serialized on the mmap_sem.
      Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      2faee8ff
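      The "break existing mappings" half can be expressed as a page-walk
      callback that zaps any PTE still pointing at the shared zero page,
      forcing a refault into a real, key-capable page; a hedged sketch
      (callback name hypothetical, locking and TLB details simplified).
      Preventing new mappings is the other half, for which the series adds
      a per-mm mm_forbids_zeropage() check to the fault path:

          /* Zap zero-page PTEs so the next access faults in a real page. */
          static int zap_zero_pte(pte_t *pte, unsigned long addr,
                                  unsigned long next, struct mm_walk *walk)
          {
                  pte_t entry = *pte;

                  if (pte_present(entry) && is_zero_pfn(pte_pfn(entry)))
                          ptep_get_and_clear(walk->mm, addr, pte);
                  return 0;
          }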
    • s390/mm: refactor global pgste updates · a13cff31
      Authored by Dominik Dingel
      Replace the s390-specific page table walker for the pgste updates
      with a call to the common code walk_page_range() function. There are
      now two pte modification functions, one for resetting the CMMA state
      and another for initializing the storage keys.
      Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      a13cff31
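      The refactored shape is the standard walker pattern: hang a pte_entry
      callback off a struct mm_walk and let walk_page_range() do the
      pgd/pud/pmd iteration. A sketch with an illustrative callback (the
      actual pgste update is elided; the walker API of that era is assumed):

          #include <linux/mm.h>

          static int reset_cmma_pte(pte_t *pte, unsigned long addr,
                                    unsigned long next, struct mm_walk *walk)
          {
                  /* reset the CMMA state kept in the pgste of *pte here */
                  return 0;
          }

          static void reset_cmma(struct mm_struct *mm)
          {
                  struct mm_walk walk = {
                          .pte_entry = reset_cmma_pte,
                          .mm = mm,
                  };

                  down_write(&mm->mmap_sem);
                  walk_page_range(0, TASK_SIZE, &walk);
                  up_write(&mm->mmap_sem);
          }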
  6. 30 September 2014 · 1 commit
  7. 25 September 2014 · 1 commit
  8. 02 September 2014 · 2 commits
  9. 26 August 2014 · 3 commits
  10. 25 August 2014 · 3 commits
  11. 01 August 2014 · 1 commit
    • s390/mm: implement dirty bits for large segment table entries · 152125b7
      Authored by Martin Schwidefsky
      The large segment table entry format has a block of bits for the
      ACC/F values of the large page. These bits are valid only if another
      bit (the AV bit, 0x10000) of the segment table entry is set; the
      ACC/F bits have no meaning while the AV bit is off. This makes it
      possible to put the THP splitting bit, the segment young bit and the
      new segment dirty bit into the ACC/F bits, as long as the AV bit
      stays off. The dirty and young information is only available if the
      pmd is large.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      152125b7
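      In header terms this is just software bit definitions carved out of
      the hardware-unused positions, plus the usual accessors; a sketch
      with assumed bit values (they mirror the commit's idea, not
      necessarily its exact encodings):

          /* SW bits living in the ACC/F field; only legal while AV is off. */
          #define _SEGMENT_ENTRY_DIRTY    0x2000  /* SW segment dirty bit */
          #define _SEGMENT_ENTRY_YOUNG    0x1000  /* SW segment young bit */

          static inline pmd_t pmd_mkdirty(pmd_t pmd)
          {
                  pmd_val(pmd) |= _SEGMENT_ENTRY_DIRTY;
                  return pmd;
          }

          static inline int pmd_dirty(pmd_t pmd)
          {
                  return (pmd_val(pmd) & _SEGMENT_ENTRY_DIRTY) != 0;
          }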
  12. 22 April 2014 · 4 commits
  13. 03 April 2014 · 1 commit
    • s390/mm,tlb: optimize TLB flushing for zEC12 · 1b948d6c
      Authored by Martin Schwidefsky
      The zEC12 machines introduced the local-clearing control for the
      IDTE and IPTE instructions. If the control is set, only the TLB of
      the local CPU is cleared of entries: either all entries of a single
      address space for IDTE, or the entry for a single page-table entry
      for IPTE. Without the local-clearing control the TLB flush is
      broadcast to all CPUs in the configuration, which is expensive.
      
      Resetting the bit mask of the CPUs that need flushing after a
      non-local IDTE is tricky. As TLB entries for an address space remain
      in the TLB even after the address space is detached, a new bit field
      is required to keep track of attached CPUs vs. CPUs in need of a
      flush. After a non-local flush with IDTE, the bit field of attached
      CPUs is copied to the bit field of CPUs in need of a flush. The
      ordering of operations on cpu_attach_mask, attach_count and
      mm_cpumask(mm) is such that an under-indication in mm_cpumask(mm) is
      prevented, while an over-indication in mm_cpumask(mm) is possible.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      1b948d6c
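      The flush decision then boils down to "am I the only attached CPU?";
      a hedged sketch of its shape (the cpu_attach_mask field follows the
      commit text, while the __tlb_flush_local()/__tlb_flush_global()
      helper names and surrounding code are illustrative):

          static inline void tlb_flush_mm_sketch(struct mm_struct *mm)
          {
                  preempt_disable();
                  if (cpumask_equal(mm_cpumask(mm),
                                    cpumask_of(smp_processor_id()))) {
                          __tlb_flush_local();   /* IDTE, local-clearing bit */
                  } else {
                          __tlb_flush_global();  /* broadcast IDTE, expensive */
                          /* every attached CPU may refill its TLB again */
                          cpumask_copy(mm_cpumask(mm),
                                       &mm->context.cpu_attach_mask);
                  }
                  preempt_enable();
          }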
  14. 21 March 2014 · 2 commits
  15. 21 February 2014 · 2 commits
    • s390/kvm: support collaborative memory management · b31288fa
      Authored by Konstantin Weitz
      This patch enables Collaborative Memory Management (CMM) for kvm on
      s390. CMM allows the guest to inform the host about page usage (see
      arch/s390/mm/cmm.c). The host uses this information to avoid swapping
      in unused pages in the page fault handler. Further, a CPU-provided
      list of unused, invalid pages is processed to reclaim the swap space
      of unused pages that have not yet been accessed.
      
      [ Martin Schwidefsky: patch reordering and cleanup ]
      Signed-off-by: Konstantin Weitz <konstantin.weitz@gmail.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      b31288fa
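      A sketch of the fault-path shortcut this enables (the helpers here
      are entirely hypothetical, not the patch's code; the real per-page
      state lives in the pgste and is set by the guest via the ESSA
      instruction):

          /* Hypothetical predicate: did the guest declare this page unused? */
          static bool guest_declared_unused(struct mm_struct *mm,
                                            unsigned long addr)
          {
                  return false;  /* placeholder; real state is in the pgste */
          }

          static struct page *cmm_fault_page(struct mm_struct *mm,
                                             unsigned long addr, gfp_t gfp)
          {
                  if (guest_declared_unused(mm, addr))
                          return alloc_page(gfp | __GFP_ZERO); /* no swap-in */
                  return NULL;   /* fall back to the normal swap-in path */
          }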
    • s390/mm,tlb: race of lazy TLB flush vs. recreation of TLB entries · 53e857f3
      Authored by Martin Schwidefsky
      Git commit 050eef36 "[S390] fix tlb flushing vs. concurrent
      /proc accesses" introduced the attach counter to avoid using the
      mm_users value to decide between IPTE for every PTE and lazy TLB
      flushing with IDTE. That fixed the problem with mm_users but it
      introduced another subtle race, fortunately one that is very hard
      to hit.
      The background is the architecture's requirement that a valid PTE
      may not be changed while it can be used concurrently by another CPU.
      The decision between IPTE and lazy TLB flushing needs to be made
      while the PTE is still valid. Now if the virtual CPU is temporarily
      stopped after the decision to use lazy TLB flushing but before the
      invalid bit of the PTE has been set, another CPU can attach the mm,
      find that flush_mm is set, do the IDTE, return to userspace, and
      recreate a TLB entry that uses the PTE in question. When the first,
      stopped CPU continues, it will change the PTE while it is attached
      on another CPU. The first CPU will do another IDTE shortly after the
      modification of the PTE, which makes the race window quite short.
      
      To fix this race the CPU that wants to attach the address space of a
      user space thread needs to wait for the end of the PTE modification.
      The number of concurrent TLB flushers for an mm is tracked in the
      upper 16 bits of the attach_count and finish_arch_post_lock_switch
      is used to wait for the end of the flush operation if required.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      53e857f3
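      A sketch of the wait the fix introduces (shape follows the commit
      text: flushers are counted in the upper 16 bits of attach_count and
      an attaching CPU spins until none are in flight; names and code are
      illustrative):

          /* Called from finish_arch_post_lock_switch() before the new mm
           * is considered attached: wait for in-flight PTE invalidations. */
          static inline void wait_for_pte_flushers(struct mm_struct *mm)
          {
                  while (atomic_read(&mm->context.attach_count) >> 16)
                          cpu_relax();
          }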
  16. 30 January 2014 · 1 commit
  17. 15 October 2013 · 1 commit
  18. 29 August 2013 · 1 commit
    • s390/mm: implement software referenced bits · 0944fe3f
      Authored by Martin Schwidefsky
      The last remaining use for the storage key of the s390 architecture
      is reference counting. The alternative is to make page table entries
      invalid while they are old: on access, the fault handler marks the
      pte/pmd as young, which makes the pte/pmd valid if the access rights
      allow read access. The pte/pmd invalidations required for
      software-managed referenced bits cost a bit of performance; on the
      other hand, the RRBE/RRBM instructions to read and reset the
      referenced bits are quite expensive as well.
      Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      0944fe3f
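      The mechanics fit in a pair of pte helpers: "old" means invalid, and
      making a pte young revalidates it if it is readable. A sketch of what
      such helpers look like (bit names assumed: _PAGE_YOUNG and _PAGE_READ
      as software bits, _PAGE_INVALID as the hardware invalid bit):

          static inline pte_t pte_mkyoung(pte_t pte)
          {
                  pte_val(pte) |= _PAGE_YOUNG;
                  /* a young, readable pte may become valid again */
                  if (pte_val(pte) & _PAGE_READ)
                          pte_val(pte) &= ~_PAGE_INVALID;
                  return pte;
          }

          static inline pte_t pte_mkold(pte_t pte)
          {
                  pte_val(pte) &= ~_PAGE_YOUNG;
                  /* old ptes are kept invalid so the next access faults */
                  pte_val(pte) |= _PAGE_INVALID;
                  return pte;
          }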
  19. 28 August 2013 · 1 commit
  20. 22 August 2013 · 4 commits
  21. 29 July 2013 · 1 commit
  22. 29 June 2013 · 1 commit
  23. 20 June 2013 · 1 commit