1. 24 5月, 2012 1 次提交
  2. 11 4月, 2012 1 次提交
    • M
      [S390] fix tlb flushing for page table pages · cd94154c
      Martin Schwidefsky 提交于
      Git commit 36409f63 "use generic RCU
      page-table freeing code" introduced a tlb flushing bug. Partially revert
      the above git commit and go back to s390 specific page table flush code.
      
      For s390 the TLB can contain three types of entries, "normal" TLB
      page-table entries, TLB combined region-and-segment-table (CRST) entries
      and real-space entries. Linux does not use real-space entries which
      leaves normal TLB entries and CRST entries. The CRST entries are
      intermediate steps in the page-table translation called translation paths.
      For example a 4K page access in a three-level page table setup will
      create two CRST TLB entries and one page-table TLB entry. The advantage
      of that approach is that a page access next to the previous one can reuse
      the CRST entries and needs just a single read from memory to create the
      page-table TLB entry. The disadvantage is that the TLB flushing rules are
      more complicated, before any page-table may be freed the TLB needs to be
      flushed.
      
      In short: the generic RCU page-table freeing code is incorrect for the
      CRST entries, in particular the check for mm_users < 2 is troublesome.
      
      This is applicable to 3.0+ kernels.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      cd94154c
  3. 06 6月, 2011 1 次提交
    • M
      [S390] use generic RCU page-table freeing code · 36409f63
      Martin Schwidefsky 提交于
      Replace the s390 specific rcu page-table freeing code with the
      generic variant. This requires to duplicate the definition for the
      struct mmu_table_batch as s390 does not use the generic tlb flush
      code.
      
      While we are at it remove the restriction that page table fragments
      can not be reused after a single fragment has been freed with rcu
      and split out allocation and freeing of page tables with pgstes.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      36409f63
  4. 25 5月, 2011 1 次提交
  5. 31 1月, 2011 1 次提交
  6. 25 10月, 2010 1 次提交
  7. 24 8月, 2010 1 次提交
    • M
      [S390] fix tlb flushing vs. concurrent /proc accesses · 050eef36
      Martin Schwidefsky 提交于
      The tlb flushing code uses the mm_users field of the mm_struct to
      decide if each page table entry needs to be flushed individually with
      IPTE or if a global flush for the mm_struct is sufficient after all page
      table updates have been done. The comment for mm_users says "How many
      users with user space?" but the /proc code increases mm_users after it
      found the process structure by pid without creating a new user process.
      Which makes mm_users useless for the decision between the two tlb
      flusing methods. The current code can be confused to not flush tlb
      entries by a concurrent access to /proc files if e.g. a fork is in
      progres. The solution for this problem is to make the tlb flushing
      logic independent from the mm_users field.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      050eef36
  8. 28 7月, 2009 1 次提交
    • B
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt 提交于
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
      
      Upcoming paches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page struct page sucks
      too much, the address is almost readily available in all call sites and
      almost everybody implemets these as macros, so we may as well add the
      argument everywhere. I added it to the pmd and pud variants for consistency.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: NNick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e1b32ca
  9. 02 8月, 2008 1 次提交
  10. 10 2月, 2008 3 次提交
  11. 09 2月, 2008 1 次提交
    • M
      CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd
      Martin Schwidefsky 提交于
      Background: I've implemented 1K/2K page tables for s390.  These sub-page
      page tables are required to properly support the s390 virtualization
      instruction with KVM.  The SIE instruction requires that the page tables
      have 256 page table entries (pte) followed by 256 page status table entries
      (pgste).  The pgstes are only required if the process is using the SIE
      instruction.  The pgstes are updated by the hardware and by the hypervisor
      for a number of reasons, one of them is dirty and reference bit tracking.
      To avoid wasting memory the standard pte table allocation should return
      1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
      
      Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
      the s390 version for pte_alloc_one cannot return a pointer to a struct
      page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
      cannot return a pointer to a pte either, since that would require more than
      32 bit for the return value of pte_alloc_one (and the pte * would not be
      accessible since its not kmapped).
      
      Solution: The only solution I found to this dilemma is a new typedef: a
      pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
      later patch.  For everybody else it will be a (struct page *).  The
      additional problem with the initialization of the ptl lock and the
      NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
      a destructor pgtable_page_dtor.  The page table allocation and free
      functions need to call these two whenever a page table page is allocated or
      freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
       To get the pgtable_t back from a pmd entry that has been installed with
      pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
      call in free_pte_range and apply_to_pte_range.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f569afd
  12. 06 2月, 2008 1 次提交
  13. 22 10月, 2007 2 次提交
    • M
      [S390] 4level-fixup cleanup · 190a1d72
      Martin Schwidefsky 提交于
      Get independent from asm-generic/4level-fixup.h
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      190a1d72
    • M
      [S390] tlb flush fix. · ba8a9229
      Martin Schwidefsky 提交于
      The current tlb flushing code for page table entries violates the
      s390 architecture in a small detail. The relevant section from the
      principles of operation (SA22-7832-02 page 3-47):
      
         "A valid table entry must not be changed while it is attached
         to any CPU and may be used for translation by that CPU except to
         (1) invalidate the entry by using INVALIDATE PAGE TABLE ENTRY or
         INVALIDATE DAT TABLE ENTRY, (2) alter bits 56-63 of a page-table
         entry, or (3) make a change by means of a COMPARE AND SWAP AND
         PURGE instruction that purges the TLB."
      
      That means if one thread of a multithreaded applciation uses a vma
      while another thread does an unmap on it, the page table entries of
      that vma needs to get removed with IPTE, IDTE or CSP. In some strange
      and rare situations a cpu could check-stop (die) because a entry has
      been pushed out of the TLB that is still needed to complete a
      (milli-coded) instruction. I've never seen it happen with the current
      code on any of the supported machines, so right now this is a
      theoretical problem. But I want to fix it nevertheless, to avoid
      headaches in the futures.
      
      To get this implemented correctly without changing common code the
      primitives ptep_get_and_clear, ptep_get_and_clear_full and
      ptep_set_wrprotect need to use the IPTE instruction to invalidate the
      pte before the new pte value gets stored. If IPTE is always used for
      the three primitives three important operations will have a performace
      hit: fork, mprotect and exit_mmap. Time for some workarounds:
      
      * 1: ptep_get_and_clear_full is used in unmap_vmas to remove page
      tables entries in a batched tlb gather operation. If the mmu_gather
      context passed to unmap_vmas has been started with full_mm_flush==1
      or if only one cpu is online or if the only user of a mm_struct is the
      current process then the fullmm indication in the mmu_gather context is
      set to one. All TLBs for mm_struct are flushed by the tlb_gather_mmu
      call. No new TLBs can be created while the unmap is in progress. In
      this case ptep_get_and_clear_full clears the ptes with a simple store.
      
      * 2: ptep_get_and_clear is used in change_protection to clear the
      ptes from the page tables before they are reentered with the new
      access flags. At the end of the update flush_tlb_range clears the
      remaining TLBs. In general the ptep_get_and_clear has to issue IPTE
      for each pte and flush_tlb_range is a nop. But if there is only one
      user of the mm_struct then ptep_get_and_clear uses simple stores
      to do the update and flush_tlb_range will flush the TLBs.
      
      * 3: Similar to 2, ptep_set_wrprotect is used in copy_page_range
      for a fork to make all ptes of a cow mapping read-only. At the end of
      of copy_page_range dup_mmap will flush the TLBs with a call to
      flush_tlb_mm.  Check for mm->mm_users and if there is only one user
      avoid using IPTE in ptep_set_wrprotect and let flush_tlb_mm clear the
      TLBs.
      
      Overall for single threaded programs the tlb flush code now performs
      better, for multi threaded programs it is slightly worse. In particular
      exit_mmap() now does a single IDTE for the mm and then just frees every
      page cache reference and every page table page directly without a delay
      over the mmu_gather structure.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      ba8a9229
  14. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4