  1. 14 Jan, 2011 2 commits
  2. 25 Oct, 2010 1 commit
  3. 24 Aug, 2010 1 commit
    • x86, mm: Avoid unnecessary TLB flush · 61c77326
      Shaohua Li authored
      On x86, the CPU sets the accessed and dirty bits automatically when it
      accesses memory. By the time we reach the flush_tlb_fix_spurious_fault()
      code path below, the dirty bit is already set in the pte, so no TLB flush
      is needed. Some CPUs may still hold a TLB entry without the dirty bit set,
      but that does not matter: when those CPUs write to the page, they check
      the bit themselves, with no software involved.
      
      On the other hand, flushing the TLB at this point is harmful. The test
      creates one thread per CPU; every thread writes to the same, randomly
      chosen address within the same vma range, and we measure the total time.
      On a 4-socket system the time drops from 1.96s to 0.8s with the patch; on
      a 2-socket system it drops by 20% as well. perf shows that much of the
      time is spent sending and handling IPIs for TLB flushes.
      Signed-off-by: Shaohua Li <shaohua.li@intel.com>
      LKML-Reference: <20100816011655.GA362@sli10-desk.sh.intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Andrea Archangeli <aarcange@redhat.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
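
The change described above can be pictured as an override hook: generic code supplies a default that flushes, and an architecture whose CPUs set the dirty bit themselves defines the hook away. The following is a minimal userspace sketch of that pattern, not the kernel source. Every `demo_` name and the `DEMO_ARCH_SETS_DIRTY_IN_HW` switch are hypothetical, and the TLB flush is simulated by a counter.

```c
#include <assert.h>

/* Simulated TLB flush: counts calls instead of touching hardware. */
static unsigned long demo_tlb_flushes;

static void demo_flush_tlb_page(unsigned long address)
{
    (void)address;
    demo_tlb_flushes++;
}

/*
 * "Architecture" part (hypothetical switch): a CPU that sets the dirty
 * bit in hardware can define the hook to do nothing.
 */
#ifdef DEMO_ARCH_SETS_DIRTY_IN_HW
#define demo_fix_spurious_fault(address) do { (void)(address); } while (0)
#endif

/* "Generic" part: fall back to a real flush unless the arch overrode it. */
#ifndef demo_fix_spurious_fault
#define demo_fix_spurious_fault(address) demo_flush_tlb_page(address)
#endif

/* Called when a write fault found the pte already up to date. */
static void demo_handle_spurious_fault(unsigned long address)
{
    demo_fix_spurious_fault(address);
}
```

Built as shown, each spurious fault costs one simulated flush. Building with `-DDEMO_ARCH_SETS_DIRTY_IN_HW` turns the hook into a no-op, which is the effect the commit message attributes to the x86 change.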
  4. 23 Jun, 2009 1 commit
    • asm-generic: add dummy pgprot_noncached() · 0634a632
      Paul Mundt authored
      Most architectures now provide a pgprot_noncached(); the
      remaining ones can simply use a dummy default implementation,
      except for cris and xtensa, which should override the
      default appropriately.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Magnus Damm <magnus.damm@gmail.com>
  5. 30 Mar, 2009 2 commits
  6. 14 Jan, 2009 1 commit
  7. 20 Dec, 2008 1 commit
  8. 19 Dec, 2008 1 commit
  9. 16 Jul, 2008 1 commit
    • mm: fix build on non-mmu machines · fe1a6875
      Sebastian Siewior authored
      Commit 1ea0704e aka "mm: add a ptep_modify_prot transaction abstraction"
      
      caused:
      
      |  CC      init/main.o
      |In file included from include2/asm/pgtable.h:68,
      |                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/mm.h:39,
      |                 from include2/asm/uaccess.h:8,
      |                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/poll.h:13,
      |                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/rtc.h:113,
      |                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/efi.h:19,
      |                 from /home/bigeasy/git/linux-2.6-m68k/init/main.c:43:
      |/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_start':
      |/linux-2.6/include/asm-generic/pgtable.h:209: error: implicit declaration of function 'ptep_get_and_clear'
      |/linux-2.6/include/asm-generic/pgtable.h:209: error: incompatible types in return
      |/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_commit':
      |/linux-2.6/include/asm-generic/pgtable.h:220: error: implicit declaration of function 'set_pte_at'
      |make[2]: *** [init/main.o] Error 1
      |make[1]: *** [init] Error 2
      |make: *** [sub-make] Error 2
      
      on my m68knommu box.
      Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 25 Jun, 2008 1 commit
    • mm: add a ptep_modify_prot transaction abstraction · 1ea0704e
      Jeremy Fitzhardinge authored
      This patch adds an API for doing read-modify-write updates to a pte's
      protection bits which may race against hardware updates to the pte.
      After reading the pte, the hardware may asynchronously set the accessed
      or dirty bits on a pte, which would be lost when writing back the
      modified pte value.
      
      The existing technique to handle this race is to use
      ptep_get_and_clear() to atomically fetch the old pte value and clear it
      in memory.  This has the effect of marking the pte as non-present,
      which will prevent the hardware from updating its state.  When the new
      value is written back, the pte will be present again, and the hardware
      can resume updating the access/dirty flags.
      
      When running in a virtualized environment, pagetable updates are
      relatively expensive, since they generally involve some trap into the
      hypervisor.  To mitigate the cost of these updates, we tend to batch
      them.
      
      However, because of the atomic nature of ptep_get_and_clear(), it is
      inherently non-batchable.  This new interface allows batching by
      giving the underlying implementation enough information to open a
      transaction between the read and write phases:
      
      ptep_modify_prot_start() returns the current pte value, and puts the
        pte entry into a state where either the hardware will not update the
        pte, or if it does, the updates will be preserved on commit.
      
      ptep_modify_prot_commit() writes back the updated pte and makes sure that
        any hardware updates made since ptep_modify_prot_start() are
        preserved.
      
      ptep_modify_prot_start() and _commit() must be exactly paired, and
      used while holding the appropriate pte lock.  They do not protect
      against other software updates of the pte in any way.
      
      The current implementations of ptep_modify_prot_start and _commit are
      functionally unchanged from before: _start() uses ptep_get_and_clear()
      to fetch the pte and zero the entry, preventing any hardware updates.
      _commit() simply writes the new pte value back knowing that the
      hardware has not updated the pte in the meantime.
      
      The only current user of this interface is mprotect.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
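
The start/commit pairing described in this commit message can be modeled in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel implementation: the `demo_` names, the bit values and the 64-bit pte layout are all hypothetical, and `demo_hw_set_bits()` stands in for the hardware page walker, which only updates entries that are present.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t demo_pte_t;                 /* hypothetical pte value */
typedef _Atomic demo_pte_t demo_pte_slot_t;  /* hypothetical pte slot */

#define DEMO_PTE_PRESENT 0x01u
#define DEMO_PTE_WRITE   0x02u
#define DEMO_PTE_DIRTY   0x40u

/* Model of the hardware walker: it only updates present entries. */
static void demo_hw_set_bits(demo_pte_slot_t *ptep, demo_pte_t bits)
{
    demo_pte_t old = atomic_load(ptep);

    while ((old & DEMO_PTE_PRESENT) &&
           !atomic_compare_exchange_weak(ptep, &old, old | bits))
        ;   /* 'old' was refreshed by the failed exchange; retry */
}

/* Start: fetch the old value and leave the entry non-present. */
static demo_pte_t demo_modify_prot_start(demo_pte_slot_t *ptep)
{
    return atomic_exchange(ptep, 0);
}

/* Commit: the entry is still clear, so a plain store loses nothing. */
static void demo_modify_prot_commit(demo_pte_slot_t *ptep, demo_pte_t pte)
{
    assert(atomic_load(ptep) == 0);   /* start and commit must be paired */
    atomic_store(ptep, pte);
}

/* mprotect-style update: clear some protection bits, set others. */
static demo_pte_t demo_change_prot(demo_pte_slot_t *ptep,
                                   demo_pte_t clear, demo_pte_t set)
{
    demo_pte_t pte = demo_modify_prot_start(ptep);

    pte = (pte & ~clear) | set;
    demo_modify_prot_commit(ptep, pte);
    return pte;
}
```

Because the slot reads as non-present between start and commit, the modeled hardware cannot set a dirty bit that the commit would then overwrite. As in the commit message, nothing here protects against other software writers; a real caller would hold the pte lock.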
  11. 17 Oct, 2007 1 commit
  12. 12 Aug, 2007 1 commit
  13. 18 Jul, 2007 2 commits
  14. 17 Jun, 2007 1 commit
  15. 27 Apr, 2007 1 commit
    • [S390] split page_test_and_clear_dirty. · 6c210482
      Martin Schwidefsky authored
      The page_test_and_clear_dirty primitive really consists of two
      operations, page_test_dirty and the page_clear_dirty. The combination
      of the two is not an atomic operation, so it makes more sense to have
      two separate operations instead of one.
      In addition to the improved readability of the s390 version of
      SetPageUptodate, it now avoids the page_test_dirty operation, which is
      an insert-storage-key-extended (iske) instruction and therefore
      expensive.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
  16. 09 Apr, 2007 1 commit
  17. 13 Feb, 2007 1 commit
    • [PATCH] i386: paravirt CPU hypercall batching mode · 9226d125
      Zachary Amsden authored
      The VMI ROM has a mode where hypercalls can be queued and batched.  This turns
      out to be a significant win during context switch, but must be done at a
      specific point before side effects to CPU state are visible to subsequent
      instructions.  This is similar to the MMU batching hooks already provided.
      The same hooks could be used by the Xen backend to implement a context switch
      multicall.
      
      To explain a bit more about lazy modes in the paravirt patches: only one
      of lazy CPU mode or lazy MMU mode can be active at any given time.  Lazy
      MMU mode is similar to this lazy CPU mode and allows batching of multiple
      PTE updates (say, inside a remap loop).  To avoid keeping a state machine
      that tracks when to flush CPU or MMU updates, we simply allow one or the
      other to be active.  Although there is no real reason a more
      comprehensive scheme could not be implemented, there is also no
      demonstrated need for the extra complexity.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
  18. 01 Oct, 2006 3 commits
    • [PATCH] paravirt: remove set pte atomic · a93cb055
      Zachary Amsden authored
      Now that ptep_establish has a definition in PAE i386 3-level paging code, the
      only paging model which is insane enough to have multi-word hardware PTEs
      which are not efficient to set atomically, we can remove the ghost of
      set_pte_atomic from other architectures which falsely duplicated it, and
      remove all knowledge of it from the generic pgtable code.
      
      set_pte_atomic is now a private pte operator which is specific to i386.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] paravirt: lazy mmu mode hooks.patch · 6606c3e0
      Zachary Amsden authored
      Implement lazy MMU update hooks which are SMP safe for both direct and shadow
      page tables.  The idea is that PTE updates and page invalidations while in
      lazy mode can be batched into a single hypercall.  We use this in VMI for
      shadow page table synchronization, and it is a win.  It also can be used by
      PPC and for direct page tables on Xen.
      
      For SMP, the enter / leave must happen under protection of the page table
      locks for page tables which are being modified.  This is because otherwise,
      you end up with stale state in the batched hypercall, which other CPUs can
      race ahead of.  Doing this under the protection of the locks guarantees the
      synchronization is correct, and also means that spurious faults which are
      generated during this window by remote CPUs are properly handled, as the page
      fault handler must re-check the PTE under protection of the same lock.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
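
The batching idea behind lazy MMU mode can be illustrated with a small queue that is drained in one step when lazy mode is left. This is only a sketch: the `demo_` names, the fixed queue size and the `hypercalls` counter are assumptions made for illustration, and the page table locking that the commit message requires around enter and leave is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BATCH_MAX 16

struct demo_pte_update {
    uint64_t *ptep;   /* entry to write */
    uint64_t val;     /* value to write into it */
};

struct demo_mmu {
    int lazy;                   /* non-zero while in lazy MMU mode */
    size_t queued;              /* pending updates in the queue */
    unsigned long hypercalls;   /* simulated traps into the hypervisor */
    struct demo_pte_update queue[DEMO_BATCH_MAX];
};

/* Apply all queued updates at the cost of a single simulated hypercall. */
static void demo_flush_batch(struct demo_mmu *mmu)
{
    if (mmu->queued == 0)
        return;
    for (size_t i = 0; i < mmu->queued; i++)
        *mmu->queue[i].ptep = mmu->queue[i].val;
    mmu->hypercalls++;
    mmu->queued = 0;
}

static void demo_enter_lazy_mmu(struct demo_mmu *mmu)
{
    mmu->lazy = 1;
}

static void demo_leave_lazy_mmu(struct demo_mmu *mmu)
{
    demo_flush_batch(mmu);
    mmu->lazy = 0;
}

/* Outside lazy mode every update traps; inside, updates are queued. */
static void demo_set_pte(struct demo_mmu *mmu, uint64_t *ptep, uint64_t val)
{
    if (!mmu->lazy) {
        *ptep = val;
        mmu->hypercalls++;
        return;
    }
    if (mmu->queued == DEMO_BATCH_MAX)
        demo_flush_batch(mmu);
    assert(mmu->queued < DEMO_BATCH_MAX);
    mmu->queue[mmu->queued].ptep = ptep;
    mmu->queue[mmu->queued].val = val;
    mmu->queued++;
}
```

In this model, eight updates issued between enter and leave cost one simulated hypercall instead of eight. The updates stay invisible until the batch is flushed, which is why the real hooks have to run under the page table lock.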
    • [PATCH] paravirt: pte clear not present · 9888a1ca
      Zachary Amsden authored
      Change pte_clear_full to a more appropriately named pte_clear_not_present,
      allowing optimizations when not-present mapping changes need not be reflected
      in the hardware TLB for protected page table modes.  There is also another
      case that can use it in the fremap code.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  19. 26 Sep, 2006 1 commit
  20. 02 Jun, 2006 1 commit
    • [SPARC64]: Fix D-cache corruption in mremap · 0b0968a3
      David S. Miller authored
      If we move a mapping from one virtual address to another,
      and this changes the virtual color of the mapping to those
      pages, we can see corrupt data due to D-cache aliasing.
      
      Check for and deal with this by overriding the move_pte()
      macro.  Set things up so that other platforms can cleanly
      override the move_pte() macro too.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  21. 07 Nov, 2005 1 commit
  22. 30 Oct, 2005 1 commit
  23. 28 Sep, 2005 1 commit
    • [PATCH] mm: move_pte to remap ZERO_PAGE · 8b1f3124
      Nick Piggin authored
      Move the ZERO_PAGE remapping complexity to the move_pte macro in
      asm-generic, and make it conditional on
      __HAVE_ARCH_MULTIPLE_ZERO_PAGE, which is defined for MIPS.
      
      For architectures without __HAVE_ARCH_MULTIPLE_ZERO_PAGE, move_pte becomes
      a no-op.
      
      From: Hugh Dickins <hugh@veritas.com>
      
      Fix nasty little bug we've missed in Nick's mremap move ZERO_PAGE patch.
      The "pte" at that point may be a swap entry or a pte_file entry: we must
      check pte_present before perhaps corrupting such an entry.
      
      Patch below against 2.6.14-rc2-mm1, but the same bug is in 2.6.14-rc2's
      mm/mremap.c, and more dangerous there since it's affecting all arches: I
      think the safest course is to send Nick's patch and Yoichi's build fix and
      this fix (build tested) on to Linus - so only MIPS can be affected.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  24. 05 Sep, 2005 1 commit
    • [PATCH] x86: ptep_clear optimization · a600388d
      Zachary Amsden authored
      Add a new accessor for PTEs, which passes the full hint from the mmu_gather
      struct; this allows architectures with hardware pagetables to optimize away
      atomic PTE operations when destroying an address space.  Removing the
      locked operation should allow better pipelining of memory access in this
      loop.  I measured an average savings of 30-35 cycles per zap_pte_range on
      the first 500 destructions on Pentium-M, but I believe the optimization
      would win more on older processors which still assert the bus lock on xchg
      for an exclusive cacheline.
      
      Update: I made some new measurements, and this saves exactly 26 cycles over
      ptep_get_and_clear on Pentium M.  On P4, with a PAE kernel, this saves 180
      cycles per ptep_get_and_clear, for a whopping 92160 cycles savings for a
      full address space destruction.
      
      pte_clear_full is not yet used, but is provided for future optimizations
      (in particular, when running inside of a hypervisor that queues page table
      updates, the full hint allows us to avoid queueing unnecessary page table
      updates for an address space in the process of being destroyed).
      
      This is not a huge win, but it does help a bit, and sets the stage for
      further hypervisor optimization of the mm layer on all architectures.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Cc: Christoph Lameter <christoph@lameter.com>
      Cc: <linux-mm@kvack.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
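
The "full" hint can be sketched as a flag that selects between an atomic exchange and a plain load and store. The code below is a userspace model with hypothetical `zap_` names, not the kernel accessor. It assumes that when `full` is set the address space is being torn down and nothing else can update the entries, which is the condition the commit message relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef _Atomic uint64_t zap_pte_slot_t;   /* hypothetical pte slot */

/* Fetch and clear one entry; 'full' means the whole mm is going away. */
static uint64_t zap_get_and_clear_full(zap_pte_slot_t *ptep, int full)
{
    if (full) {
        /* Assumed to have no concurrent updaters: skip the locked op. */
        uint64_t old = atomic_load_explicit(ptep, memory_order_relaxed);

        atomic_store_explicit(ptep, 0, memory_order_relaxed);
        return old;
    }
    /* Live address space: the entry may change under us, so exchange. */
    return atomic_exchange(ptep, 0);
}

/* Clear n entries and report how many of them were populated. */
static size_t zap_range(zap_pte_slot_t *ptes, size_t n, int full)
{
    size_t populated = 0;

    assert(ptes != NULL);
    for (size_t i = 0; i < n; i++) {
        if (zap_get_and_clear_full(&ptes[i], full) != 0)
            populated++;
    }
    return populated;
}
```

Both paths return the same values; the difference is only whether the clear is a locked read-modify-write, which is where the cycle savings quoted in the commit message come from.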
  25. 22 Jun, 2005 1 commit
    • [PATCH] msync: check pte dirty earlier · b4955ce3
      Abhijit Karmarkar authored
      It's common practice to msync a large address range regularly, in which
      often only a few ptes have actually been dirtied since the previous pass.
      
      sync_pte_range then goes much faster if it tests whether pte is dirty
      before locating and accessing each struct page cacheline; and it is hardly
      slowed by ptep_clear_flush_dirty repeating that test in the opposite case,
      when every pte actually is dirty.
      
      But beware, s390's pte_dirty always says false, since its dirty bit is kept
      in the storage key, located via the struct page address.  So skip this
      optimization in its case: use a pte_maybe_dirty macro which just says true
      if page_test_and_clear_dirty is implemented.
      Signed-off-by: Abhijit Karmarkar <abhijitk@veritas.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
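
The early dirty test amounts to checking a bit in the pte before touching the per-page structure. Here is a minimal sketch of that loop with hypothetical `sync_`/`SYNC_` names and a made-up dirty bit value. It covers only the common case where the dirty bit lives in the pte, so the s390 storage-key case from the commit message is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYNC_PTE_DIRTY 0x40u   /* hypothetical dirty bit */

struct sync_page {
    int needs_writeback;       /* stands in for the struct page state */
};

/*
 * Walk n ptes and mark the pages behind dirty ones for writeback.
 * Returns how many page structures had to be touched.
 */
static size_t sync_pte_range(uint64_t *ptes, struct sync_page *pages, size_t n)
{
    size_t pages_touched = 0;

    assert(ptes != NULL && pages != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Cheap test on the pte first: clean entries never reach the page. */
        if (!(ptes[i] & SYNC_PTE_DIRTY))
            continue;
        ptes[i] &= ~(uint64_t)SYNC_PTE_DIRTY;   /* clear the dirty bit */
        pages[i].needs_writeback = 1;           /* only now touch the page */
        pages_touched++;
    }
    return pages_touched;
}
```

When only a few entries in a large range are dirty, the loop reads each pte but loads the cache line of a page structure only for the dirty ones.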
  26. 20 Apr, 2005 1 commit
  27. 17 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!