1. 09 2月, 2008 1 次提交
    • M
      CONFIG_HIGHPTE vs. sub-page page tables. · 2f569afd
      Martin Schwidefsky 提交于
      Background: I've implemented 1K/2K page tables for s390.  These sub-page
      page tables are required to properly support the s390 virtualization
      instruction with KVM.  The SIE instruction requires that the page tables
      have 256 page table entries (pte) followed by 256 page status table entries
      (pgste).  The pgstes are only required if the process is using the SIE
      instruction.  The pgstes are updated by the hardware and by the hypervisor
      for a number of reasons, one of them is dirty and reference bit tracking.
      To avoid wasting memory the standard pte table allocation should return
      1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
      
      Problem: Page size on s390 is 4K, page table size is 1K or 2K.  That means
      the s390 version for pte_alloc_one cannot return a pointer to a struct
      page.  Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
      cannot return a pointer to a pte either, since that would require more than
      32 bit for the return value of pte_alloc_one (and the pte * would not be
      accessible since its not kmapped).
      
      Solution: The only solution I found to this dilemma is a new typedef: a
      pgtable_t.  For s390 pgtable_t will be a (pte *) - to be introduced with a
      later patch.  For everybody else it will be a (struct page *).  The
      additional problem with the initialization of the ptl lock and the
      NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
      a destructor pgtable_page_dtor.  The page table allocation and free
      functions need to call these two whenever a page table page is allocated or
      freed.  pmd_populate will get a pgtable_t instead of a struct page pointer.
       To get the pgtable_t back from a pmd entry that has been installed with
      pmd_populate a new function pmd_pgtable is added.  It replaces the pmd_page
      call in free_pte_range and apply_to_pte_range.
      Signed-off-by: NMartin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      2f569afd
  2. 04 2月, 2008 1 次提交
  3. 30 1月, 2008 11 次提交
  4. 17 10月, 2007 1 次提交
  5. 11 10月, 2007 1 次提交
  6. 18 7月, 2007 1 次提交
    • M
      Add __GFP_MOVABLE for callers to flag allocations from high memory that may be migrated · 769848c0
      Mel Gorman 提交于
      It is often known at allocation time whether a page may be migrated or not.
      This patch adds a flag called __GFP_MOVABLE and a new mask called
      GFP_HIGH_MOVABLE.  Allocations using the __GFP_MOVABLE can be either migrated
      using the page migration mechanism or reclaimed by syncing with backing
      storage and discarding.
      
      An API function very similar to alloc_zeroed_user_highpage() is added for
      __GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable().  The
      flags used by alloc_zeroed_user_highpage() are not changed because it would
      change the semantics of an existing API.  After this patch is applied there
      are no in-kernel users of alloc_zeroed_user_highpage() so it probably should
      be marked deprecated if this patch is merged.
      
      Note that this patch includes a minor cleanup to the use of __GFP_ZERO in
      shmem.c to keep all flag modifications to inode->mapping in the
      shmem_dir_alloc() helper function.  This clean-up suggestion is courtesy of
      Hugh Dickens.
      
      Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the
      concept.  Credit to Hugh Dickens for catching issues with shmem swap vector
      and ramfs allocations.
      
      [akpm@linux-foundation.org: build fix]
      [hugh@veritas.com: __GFP_ZERO cleanup]
      Signed-off-by: NMel Gorman <mel@csn.ul.ie>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      769848c0
  7. 11 5月, 2007 1 次提交
  8. 09 5月, 2007 2 次提交
  9. 07 5月, 2007 1 次提交
    • L
      Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds 提交于
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function, that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e3ebadd9
  10. 03 5月, 2007 4 次提交
    • V
      [PATCH] x86-64: build-time checking · 6a50a664
      Vivek Goyal 提交于
      o X86_64 kernel should run from 2MB aligned address for two reasons.
      	- Performance.
      	- For relocatable kernels, page tables are updated based on difference
      	  between compile time address and load time physical address.
      	  This difference should be multiple of 2MB as kernel text and data
      	  is mapped using 2MB pages and PMD should be pointing to a 2MB
      	  aligned address. Life is simpler if both compile time and load time
      	  kernel addresses are 2MB aligned.
      
      o Flag the error at compile time if one is trying to build a kernel which
        does not meet alignment restrictions.
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      6a50a664
    • V
      [PATCH] x86-64: Relocatable Kernel Support · 1ab60e0f
      Vivek Goyal 提交于
      This patch modifies the x86_64 kernel so that it can be loaded and run
      at any 2M aligned address, below 512G.  The technique used is to
      compile the decompressor with -fPIC and modify it so the decompressor
      is fully relocatable.  For the main kernel the page tables are
      modified so the kernel remains at the same virtual address.  In
      addition a variable phys_base is kept that holds the physical address
      the kernel is loaded at.  __pa_symbol is modified to add that when
      we take the address of a kernel symbol.
      
      When loaded with a normal bootloader the decompressor will decompress
      the kernel to 2M and it will run there.  This both ensures the
      relocation code is always working, and makes it easier to use 2M
      pages for the kernel and the cpu.
      
      AK: changed to not make RELOCATABLE default in Kconfig
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      1ab60e0f
    • V
      [PATCH] x86: __pa and __pa_symbol address space separation · 0dbf7028
      Vivek Goyal 提交于
      Currently __pa_symbol is for use with symbols in the kernel address
      map and __pa is for use with pointers into the physical memory map.
      But the code is implemented so you can usually interchange the two.
      
      __pa which is much more common can be implemented much more cheaply
      if it is it doesn't have to worry about any other kernel address
      spaces.  This is especially true with a relocatable kernel as
      __pa_symbol needs to peform an extra variable read to resolve
      the address.
      
      There is a third macro that is added for the vsyscall data
      __pa_vsymbol for finding the physical addesses of vsyscall pages.
      
      Most of this patch is simply sorting through the references to
      __pa or __pa_symbol and using the proper one.  A little of
      it is continuing to use a physical address when we have it
      instead of recalculating it several times.
      
      swapper_pgd is now NULL.  leave_mm now uses init_mm.pgd
      and init_mm.pgd is initialized at boot (instead of compile time)
      to the physmem virtual mapping of init_level4_pgd.  The
      physical address changed.
      
      Except for the for EMPTY_ZERO page all of the remaining references
      to __pa_symbol appear to be during kernel initialization.  So this
      should reduce the cost of __pa in the common case, even on a relocated
      kernel.
      
      As this is technically a semantic change we need to be on the lookout
      for anything I missed.  But it works for me (tm).
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      0dbf7028
    • V
      [PATCH] x86-64: Assembly safe page.h and pgtable.h · 9d291e78
      Vivek Goyal 提交于
      This patch makes pgtable.h and page.h safe to include
      in assembly files like head.S.  Allowing us to use
      symbolic constants instead of hard coded numbers when
      refering to the page tables.
      
      This patch copies asm-sparc64/const.h to asm-x86_64 to
      get a definition of _AC() a very convinient macro that
      allows us to force the type when we are compiling the
      code in C and to drop all of the type information when
      we are using the constant in assembly.  Previously this
      was done with multiple definition of the same constant.
      const.h was modified slightly so that it works when given
      CONFIG options as arguments.
      
      This patch adds #ifndef __ASSEMBLY__ ... #endif
      and _AC(1,UL) where appropriate so the assembler won't
      choke on the header files.  Otherwise nothing
      should have changed.
      
      AK: added const.h to exported headers to fix headers_check
      Signed-off-by: NEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: NVivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      9d291e78
  11. 29 7月, 2006 1 次提交
    • B
      [PATCH] x86_64: Enlarge debug stack for nested kprobes · a4045dff
      bibo mao 提交于
      In x86_64 platform, INT1 and INT3 trap stack is IST stack called DEBUG_STACK,
      when INT1/INT3 trap happens, system will switch to DEBUG_STACK by hardware.
      Current DEBUG_STACK size is 4K, when int1/int3 trap happens, kernel will
      minus current DEBUG_STACK IST value by 4k. But if int3/int1 trap is nested,
      it will destroy other vector's IST stack. This patch modifies this, it sets
      DEBUG_STACK size as 8K and allows two level of nested int1/int3 trap.
      
      Kprobe DEBUG_STACK may be nested, because kprobe handler may be probed
      by other kprobes.
      
      Thanks jbeulich for pointing out error in the first patch.
      
      [AK: nested kprobes are pretty dubious. Hopefully one nest
      will be enough. This will cost 8K per CPU (4K more than before)]
      Signed-off-by: Nbibo, mao <bibo.mao@intel.com>
      Signed-off-by: NAndi Kleen <ak@suse.de>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      a4045dff
  12. 27 4月, 2006 1 次提交
  13. 26 4月, 2006 1 次提交
  14. 28 3月, 2006 1 次提交
  15. 17 1月, 2006 1 次提交
  16. 12 1月, 2006 1 次提交
  17. 15 11月, 2005 1 次提交
  18. 13 9月, 2005 1 次提交
  19. 05 9月, 2005 2 次提交
    • C
      [PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc() · 0e5c9f39
      Chen, Kenneth W 提交于
      I don't think we need to call hugetlb_clean_stale_pgtable() anymore
      in 2.6.13 because of the rework with free_pgtables().  It now collect
      all the pte page at the time of munmap.  It used to only collect page
      table pages when entire one pgd can be freed and left with staled pte
      pages.  Not anymore with 2.6.13.  This function will never be called
      and We should turn it into a BUG_ON.
      
      I also spotted two problems here, not Adam's fault :-)
      (1) in huge_pte_alloc(), it looks like a bug to me that pud is not
          checked before calling pmd_alloc()
      (2) in hugetlb_clean_stale_pgtable(), it also missed a call to
          pmd_free_tlb.  I think a tlb flush is required to flush the mapping
          for the page table itself when we clear out the pmd pointing to a
          pte page.  However, since hugetlb_clean_stale_pgtable() is never
          called, so it won't trigger the bug.
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      0e5c9f39
    • S
      [PATCH] mm: consolidate get_order · fd4fd5aa
      Stephen Rothwell 提交于
      Someone mentioned that almost all the architectures used basically the same
      implementation of get_order.  This patch consolidates them into
      asm-generic/page.h and includes that in the appropriate places.  The
      exceptions are ia64 and ppc which have their own (presumably optimised)
      versions.
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      fd4fd5aa
  20. 26 6月, 2005 1 次提交
    • E
      [PATCH] kexec: x86_64: add CONFIG_PHYSICAL_START · d0537508
      Eric W. Biederman 提交于
      For one kernel to report a crash another kernel has created we need
      to have 2 kernels loaded simultaneously in memory.  To accomplish this
      the two kernels need to built to run at different physical addresses.
      
      This patch adds the CONFIG_PHYSICAL_START option to the x86_64 kernel
      so we can do just that.  You need to know what you are doing and
      the ramifications are before changing this value, and most users
      won't care so I have made it depend on CONFIG_EMBEDDED
      
      bzImage kernels will work and run at a different address when compiled
      with this option but they will still load at 1MB.  If you need a kernel
      loaded at a different address as well you need to boot a vmlinux.
      Signed-off-by: NEric Biederman <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      d0537508
  21. 24 6月, 2005 1 次提交
  22. 22 6月, 2005 1 次提交
    • D
      [PATCH] Hugepage consolidation · 63551ae0
      David Gibson 提交于
      A lot of the code in arch/*/mm/hugetlbpage.c is quite similar.  This patch
      attempts to consolidate a lot of the code across the arch's, putting the
      combined version in mm/hugetlb.c.  There are a couple of uglyish hacks in
      order to covert all the hugepage archs, but the result is a very large
      reduction in the total amount of code.  It also means things like hugepage
      lazy allocation could be implemented in one place, instead of six.
      
      Tested, at least a little, on ppc64, i386 and x86_64.
      
      Notes:
      	- this patch changes the meaning of set_huge_pte() to be more
      	  analagous to set_pte()
      	- does SH4 need s special huge_ptep_get_and_clear()??
      Acked-by: NWilliam Lee Irwin <wli@holomorphy.com>
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      63551ae0
  23. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4