1. 25 2月, 2010 1 次提交
    • I
      x86, mm: Allow highmem user page tables to be disabled at boot time · 14315592
      Ian Campbell 提交于
      Distros generally (I looked at Debian, RHEL5 and SLES11) seem to
      enable CONFIG_HIGHPTE for any x86 configuration which has highmem
      enabled. This means that the overhead applies even to machines which
      have a fairly modest amount of high memory and which therefore do not
      really benefit from allocating PTEs in high memory but still pay the
      price of the additional mapping operations.
      
      Running kernbench on a 4G box I found that with CONFIG_HIGHPTE=y but
      no actual highptes being allocated there was a reduction in system
      time used from 59.737s to 55.9s.
      
      With CONFIG_HIGHPTE=y and highmem PTEs being allocated:
        Average Optimal load -j 4 Run (std deviation):
        Elapsed Time 175.396 (0.238914)
        User Time 515.983 (5.85019)
        System Time 59.737 (1.26727)
        Percent CPU 263.8 (71.6796)
        Context Switches 39989.7 (4672.64)
        Sleeps 42617.7 (246.307)
      
      With CONFIG_HIGHPTE=y but with no highmem PTEs being allocated:
        Average Optimal load -j 4 Run (std deviation):
        Elapsed Time 174.278 (0.831968)
        User Time 515.659 (6.07012)
        System Time 55.9 (1.07799)
        Percent CPU 263.8 (71.266)
        Context Switches 39929.6 (4485.13)
        Sleeps 42583.7 (373.039)
      
      This patch allows the user to control the allocation of PTEs in
      highmem from the command line ("userpte=nohigh") but retains the
      status-quo as the default.
      
      It is possible that some simple heuristic could be developed which
      allows auto-tuning of this option however I don't have a sufficiently
      large machine available to me to perform any particularly meaningful
      experiments. We could probably handwave up an argument for a threshold
      at 16G of total RAM.
      
      Assuming 768M of lowmem we have 196608 potential lowmem PTE
      pages. Each page can map 2M of RAM in a PAE-enabled configuration,
      meaning a maximum of 384G of RAM could potentially be mapped using
      lowmem PTEs.
      
      Even allowing generous factor of 10 to account for other required
      lowmem allocations, generous slop to account for page sharing (which
      reduces the total amount of RAM mappable by a given number of PT
      pages) and other innacuracies in the estimations it would seem that
      even a 32G machine would not have a particularly pressing need for
      highmem PTEs. I think 32G could be considered to be at the upper bound
      of what might be sensible on a 32 bit machine (although I think in
      practice 64G is still supported).
      
      It's seems questionable if HIGHPTE is even a win for any amount of RAM
      you would sensibly run a 32 bit kernel on rather than going 64 bit.
      Signed-off-by: NIan Campbell <ian.campbell@citrix.com>
      LKML-Reference: <1266403090-20162-1-git-send-email-ian.campbell@citrix.com>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      14315592
  2. 28 7月, 2009 1 次提交
    • B
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb() · 9e1b32ca
      Benjamin Herrenschmidt 提交于
      mm: Pass virtual address to [__]p{te,ud,md}_free_tlb()
      
      Upcoming paches to support the new 64-bit "BookE" powerpc architecture
      will need to have the virtual address corresponding to PTE page when
      freeing it, due to the way the HW table walker works.
      
      Basically, the TLB can be loaded with "large" pages that cover the whole
      virtual space (well, sort-of, half of it actually) represented by a PTE
      page, and which contain an "indirect" bit indicating that this TLB entry
      RPN points to an array of PTEs from which the TLB can then create direct
      entries. Thus, in order to invalidate those when PTE pages are deleted,
      we need the virtual address to pass to tlbilx or tlbivax instructions.
      
      The old trick of sticking it somewhere in the PTE page struct page sucks
      too much, the address is almost readily available in all call sites and
      almost everybody implemets these as macros, so we may as well add the
      argument everywhere. I added it to the pmd and pud variants for consistency.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: David Howells <dhowells@redhat.com> [MN10300 & FRV]
      Acked-by: NNick Piggin <npiggin@suse.de>
      Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390]
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      9e1b32ca
  3. 24 1月, 2009 1 次提交
    • P
      x86, mm: fix pte_free() · 42ef73fe
      Peter Zijlstra 提交于
      On -rt we were seeing spurious bad page states like:
      
      Bad page state in process 'firefox'
      page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
      Trying to fix it up, but a reboot is needed
      Backtrace:
      Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
      [<c043d0f3>] ? printk+0x14/0x19
      [<c0272d4e>] bad_page+0x4e/0x79
      [<c0273831>] free_hot_cold_page+0x5b/0x1d3
      [<c02739f6>] free_hot_page+0xf/0x11
      [<c0273a18>] __free_pages+0x20/0x2b
      [<c027d170>] __pte_alloc+0x87/0x91
      [<c027d25e>] handle_mm_fault+0xe4/0x733
      [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
      [<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
      [<c0218875>] do_page_fault+0x36f/0x88a
      
      This is the case where a concurrent fault already installed the PTE and
      we get to free the newly allocated one.
      
      This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
      which is overlaid with the {private, mapping} struct.
      
      union {
          struct {
              unsigned long private;
              struct address_space *mapping;
          };
          spinlock_t ptl;
          struct kmem_cache *slab;
          struct page *first_page;
      };
      
      Normally the spinlock is small enough to not stomp on page->mapping, but
      PREEMPT_RT=y has huge 'spin'locks.
      
      But lockdep kernels should also be able to trigger this splat, as the
      lock tracking code grows the spinlock to cover page->mapping.
      
      The obvious fix is calling pgtable_page_dtor() like the regular pte free
      path __pte_free_tlb() does.
      
      It seems all architectures except x86 and nm10300 already do this, and
      nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
      doesn't do SMP or simply doesnt do MMU at all or something.
      Signed-off-by: NPeter Zijlstra <a.p.zijlsta@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      Cc: <stable@kernel.org>
      42ef73fe
  4. 23 10月, 2008 2 次提交
  5. 23 7月, 2008 1 次提交
    • V
      x86: consolidate header guards · 77ef50a5
      Vegard Nossum 提交于
      This patch is the result of an automatic script that consolidates the
      format of all the headers in include/asm-x86/.
      
      The format:
      
      1. No leading underscore. Names with leading underscores are reserved.
      2. Pathname components are separated by two underscores. So we can
         distinguish between mm_types.h and mm/types.h.
      3. Everything except letters and numbers are turned into single
         underscores.
      Signed-off-by: NVegard Nossum <vegard.nossum@gmail.com>
      77ef50a5
  6. 08 7月, 2008 1 次提交
    • J
      x86/paravirt: add a pgd_alloc/free hooks · eba0045f
      Jeremy Fitzhardinge 提交于
      Add hooks which are called at pgd_alloc/free time.  The pgd_alloc hook
      may return an error code, which if non-zero, causes the pgd allocation
      to be failed.  The hooks may be used to allocate/free auxillary
      per-pgd information.
      
      also fix:
      
      > * Ingo Molnar <mingo@elte.hu> wrote:
      >
      >  include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
      >  include/asm/pgalloc.h:14: error: parameter name omitted
      >  arch/x86/kernel/entry_64.S: In file included from
      >  arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
      >  include/asm/pgalloc.h:14: error: parameter name omitted
      Signed-off-by: NJeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Cc: xen-devel <xen-devel@lists.xensource.com>
      Cc: Stephen Tweedie <sct@redhat.com>
      Cc: Eduardo Habkost <ehabkost@redhat.com>
      Cc: Mark McLoughlin <markmc@redhat.com>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      eba0045f
  7. 25 4月, 2008 8 次提交
  8. 11 10月, 2007 1 次提交