1. 28 Jun 2006, 1 commit
  2. 27 Jun 2006, 1 commit
  3. 23 Jun 2006, 4 commits
  4. 22 May 2006, 1 commit
  5. 10 Apr 2006, 1 commit
  6. 28 Mar 2006, 3 commits
  7. 23 Mar 2006, 4 commits
    • [PATCH] pause_on_oops command line option · dd287796
      Committed by Andrew Morton
      Attempt to fix the problem wherein people's oops reports scroll off the screen
      due to repeated oopsing or to oopses on other CPUs.
      
      If this happens the user can reboot with the `pause_on_oops=<seconds>' option.
      It will allow the first oopsing CPU to print an oops record just a
      single time.  Subsequent oops attempts, or oopses on other CPUs, will
      cause those CPUs to enter a tight loop until the specified number of
      seconds has elapsed.
      
      The patch implements the infrastructure generically in the expectation that
      architectures other than x86 will find it useful.
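
      A minimal sketch of the idea (helper names here are illustrative, not
      the literal patch):

      	static int pause_on_oops;	/* seconds, from pause_on_oops= */

      	static int __init pause_on_oops_setup(char *str)
      	{
      		pause_on_oops = simple_strtoul(str, NULL, 0);
      		return 1;
      	}
      	__setup("pause_on_oops=", pause_on_oops_setup);

      	void oops_enter(void)
      	{
      		static int oops_printed;	/* the real patch guards this with a spinlock */

      		if (!oops_printed) {
      			oops_printed = 1;	/* first oops: let it print */
      			return;
      		}
      		/* later oopses spin so the first report stays on screen */
      		mdelay((unsigned long)pause_on_oops * 1000);
      	}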
      
      Cc: Dave Jones <davej@codemonkey.org.uk>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] make bug messages more consistent · 91368d73
      Committed by Ingo Molnar
      Consolidate all kernel bug printouts to begin with the "BUG: " string.
      Makes it easier to find them in large bootup logs.
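
      For example (illustrative strings, not the full patch):

      -	printk("Unable to handle kernel NULL pointer dereference");
      +	printk("BUG: unable to handle kernel NULL pointer dereference");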
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: actively synchronize vmalloc area when registering certain callbacks · 101f12af
      Committed by Jan Beulich
      Registering a callback handler through register_die_notifier() is
      obviously primarily intended for use by modules.  However, the way
      these currently get called, it is basically impossible for them to
      actually be used by modules, as there is, on non-PAE configurations, a
      good chance (the larger the module, the greater the chance) for the
      system to crash as a result.
      
      This is because the callback gets invoked
      
      (a) in the page fault path before the top-level page table propagation
          gets carried out (hence a fault to propagate the top-level page
          table entry/entries mapping the module's code/data would nest
          infinitely) and
      
      (b) in the NMI path, where nested faults must absolutely not happen,
          since otherwise the IRET from the nested fault re-enables NMIs,
          potentially resulting in nested NMI occurrences.
      
      Besides the modular aspect, similar problems would even arise for in-
      kernel consumers of the API if they touched ioremap()ed or vmalloc()ed
      memory inside their handlers.
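
      The core of the added synchronization looks roughly like this sketch
      (simplified from the patch's vmalloc_sync_all(); the pgd_list
      traversal details differ in the real code): copy any kernel pgd
      entries that exist in init_mm but are still missing from other page
      directories, so a notifier on the NMI path never takes a vmalloc
      fault.

      	void vmalloc_sync_all(void)
      	{
      		unsigned long address;

      		for (address = VMALLOC_START; address < VMALLOC_END;
      		     address += PGDIR_SIZE) {
      			struct page *page;

      			spin_lock(&pgd_lock);
      			list_for_each_entry(page, &pgd_list, lru) {
      				pgd_t *pgd = (pgd_t *)page_address(page) +
      					     pgd_index(address);

      				if (pgd_none(*pgd))
      					set_pgd(pgd, *pgd_offset_k(address));
      			}
      			spin_unlock(&pgd_lock);
      		}
      	}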
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: SMP alternatives · 9a0b5817
      Committed by Gerd Hoffmann
      Implement SMP alternatives, i.e.  switching at runtime between different
      code versions for UP and SMP.  The code can patch both SMP->UP and UP->SMP.
      The UP->SMP case is useful for CPU hotplug.
      
      With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and
      when the number of CPUs goes down to 1, and switches to SMP when the number
      of CPUs goes up to 2.
      
      Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is
      patched once at boot time (if needed) and the tables are released
      afterwards.
      
      The changes in detail:
      
        * The current alternatives bits are moved to a separate file,
          the SMP alternatives code is added there.
      
        * The patch adds some new elf sections to the kernel:
          .smp_altinstructions
      	like .altinstructions, also contains a list
      	of alt_instr structs.
          .smp_altinstr_replacement
      	like .altinstr_replacement, but also has some space to
      	save the original instruction before replacing it.
          .smp_locks
      	list of pointers to lock prefixes which can be nop'ed
      	out on UP.
          The first two are used to replace more complex instruction
          sequences such as spinlocks and semaphores.  It would be possible
          to deal with the lock prefixes with that as well, but by handling
          them as special case the table sizes become much smaller.
      
        * The sections are page-aligned and padded up to page size, so they
          can be freed if they are not needed.

        * Split the code that releases init pages into a separate function
          and use it to release the elf sections if they are unused.
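
      As a sketch, the UP pass over .smp_locks reduces to NOPping out LOCK
      prefixes (simplified from the patch's alternatives_smp_unlock(); the
      real function also takes the bounds of the text section):

      	static void alternatives_smp_unlock(u8 **start, u8 **end)
      	{
      		u8 **ptr;

      		for (ptr = start; ptr < end; ptr++) {
      			if (**ptr == 0xf0)	/* x86 LOCK prefix */
      				**ptr = 0x90;	/* one-byte NOP */
      		}
      	}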
      Signed-off-by: Gerd Hoffmann <kraxel@suse.de>
      Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  8. 22 Mar 2006, 3 commits
    • [PATCH] hugepage: is_aligned_hugepage_range() cleanup · 42b88bef
      Committed by David Gibson
      Quite a long time back, prepare_hugepage_range() replaced
      is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to
      verify if an address range is suitable for a hugepage mapping.
      is_aligned_hugepage_range() stuck around, but only to implement
      prepare_hugepage_range() on archs which didn't implement their own.
      
      Most archs (everything except ia64 and powerpc) used the same
      implementation of is_aligned_hugepage_range().  On powerpc, which
      implements its own prepare_hugepage_range(), the custom version was never
      used.
      
      In addition, "is_aligned_hugepage_range()" was a bad name, because it
      suggests it returns true iff the given range is a good hugepage range,
      whereas in fact it returns 0-or-error (so the sense is reversed).
      
      This patch cleans up by abolishing is_aligned_hugepage_range().  Instead
      prepare_hugepage_range() is defined directly.  Most archs use the default
      version, which simply checks the given region is aligned to the size of a
      hugepage.  ia64 and powerpc define custom versions.  The ia64 one simply
      checks that the range is in the correct address space region in addition to
      being suitably aligned.  The powerpc version (just as previously) checks
      for suitable addresses, and if necessary performs low-level MMU frobbing to
      set up new areas for use by hugepages.
      
      No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR).
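
      The default version is essentially just the alignment check, keeping
      the 0-or-error convention (a sketch, modulo arch details):

      	int prepare_hugepage_range(unsigned long addr, unsigned long len)
      	{
      		if (len & ~HPAGE_MASK)
      			return -EINVAL;
      		if (addr & ~HPAGE_MASK)
      			return -EINVAL;
      		return 0;
      	}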
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove set_page_count() outside mm/ · 7835e98b
      Committed by Nick Piggin
      set_page_count usage outside mm/ is limited to setting the refcount to 1.
      Remove set_page_count from outside mm/, and replace those users with
      init_page_count() and set_page_refcounted().
      
      This allows more debug checking, and tighter control on how code is allowed
      to play around with page->_count.
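
      In sketch form, both replacements set the count to 1 but encode
      different expectations that mm/ can then verify (the exact assertions
      in the patch are simplified here):

      	static inline void init_page_count(struct page *page)
      	{
      		atomic_set(&page->_count, 1);	/* brand-new page */
      	}

      	static inline void set_page_refcounted(struct page *page)
      	{
      		BUG_ON(atomic_read(&page->_count));	/* must be at zero */
      		atomic_set(&page->_count, 1);
      	}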
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: pageattr remove __put_page · 84d1c054
      Committed by Nick Piggin
      Stop using __put_page and page_count in i386 pageattr.c.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 17 Jan 2006, 1 commit
  10. 12 Jan 2006, 1 commit
  11. 10 Jan 2006, 1 commit
  12. 07 Jan 2006, 2 commits
  13. 16 Dec 2005, 1 commit
  14. 13 Dec 2005, 1 commit
  15. 14 Nov 2005, 1 commit
  16. 31 Oct 2005, 1 commit
  17. 30 Oct 2005, 4 commits
    • [PATCH] memory hotplug: i386 addition functions · 05039b92
      Committed by Dave Hansen
      Adds the necessary functions for non-NUMA hot-add of highmem to an
      existing zone on i386.
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] memory hotplug locking: node_size_lock · 208d54e5
      Committed by Dave Hansen
      pgdat->node_size_lock is basically only needed in one place in the
      normal code: show_mem(), which is the arch-specific sysrq-m printing
      function.

      Strictly speaking, the architectures not doing memory hotplug do not
      need this locking in show_mem().  However, they are all included for
      completeness.  This should also make any future consolidation of all
      of the implementations a little more straightforward.

      This lock is also held in the sparsemem code during a memory removal,
      as sections are invalidated.  This is the place where pfn_valid() is
      made false for a memory area that's being removed.  The lock is only
      required for pfn_valid() operations on memory for which the caller
      does not already hold a reference on the page, such as in show_mem().
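
      A sketch of the accessor pair this introduces; configurations without
      memory hotplug compile the lock away entirely:

      	#ifdef CONFIG_MEMORY_HOTPLUG
      	static inline
      	void pgdat_resize_lock(struct pglist_data *pgdat, unsigned long *flags)
      	{
      		spin_lock_irqsave(&pgdat->node_size_lock, *flags);
      	}
      	static inline
      	void pgdat_resize_unlock(struct pglist_data *pgdat, unsigned long *flags)
      	{
      		spin_unlock_irqrestore(&pgdat->node_size_lock, *flags);
      	}
      	#else
      	static inline
      	void pgdat_resize_lock(struct pglist_data *p, unsigned long *f) {}
      	static inline
      	void pgdat_resize_unlock(struct pglist_data *p, unsigned long *f) {}
      	#endif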
      Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: split page table lock · 4c21e2f2
      Committed by Hugh Dickins
      Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
      a many-threaded application which concurrently initializes different parts of
      a large anonymous area.
      
      This patch corrects that, by using a separate spinlock per page table page, to
      guard the page table entries in that page, instead of using the mm's single
      page_table_lock.  (But even then, page_table_lock is still used to guard page
      table allocation, and anon_vma allocation.)
      
      In this implementation, the spinlock is tucked inside the struct page of the
      page table page: with a BUILD_BUG_ON in case it overflows - which it would in
      the case of 32-bit PA-RISC with spinlock debugging enabled.
      
      Splitting the lock is not quite for free: another cacheline access.  Ideally,
      I suppose we would use split ptlock only for multi-threaded processes on
      multi-cpu machines; but deciding that dynamically would have its own costs.
      So for now enable it by config, at some number of cpus: since the
      Kconfig language doesn't support inequalities, let the preprocessor
      compare that with NR_CPUS.  But I don't think it's worth being
      user-configurable: for good testing of both split and unsplit configs,
      split now at 4 cpus, and perhaps change that to 8 later.
      
      There is a benefit even for singly threaded processes: kswapd can be attacking
      one part of the mm while another part is busy faulting.
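
      The lock selection reduces to something like this sketch (simplified
      from the patch; the spinlock is a ptl field tucked into the pte
      page's struct page):

      	#if NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS
      	/* lock lives in the struct page of the pte page */
      	#define pte_lockptr(mm, pmd)	(&pmd_page(*(pmd))->ptl)
      	#define pte_lock_init(page)	spin_lock_init(&(page)->ptl)
      	#else
      	/* small machines keep using the mm-wide lock */
      	#define pte_lockptr(mm, pmd)	(&(mm)->page_table_lock)
      	#define pte_lock_init(page)	do {} while (0)
      	#endif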
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: init_mm without ptlock · 872fec16
      Committed by Hugh Dickins
      First step in pushing down the page_table_lock.  init_mm.page_table_lock has
      been used throughout the architectures (usually for ioremap): not to serialize
      kernel address space allocation (that's usually vmlist_lock), but because
      pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.
      
      Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
      architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
      and drop it when allocating a new one, to check lest a racing task already
      did.  Similarly no page_table_lock in vmalloc's map_vm_area.
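
      In sketch form, pte_alloc_kernel now takes the lock itself (close to
      the patch, minus arch details):

      	pte_t *pte_alloc_kernel(pmd_t *pmd, unsigned long address)
      	{
      		pte_t *new = pte_alloc_one_kernel(&init_mm, address);
      		if (!new)
      			return NULL;

      		spin_lock(&init_mm.page_table_lock);
      		if (pmd_present(*pmd))		/* a racing task populated it */
      			pte_free_kernel(new);
      		else
      			pmd_populate_kernel(&init_mm, pmd, new);
      		spin_unlock(&init_mm.page_table_lock);
      		return pte_offset_kernel(pmd, address);
      	}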
      
      Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
      user mms, which are converted only by a later patch, for now they have to lock
      differently according to whether or not it's init_mm.
      
      If sources get muddled, there's a danger that an arch source taking
      init_mm.page_table_lock will be mixed with common source also taking it (or
      neither take it).  So break the rules and make another change, which should
      break the build for such a mismatch: remove the redundant mm arg from
      pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).
      
      Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
      used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
      pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
      map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
      took page_table_lock for no good reason.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  18. 08 Sep 2005, 2 commits
  19. 05 Sep 2005, 7 commits
    • [PATCH] i386: encapsulate copying of pgd entries · d7271b14
      Committed by Zachary Amsden
      Add a clone operation for pgd updates.
      
      This helps complete the encapsulation of updates to page tables (or pages
      about to become page tables) into accessor functions rather than using
      memcpy() to duplicate them.  This is both generally good for consistency
      and also necessary for running in a hypervisor which requires explicit
      updates to page table entries.
      
      The new function is:
      
      clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
      
         dst - pointer to a pgd range anywhere on a pgd page
         src - likewise, the source pgd range
         count - the number of pgds to copy.
      
         dst and src can be on the same page, but the range must not overlap
         and must not cross a page boundary.
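
      Given those constraints, the body is a one-line memcpy wrapper
      (sketch):

      	static inline void clone_pgd_range(pgd_t *dst, pgd_t *src, int count)
      	{
      		memcpy(dst, src, count * sizeof(pgd_t));
      	}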
      
      Note that I omitted using this call to copy pgd entries into the
      software suspend page root, since this is not technically a live paging
      structure, rather it is used on resume from suspend.  CC'ing Pavel in case
      he has any feedback on this.
      
      Thanks to Chris Wright for noticing that this could be more optimal in
      PAE compiles by eliminating the memset.
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: use set_pte macros in a couple places where they were missing · c9b02a24
      Committed by Zachary Amsden
      Also, setting PDPEs in PAE mode does not require atomic operations, since the
      PDPEs are cached by the processor, and only reloaded on an explicit or
      implicit reload of CR3.
      
      Since the four PDPEs must always be present in an active root, and the kernel
      PDPE is never updated, we are safe even from SMIs and interrupts / NMIs using
      task gates (which reload CR3).  Actually, much of this is moot, since the user
      PDPEs are never updated either, and the only usage of task gates is by the
      doublefault handler.  It appears the only place PGDs get updated in PAE mode
      is in init_low_mappings() / zap_low_mapping() for initial page table creation
      and recovery from ACPI sleep state, and these sites are safe by inspection.
      Getting rid of the cmpxchg8b saves code space and 720 cycles in pgd_alloc on
      P4.
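
      The upshot, in sketch form, is that the PDPT set operation can be a
      plain (non-atomic) store rather than a cmpxchg8b:

      	#define set_pud(pudptr, pudval)		(*(pudptr) = (pudval))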
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: inline asm cleanup · 4bb0d3ec
      Committed by Zachary Amsden
      i386 Inline asm cleanup.  Use cr/dr accessor functions.
      
      Also a potential bugfix: some CR accessors really should be volatile.
      Reads from CR0 (numeric state may change in an exception handler),
      writes to CR4 (flipping CR4.TSD) and reads from CR2 (page fault)
      prevent instruction re-ordering.  I did not add a memory clobber to
      the CR3 / CR4 / CR0 updates, as it was not there to begin with, and in
      no case should kernel memory be clobbered, except when doing a TLB
      flush, which already has a memory clobber.
      
      I noticed that page invalidation does not have a memory clobber.  I can't find
      a bug as a result, but there is definitely a potential for a bug here:
      
      #define __flush_tlb_single(addr) \
      	__asm__ __volatile__("invlpg %0": :"m" (*(char *) addr))
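
      A hedged illustration of that concern (not a change this patch makes):
      adding a "memory" clobber would keep the compiler from reordering
      prior stores past the invalidation:

      #define __flush_tlb_single(addr) \
      	__asm__ __volatile__("invlpg %0" : : "m" (*(char *)(addr)) : "memory")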
      Signed-off-by: Zachary Amsden <zach@vmware.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: fix EFI memory map parsing · 7ae65fd3
      Committed by Matt Tolentino
      The memory descriptors that comprise the EFI memory map are not fixed
      in stone; their size could change in the future.  This uses the memory
      descriptor size obtained from EFI to iterate over the memory map
      entries during boot.  This enables the removal of an x86-specific pad
      (and ifdef) in the EFI header.  I also couldn't stomach the broken-up
      nature of the function that puts EFI runtime calls into virtual mode
      any longer, so I fixed that up a bit as well.
      
      For reference, this patch only impacts x86.
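
      The resulting walk strides by the descriptor size EFI reports rather
      than by sizeof() (sketch; field names as in the i386 EFI support):

      	void *p;

      	for (p = memmap.map; p < memmap.map_end; p += memmap.desc_size) {
      		efi_memory_desc_t *md = p;
      		/* ... inspect md->type, md->phys_addr, md->num_pages ... */
      	}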
      Signed-off-by: Matt Tolentino <matthew.e.tolentino@intel.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: compress the stack layout of do_page_fault() · 869f96a0
      Committed by Ingo Molnar
      This patch pushes the creation of a rare signal frame (SIGBUS or
      SIGSEGV) into a separate function, thus saving stack space in the main
      do_page_fault() stackframe.  The effect is 132 bytes less of stack
      used by the typical do_page_fault() invocation - resulting in a denser
      cache layout.
      
      (Another minor effect is that in case of kernel crashes that come from a
      pagefault, we add less space to the already existing frame, giving the
      crash functions a slightly higher chance to do their stuff without
      overflowing the stack.)
      
      (The changes also result in slightly cleaner code.)
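
      Sketched, the rare path moves into a separate non-inlined helper so
      its siginfo_t no longer sits in every do_page_fault() frame:

      	static noinline void force_sig_info_fault(int si_signo, int si_code,
      			unsigned long address, struct task_struct *tsk)
      	{
      		siginfo_t info;

      		info.si_signo = si_signo;
      		info.si_errno = 0;
      		info.si_code = si_code;
      		info.si_addr = (void __user *)address;
      		force_sig_info(si_signo, &info, tsk);
      	}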
      
      argument bugfix from "Guillaume C." <guichaz@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] remove hugetlb_clean_stale_pgtable() and fix huge_pte_alloc() · 0e5c9f39
      Committed by Chen, Kenneth W
      I don't think we need to call hugetlb_clean_stale_pgtable() anymore
      in 2.6.13 because of the rework with free_pgtables().  It now collects
      all the pte pages at the time of munmap.  It used to collect page
      table pages only when an entire pgd could be freed, leaving stale pte
      pages behind.  Not anymore with 2.6.13.  This function will never be
      called, and we should turn it into a BUG_ON.
      
      I also spotted two problems here, not Adam's fault :-)
      (1) in huge_pte_alloc(), it looks like a bug to me that pud is not
          checked before calling pmd_alloc()
      (2) in hugetlb_clean_stale_pgtable(), it also missed a call to
          pmd_free_tlb.  I think a tlb flush is required to flush the
          mapping for the page table itself when we clear out the pmd
          pointing to a pte page.  However, since
          hugetlb_clean_stale_pgtable() is never called, it won't trigger
          the bug.
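
      Fix (1) in sketch form: bail out before pmd_alloc() if the pud
      allocation failed, instead of walking on with a NULL pointer:

      	pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr)
      	{
      		pgd_t *pgd = pgd_offset(mm, addr);
      		pud_t *pud = pud_alloc(mm, pgd, addr);

      		if (!pud)
      			return NULL;
      		return (pte_t *)pmd_alloc(mm, pud, addr);
      	}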
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Cc: Adam Litke <agl@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] hugetlb: check p?d_present in huge_pte_offset() · 02b0ccef
      Committed by Adam Litke
      For demand faulting, we cannot assume that the page tables will be
      populated.  Do what the rest of the architectures do and test p?d_present()
      while walking down the page table.
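
      The guarded walk, roughly as the patch does it on i386:

      	pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
      	{
      		pgd_t *pgd = pgd_offset(mm, addr);
      		pud_t *pud;
      		pmd_t *pmd = NULL;

      		if (pgd_present(*pgd)) {
      			pud = pud_offset(pgd, addr);
      			if (pud_present(*pud))
      				pmd = pmd_offset(pud, addr);
      		}
      		return (pte_t *)pmd;
      	}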
      Signed-off-by: Adam Litke <agl@us.ibm.com>
      Cc: <linux-mm@kvack.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>