1. 31 Oct 2005 (8 commits)
  2. 30 Oct 2005 (2 commits)
    • [PATCH] mm: init_mm without ptlock · 872fec16
      Committed by Hugh Dickins
      First step in pushing down the page_table_lock.  init_mm.page_table_lock has
      been used throughout the architectures (usually for ioremap): not to serialize
      kernel address space allocation (that's usually vmlist_lock), but because
      pud_alloc,pmd_alloc,pte_alloc_kernel expect caller holds it.
      
      Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
      architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
      and drop it when allocating a new one, to check lest a racing task already
      did.  Similarly no page_table_lock in vmalloc's map_vm_area.
      
      Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
      user mms, which are converted only by a later patch, for now they have to lock
      differently according to whether or not it's init_mm.
      
      If sources get muddled, there's a danger that an arch source taking
      init_mm.page_table_lock will be mixed with common source also taking it (or
      neither take it).  So break the rules and make another change, which should
      break the build for such a mismatch: remove the redundant mm arg from
      pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).
      
      Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
      used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
      pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
      map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
      took page_table_lock for no good reason.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: mm_init set_mm_counters · 404351e6
      Committed by Hugh Dickins
      How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
      that's not so good if an mm_counter_t is a special type.  And how is rss
      initialized?  By set_mm_counter, all over the place.  Come on, we just need to
      initialize them both at once by set_mm_counter in mm_init (which follows the
      memcpy when forking).
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 28 Oct 2005 (1 commit)
  4. 11 Oct 2005 (2 commits)
  5. 10 Oct 2005 (2 commits)
    • [PATCH] i386: fix stack alignment for signal handlers · d347f372
      Committed by Markus F.X.J. Oberhumer
      This fixes the setup of the alignment of the signal frame, so that all
      signal handlers are run with a properly aligned stack frame.
      
      The current code "over-aligns" the stack pointer so that the stack frame
      is effectively always mis-aligned by 4 bytes.  But what we really want
      is that on function entry ((sp + 4) & 15) == 0, which matches what would
      happen if the stack were aligned before a "call" instruction.
      Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Set up safe page tables during resume · 3dd08325
      Committed by Rafael J. Wysocki
      The following patch makes swsusp avoid the possible temporary corruption
      of page translation tables during resume on x86-64.  This is achieved by
      creating a copy of the relevant page tables that will not be modified by
      swsusp and can be safely used by it on resume.
      
      The problem is that during resume on x86-64 swsusp may temporarily
      corrupt the page tables used for the direct mapping of RAM.  If that
      happens, a page fault occurs and cannot be handled properly, which leads
      to the solid hang of the affected system.  This leads to the loss of the
      system's state from before suspend and may result in the loss of data or
      the corruption of filesystems, so it is a serious issue.  Also, it
      appears to happen quite often (for me, as often as 50% of the time).
      
      The problem is related to the fact that (at least) one of the PMD
      entries used in the direct memory mapping (starting at PAGE_OFFSET)
      points to a page table the physical address of which is much greater
      than the physical address of the PMD entry itself.  Moreover,
      unfortunately, the physical address of the page table before suspend
      (i.e.  the one stored in the suspend image) happens to be different to
      the physical address of the corresponding page table used during resume
      (i.e.  the one that is valid right before swsusp_arch_resume() in
      arch/x86_64/kernel/suspend_asm.S is executed).  Thus while the image is
      restored, the "offending" PMD entry gets overwritten, so it does not
      point to the right physical address any more (i.e.  there's no page
      table at the address pointed to by it, because it points to the address
      the page table has been at during suspend).  Consequently, if the PMD
      entry is used later on, and it _is_ used in the process of copying the
      image pages, a page fault occurs, but it cannot be handled in the normal
      way and the system hangs.
      
      In principle we can call create_resume_mapping() from
      swsusp_arch_resume() (ie.  from suspend_asm.S), but then the memory
      allocations in create_resume_mapping(), resume_pud_mapping(), and
      resume_pmd_mapping() must be made carefully so that we use _only_
      NosaveFree pages in them (the other pages are overwritten by the loop in
      swsusp_arch_resume()).  Additionally, we are in atomic context at that
      time, so we cannot use GFP_KERNEL.  Moreover, if one of the allocations
      fails, we should free all of the allocated pages, so we need to trace
      them somehow.
      
      All of this is done in the appended patch, except that the functions
      populating the page tables are located in arch/x86_64/kernel/suspend.c
      rather than in init.c.  It may be done in a more elegant way in the
      future, with the help of some swsusp patches that are in the works now.
      
      [AK: move some externs into headers, renamed a function]
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  6. 05 Oct 2005 (1 commit)
    • [PATCH] x86_64: Drop global bit from early low mappings · 944d2647
      Committed by Andi Kleen
      Drop global bit from early low mappings
      
      Suggested by Linus, originally also proposed by Suresh.
      
      This fixes a race condition with early start of udev, originally
      tracked down by Suresh B. Siddha. The problem was that switching
      to the user space VM would not clear the global low mappings
      for the beginning of memory, which led to memory corruption.
      
      Drop the global bits.
      
      The kernel mapping stays global because it should stay constant.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 04 Oct 2005 (1 commit)
  8. 01 Oct 2005 (3 commits)
    • [PATCH] x86_64 early numa init fix · 85cc5135
      Committed by Ravikiran G Thirumalai
      The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is
      not set up early enough by numa_init_array due to the x86_64 changes in
      2.6.14-rc*, and is then set wrongly by the workaround code in
      numa_init_array().  cpu_to_node[0] gets set to 1 early and is only set
      properly to 0 during identify_cpu() once all cpus are brought up,
      confusing the numa slab in the process.
      
      Here is a quick fix for this.  The right fix obviously is to have
      cpu_to_node[bsp] set up early for numa_init_array().  The following patch
      will fix the problem now, and the code can stay on even when
      cpu_to_node[BP] gets fixed early correctly.
      
      Thanks to Petr for access to his box.
      
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: fix the BP node_to_cpumask · e6a045a5
      Committed by Ravikiran G Thirumalai
      Fix the BP node_to_cpumask.  2.6.14-rc* broke the boot cpu bit, as
      cpu_to_node(0) is now not set up early enough for numa_init_array.
      cpu_to_node[] is set up much later, at srat_detect_node, on acpi srat based
      em64t machines.  This seems like a problem on AMD machines too, though it
      has only been tested on em64t.  /sys/devices/system/node/node0/cpumap
      shows up sanely after this patch.
      
      Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] utilization of kprobe_mutex is incorrect on x86_64 · 2dd960d6
      Committed by Zhang, Yanmin
      The up()/down() orders are incorrect in the arch/x86_64/kprobes.c file.
      kprobe_mutex is used to protect the free kprobe instruction slot list.
      arch_prepare_kprobe takes a slot from the free list, and
      arch_remove_kprobe returns a slot to the free list.  The incorrect up()/down()
      orders on kprobe_mutex fail to protect the free list.  If 2 threads
      try to get/return a kprobe instruction slot at the same time, the free slot list
      might be broken, or a free slot might be handed to 2 threads.
      Signed-off-by: Zhang Yanmin <Yanmin.zhang@intel.com>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 30 Sep 2005 (2 commits)
  10. 28 Sep 2005 (1 commit)
  11. 22 Sep 2005 (1 commit)
  12. 18 Sep 2005 (2 commits)
  13. 15 Sep 2005 (3 commits)
    • [LIB]: Consolidate _atomic_dec_and_lock() · 4db2ce01
      Committed by David S. Miller
      Several implementations were essentially a common piece of C code using
      the cmpxchg() macro.  Put the implementation in one spot that everyone
      can share, and convert sparc64 over to using this.
      
      Alpha is the lone arch-specific implementation, which codes up a
      special fast path for the common case in order to avoid GP reloading
      which a pure C version would require.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Partially revert "Fix time going twice as fast problem on ATI Xpress chipsets" · 1619cca2
      Committed by Linus Torvalds
      Commit 66759a01 introduced the fix for
      time ticking too fast on some boards by disabling one of the doubly
      connected timer pins on ATI boards.
      
      However, it ends up being _much_ too broad a brush, and that just makes
      some other ATI boards not work at all since they now have no timer
      source.
      
      So disable the automatic ATI southbridge detection, and just rely on
      people who see this problem disabling it by hand with the option
      "disable_timer_pin_1" on the kernel command line.
      
      Maybe somebody can figure out the proper tests at a later date.
      Acked-by: Peter Osterlund <petero2@telia.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] error path in setup_arg_pages() misses vm_unacct_memory() · 2fd4ef85
      Committed by Hugh Dickins
      Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
      security_vm_enough_memory tend to forget to vm_unacct_memory when a
      failure occurs further down (typically in setup_arg_pages variants).
      
      These are all users of insert_vm_struct, and that reservation will only
      be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
      cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.
      
      So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
      Committed_AS each time they're run.  But don't add VM_ACCOUNT to them,
      it's inappropriate to reserve against the very unlikely case that gdb
      be used to COW a vDSO page - we ought to do something about that in
      do_wp_page, but there are yet other inconsistencies to be resolved.
      
      The safe and economical way to fix this is to let insert_vm_struct do
      the security_vm_enough_memory check when it finds VM_ACCOUNT is set.
      
      And the MIPS irix_brk has been calling security_vm_enough_memory before
      calling do_brk which repeats it, doubly accounting and so also leaking.
      Remove that, and all the fs and arch calls to security_vm_enough_memory:
      give it a less misleading name later on.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Kirill Korotaev <dev@sw.ru>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 14 Sep 2005 (1 commit)
  15. 13 Sep 2005 (10 commits)