1. 10 Aug 2010, 6 commits
  2. 05 Aug 2010, 1 commit
    • mm,kdb,kgdb: Add a debug reference for the kdb kmap usage · eac79005
      Jason Wessel authored
      The kdb kmap should never get used outside of the kernel debugger
      exception context.
      
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: linux-mm@kvack.org
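      The patch itself only documents this constraint with a comment; as a
      hedged sketch, the rule it states could be asserted with the
      in_dbg_master() helper from <linux/kgdb.h> (the assertion function
      here is hypothetical, not part of the patch):

          #include <linux/kgdb.h>
          #include <linux/bug.h>

          static inline void kdb_kmap_assert(void)
          {
                  /* The kdb kmap is only legal inside the debugger
                   * exception context; anywhere else is a bug. */
                  WARN_ON_ONCE(!in_dbg_master());
          }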
  3. 03 Aug 2010, 2 commits
  4. 01 Aug 2010, 2 commits
  5. 31 Jul 2010, 1 commit
    • mm: fix ia64 crash when gcore reads gate area · de51257a
      Hugh Dickins authored
      Debian's ia64 autobuilders have been seeing kernel freeze or reboot
      when running the gdb testsuite (Debian bug 588574): dannf bisected to
      2.6.32 62eede62 "mm: ZERO_PAGE without
      PTE_SPECIAL"; and reproduced it with gdb's gcore on a simple target.
      
      I'd missed updating the gate_vma handling in __get_user_pages(): that
      happens to use vm_normal_page() (nowadays failing on the zero page),
      yet reported success even when it failed to get a page - boom when
      access_process_vm() tried to copy that to its intermediate buffer.
      
      Fix this, resisting cleanups: in particular, leave it for now reporting
      success when not asked to get any pages - very probably safe to change,
      but let's not risk it without testing exposure.
      
      Why did ia64 crash with 16kB pages, but succeed with 64kB pages?
      Because setup_gate() pads each 64kB of its gate area with zero pages.
      Reported-by: Andreas Barth <aba@not.so.argh.org>
      Bisected-by: dann frazier <dannf@debian.org>
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Tested-by: dann frazier <dannf@dannf.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
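      A hedged sketch of the fixed gate-area path in __get_user_pages()
      (simplified; the real patch also special-cases the zero pfn when
      dumping rather than always failing):

          if (pages) {
                  struct page *page = vm_normal_page(gate_vma, start, *pte);

                  if (!page) {
                          /* Previously this path still reported success. */
                          pte_unmap(pte);
                          return i ? i : -EFAULT;
                  }
                  pages[i] = page;
                  get_page(page);
          }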
  6. 29 Jul 2010, 1 commit
    • slub numa: Fix rare allocation from unexpected node · bc6488e9
      Christoph Lameter authored
      The network developers have seen sporadic allocations resulting in objects
      coming from unexpected NUMA nodes despite asking for objects from a
      specific node.
      
      This is due to get_partial() calling get_any_partial() when partial
      slabs are exhausted on a node, even if a node was specified and one
      would therefore expect allocations only from the specified node.
      
      get_any_partial() sporadically may return a slab from a foreign
      node to gradually reduce the size of partial lists on remote nodes
      and thereby reduce total memory use for a slab cache.
      
      The behavior is controlled by the remote_defrag_ratio of each cache.
      
      Strictly speaking this is permitted behavior, since __GFP_THISNODE was
      not specified for the allocation, but it is certainly surprising.
      
      This patch makes sure that the remote defrag behavior only occurs
      if no node was specified.
      Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
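      A hedged sketch of the corrected control flow in get_partial()
      (simplified from mm/slub.c):

          static struct page *get_partial(struct kmem_cache *s, gfp_t flags,
                                          int node)
          {
                  struct page *page;
                  int searchnode = (node == -1) ? numa_node_id() : node;

                  page = get_partial_node(get_node(s, searchnode));
                  if (page || node != -1)
                          return page;   /* explicit node: no remote defrag */

                  return get_any_partial(s, flags);
          }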
  7. 21 Jul 2010, 2 commits
    • x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used · b8ab9f82
      Yinghai Lu authored
      Borislav Petkov reported that his 32-bit NUMA system has a problem:
      
      [    0.000000] Reserving total of 4c00 pages for numa KVA remap
      [    0.000000] kva_start_pfn ~ 32800 max_low_pfn ~ 375fe
      [    0.000000] max_pfn = 238000
      [    0.000000] 8202MB HIGHMEM available.
      [    0.000000] 885MB LOWMEM available.
      [    0.000000]   mapped low ram: 0 - 375fe000
      [    0.000000]   low ram: 0 - 375fe000
      [    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 1000 1000 => 34e7000
      [    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 200 40 => 34c9d80
      [    0.000000] alloc (nid=0 100000 - 7ee00000) (1000000 - ffffffffffffffff) 180 40 => 34e6140
      [    0.000000] alloc (nid=1 80000000 - c7e60000) (1000000 - ffffffffffffffff) 240 40 => 80000000
      [    0.000000] BUG: unable to handle kernel paging request at 40000000
      [    0.000000] IP: [<c2c8cff1>] __alloc_memory_core_early+0x147/0x1d6
      [    0.000000] *pdpt = 0000000000000000 *pde = f000ff53f000ff00
      ...
      [    0.000000] Call Trace:
      [    0.000000]  [<c2c8b4f8>] ? __alloc_bootmem_node+0x216/0x22f
      [    0.000000]  [<c2c90c9b>] ? sparse_early_usemaps_alloc_node+0x5a/0x10b
      [    0.000000]  [<c2c9149e>] ? sparse_init+0x1dc/0x499
      [    0.000000]  [<c2c79118>] ? paging_init+0x168/0x1df
      [    0.000000]  [<c2c780ff>] ? native_pagetable_setup_start+0xef/0x1bb
      
      It looks like too high an address gets allocated for bootmem.
      
      Try to cut the limit with get_max_mapped().
      Reported-by: Borislav Petkov <borislav.petkov@amd.com>
      Tested-by: Conny Seidel <conny.seidel@amd.com>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Cc: <stable@kernel.org>		[2.6.34.x]
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
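      A hedged sketch of the idea behind the fix (simplified, not the
      verbatim patch): clamp the early-allocation limit to the top of the
      kernel direct mapping, and fall back to any node instead of handing
      back unmapped highmem:

          u64 limit = get_max_mapped();   /* top of the direct mapping */
          void *ptr;

          ptr = __alloc_memory_core_early(nid, size, align, goal, limit);
          if (!ptr)
                  /* requested node can't satisfy it below the limit */
                  ptr = __alloc_memory_core_early(MAX_NUMNODES, size, align,
                                                  goal, limit);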
    • mm/vmscan.c: fix mapping use after free · a6aa62a0
      Nick Piggin authored
      We need lock_page_nosync() here because we have no reference to the
      mapping when taking the page lock.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
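      A hedged sketch of the fix in handle_write_error() (simplified from
      mm/vmscan.c):

          static void handle_write_error(struct address_space *mapping,
                                         struct page *page, int error)
          {
                  /* lock_page() may call sync_page() through a mapping we
                   * hold no reference to; the nosync variant avoids it. */
                  lock_page_nosync(page);
                  if (page_mapping(page) == mapping)
                          mapping_set_error(mapping, error);
                  unlock_page(page);
          }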
  8. 20 Jul 2010, 1 commit
    • slab: use deferrable timers for its periodic housekeeping · 78b43536
      Arjan van de Ven authored
      slab has a "once every 2 seconds" timer for its housekeeping.
      As the number of logical processors grows, it's more and more
      common for this 2-second timer to become the primary wakeup source.
      
      This patch turns this housekeeping timer into a deferrable timer,
      which means that the timer does not interrupt idle, but just runs
      at the next event that wakes the cpu up.
      
      The impact is that the timer likely runs a bit later, but during the
      delay no code is running, so there's little reason for the
      housekeeping outcome to differ because of this delay.
      Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
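      A hedged sketch of the change in start_cpu_timer() (mm/slab.c): the
      reap work is initialized as a deferrable delayed work, so an idle cpu
      is not woken just to run it:

          /* was: INIT_DELAYED_WORK(reap_work, cache_reap); */
          INIT_DELAYED_WORK_DEFERRABLE(reap_work, cache_reap);
          schedule_delayed_work_on(cpu, reap_work,
                                   __round_jiffies_relative(HZ, cpu));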
  9. 19 Jul 2010, 3 commits
  10. 16 Jul 2010, 6 commits
  11. 14 Jul 2010, 1 commit
  12. 10 Jul 2010, 1 commit
    • x86, ioremap: Fix incorrect physical address handling in PAE mode · ffa71f33
      Kenji Kaneshige authored
      The current x86 ioremap() doesn't properly handle physical addresses
      wider than 32 bits in X86_32 PAE mode. When such a physical address is
      passed to ioremap(), its upper 32 bits are wrongly cleared. Due to
      this bug, ioremap() can map the wrong address into the linear address
      space.
      
      In my case, a 64-bit MMIO region was assigned to a PCI device (an ioat
      device) on my system. Because of this ioremap() bug, the wrong
      physical address (instead of the MMIO region) was mapped into the
      linear address space. Because of this, loading the ioatdma driver
      caused unexpected behavior (kernel panic, kernel hang, ...).
      Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
      LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
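      A hedged sketch of the bug class (not the verbatim patch, which
      threads resource_size_t/phys_addr_t through the ioremap() internals
      instead of casting through unsigned long):

          /* On X86_32 PAE, unsigned long is 32 bits wide, but physical
           * addresses can be 36+ bits wide: */
          resource_size_t phys = 0x100000000ULL;       /* a >4GB MMIO BAR */
          unsigned long bogus = (unsigned long)phys;   /* truncates to 0 */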
  13. 06 Jul 2010, 2 commits
    • writeback: simplify the write back thread queue · 83ba7b07
      Christoph Hellwig authored
      First, remove items from work_list as soon as we start working on
      them.  This means we don't have to track any pending or visited state
      and can get rid of all the RCU magic for freeing the work items - we
      can simply free them once the operation has finished.  Second, use a
      real completion for tracking synchronous requests - if the caller sets
      the completion pointer we complete it, otherwise use it as a boolean
      indicator that we can free the work item directly.  Third, unify
      struct wb_writeback_args and struct bdi_work into a single data
      structure, wb_writeback_work.  Previously we set all parameters in a
      struct wb_writeback_args, copied it into struct bdi_work, then copied
      it again onto the stack to use it there.  Instead, just allocate one
      structure dynamically or on the stack and use it all the way through
      the stack.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
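      A hedged sketch of the unified work item (simplified from
      fs/fs-writeback.c):

          struct wb_writeback_work {
                  long nr_pages;
                  struct super_block *sb;
                  enum writeback_sync_modes sync_mode;
                  int for_kupdate:1;
                  int range_cyclic:1;
                  int for_background:1;

                  struct list_head list;    /* pending work list */
                  struct completion *done;  /* NULL: free after the run */
          };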
    • writeback: remove writeback_inodes_wbc · 9c3a8ee8
      Christoph Hellwig authored
      This was just an odd wrapper around writeback_inodes_wb.  Removing it
      also allows us to get rid of the bdi member of struct
      writeback_control, which was rather out of place there.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
  14. 30 Jun 2010, 2 commits
  15. 28 Jun 2010, 2 commits
    • percpu: allow limited allocation before slab is online · 099a19d9
      Tejun Heo authored
      This patch updates the percpu allocator so that it can serve a limited
      number of allocations before slab comes online.  This is primarily to
      allow slab to depend on a working percpu allocator.
      
      Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
      much memory space and how many allocation map slots are reserved.  If
      this reserved area is exhausted, WARN_ON_ONCE() will trigger and
      allocation will fail until slab comes online.
      
      The following changes are made to implement early alloc.
      
      * pcpu_mem_alloc() now checks slab_is_available()
      
      * Chunks are allocated using pcpu_mem_alloc()
      
      * Init paths make sure ai->dyn_size is at least as large as
        PERCPU_DYNAMIC_EARLY_SIZE.
      
      * Initial alloc maps are allocated in __initdata and copied to
        kmalloc'd areas once slab is online.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux-foundation.org>
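      A hedged sketch of the slab_is_available() check in pcpu_mem_alloc()
      (simplified from mm/percpu.c):

          static void *pcpu_mem_alloc(size_t size)
          {
                  if (WARN_ON_ONCE(!slab_is_available()))
                          return NULL;    /* early reserve is exhausted */

                  if (size <= PAGE_SIZE)
                          return kzalloc(size, GFP_KERNEL);
                  else {
                          void *ptr = vmalloc(size);
                          if (ptr)
                                  memset(ptr, 0, size);
                          return ptr;
                  }
          }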
    • percpu: make @dyn_size always mean min dyn_size in first chunk init functions · 4ba6ce25
      Tejun Heo authored
      In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was
      an ssize_t: -1 meant auto-size, 0 forced zero, and a positive value
      meant minimum size.  There's no use case for forcing 0, and the
      upcoming early alloc support always requires a non-zero dynamic size.
      Make @dyn_size always mean minimum dyn_size.
      
      While at it, make pcpu_build_alloc_info() static, as suggested by
      David Rientjes, since it has no external callers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
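      A hedged sketch of the resulting signature: @dyn_size becomes a plain
      size_t meaning "minimum dynamic size", and the function becomes
      static:

          static struct pcpu_alloc_info * __init pcpu_build_alloc_info(
                          size_t reserved_size, size_t dyn_size,
                          size_t atom_size,
                          pcpu_fc_cpu_distance_fn_t cpu_distance_fn);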
  16. 18 Jun 2010, 1 commit
    • percpu: fix first chunk match in per_cpu_ptr_to_phys() · 9983b6f0
      Tejun Heo authored
      per_cpu_ptr_to_phys() determines whether the passed-in @addr belongs
      to the first_chunk or not by just matching the address against the
      address range of the base unit (unit0, used by cpu0).  When an address
      from another cpu is passed in, it always determines that the address
      doesn't belong to the first chunk even when it does.  This makes the
      function return a bogus physical address, which may lead to a crash.
      
      This problem was discovered by Cliff Wickman while investigating a
      crash during kdump on an SGI UV system.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Cliff Wickman <cpw@sgi.com>
      Tested-by: Cliff Wickman <cpw@sgi.com>
      Cc: stable@kernel.org
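      A hedged sketch of the corrected test (simplified from mm/percpu.c):
      match @addr against every unit's range in the first chunk, not just
      unit0's:

          bool in_first_chunk = false;
          unsigned int cpu;

          for_each_possible_cpu(cpu) {
                  void *start = per_cpu_ptr(base, cpu);

                  if (addr >= start && addr < start + pcpu_unit_size) {
                          in_first_chunk = true;
                          break;
                  }
          }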
  17. 17 Jun 2010, 1 commit
  18. 15 Jun 2010, 1 commit
  19. 11 Jun 2010, 1 commit
  20. 09 Jun 2010, 3 commits