- 18 5月, 2009 1 次提交
-
-
由 Yinghai Lu 提交于
after: | commit b263295d | Author: Christoph Lameter <clameter@sgi.com> | Date: Wed Jan 30 13:30:47 2008 +0100 | | x86: 64-bit, make sparsemem vmemmap the only memory model we don't have MEMORY_HOTPLUG_RESERVE anymore. Historically, x86-64 had an architecture-specific method for memory hotplug whereby it scanned the SRAT for physical memory ranges that could be potentially used for memory hot-add later. By reserving those ranges without physical memory, the memmap would be allocated and left dormant until needed. This depended on the DISCONTIG memory model which has been removed so the code implementing HOTPLUG_RESERVE is now dead. This patch removes the dead code used by MEMORY_HOTPLUG_RESERVE. (Changelog authored by Mel.) v2: updated changelog, and remove hotadd= in doc [ Impact: remove dead code ] Signed-off-by: NYinghai Lu <yinghai@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Reviewed-by: NMel Gorman <mel@csn.ul.ie> Workflow-found-OK-by: NAndrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A0C4910.7090508@kernel.org> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 15 5月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
This reverts commit fafd688e. Work is progressing to switch away from pdflush as the process backing for flushing out dirty data. So it seems pointless to add more knobs to control pdflush threads. The original author of the patch did not have any specific use cases for adding the knobs, so we can easily revert this before 2.6.30 to avoid having to maintain this API forever. Signed-off-by: NJens Axboe <jens.axboe@oracle.com>
-
- 13 5月, 2009 1 次提交
-
-
由 Linus Torvalds 提交于
This reverts commit a425a638. Now that the previous commit removed the "readpage" actor for hugetlb files, read-ahead will no longer mess up the mapping, and there's no longer any reason to treat hugetlbfs mappings specially. Tested-and-acked-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 5月, 2009 1 次提交
-
-
由 David Howells 提交于
Don't check vm_region::vm_start is page aligned in add_nommu_region() because the region may reflect some non-page-aligned mapped file, such as could be obtained from RomFS XIP. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NGreg Ungerer <gerg@uclinux.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 5月, 2009 5 次提交
-
-
由 David Howells 提交于
NOMMU mmap() has an option controlled by a sysctl variable that determines whether the allocations made by do_mmap_private() should have the excess space trimmed off and returned to the allocator. Make the initial setting of this variable a Kconfig configuration option. The reason there can be excess space is that the allocator only allocates in power-of-2 size chunks, but mmap()'s can be made in sizes that aren't a power of 2. There are two alternatives: (1) Keep the excess as dead space. The dead space then remains unused for the lifetime of the mapping. Mappings of shared objects such as libc, ld.so or busybox's text segment may retain their dead space forever. (2) Return the excess to the allocator. This means that the dead space is limited to less than a page per mapping, but it means that for a transient process, there's more chance of fragmentation as the excess space may be reused fairly quickly. During the boot process, a lot of transient processes are created, and this can cause a lot of fragmentation as the pagecache and various slabs grow greatly during this time. By turning off the trimming of excess space during boot and disabling batching of frees, Coldfire can manage to boot. A better way of doing things might be to have /sbin/init turn this option off. By that point libc, ld.so and init - which are all long-duration processes - have all been loaded and trimmed. Reported-by: NLanttor Guo <lanttor.guo@freescale.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NLanttor Guo <lanttor.guo@freescale.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Clamp zone_batchsize() to 0 under NOMMU conditions to stop free_hot_cold_page() from queueing and batching frees. The problem is that under NOMMU conditions it is really important to be able to allocate large contiguous chunks of memory, but when munmap() or exit_mmap() releases big stretches of memory, return of these to the buddy allocator can be deferred, and when it does finally happen, it can be in small chunks. Whilst the fragmentation this incurs isn't so much of a problem under MMU conditions as userspace VM is glued together from individual pages with the aid of the MMU, it is a real problem if there isn't an MMU. By clamping the page freeing queue size to 0, pages are returned to the allocator immediately, and the buddy detector is more likely to be able to glue them together into large chunks immediately, and fragmentation is less likely to occur. By disabling batching of frees, and by turning off the trimming of excess space during boot, Coldfire can manage to boot. Reported-by: NLanttor Guo <lanttor.guo@freescale.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NLanttor Guo <lanttor.guo@freescale.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Use roundown_pow_of_two(N) in zone_batchsize() rather than (1 << (fls(N)-1)) as they are equivalent, and with the former it is easier to see what is going on. Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NLanttor Guo <lanttor.guo@freescale.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ralph Wuerthner 提交于
If alloc_vmap_area() fails the allocated struct vmap_area has to be freed. Signed-off-by: NRalph Wuerthner <ralphw@linux.vnet.ibm.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Rientjes 提交于
When /proc/sys/vm/oom_kill_allocating_task is set for large systems that want to avoid the lengthy tasklist scan, it's possible to livelock if current is ineligible for oom kill. This normally happens when it is set to OOM_DISABLE, but is also possible if any threads are sharing the same ->mm with a different tgid. So change __out_of_memory() to fall back to the full task-list scan if it was unable to kill `current'. Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: NDavid Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 5月, 2009 1 次提交
-
-
由 Mel Gorman 提交于
madvise(MADV_WILLNEED) forces page cache readahead on a range of memory backed by a file. The assumption is made that the page required is order-0 and "normal" page cache. On hugetlbfs, this assumption is not true and order-0 pages are allocated and inserted into the hugetlbfs page cache. This leaks hugetlbfs page reservations and can cause BUGs to trigger related to corrupted page tables. This patch causes MADV_WILLNEED to be ignored for hugetlbfs-backed regions. Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 5月, 2009 6 次提交
-
-
由 Andrew Morton 提交于
Local variable `scan' can overflow on zones which are larger than (2G * 4k) / 100 = 80GB. Making it 64-bit on 64-bit will fix that up. Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KOSAKI Motohiro 提交于
The Committed_AS field can underflow in certain situations: > # while true; do cat /proc/meminfo | grep _AS; sleep 1; done | uniq -c > 1 Committed_AS: 18446744073709323392 kB > 11 Committed_AS: 18446744073709455488 kB > 6 Committed_AS: 35136 kB > 5 Committed_AS: 18446744073709454400 kB > 7 Committed_AS: 35904 kB > 3 Committed_AS: 18446744073709453248 kB > 2 Committed_AS: 34752 kB > 9 Committed_AS: 18446744073709453248 kB > 8 Committed_AS: 34752 kB > 3 Committed_AS: 18446744073709320960 kB > 7 Committed_AS: 18446744073709454080 kB > 3 Committed_AS: 18446744073709320960 kB > 5 Committed_AS: 18446744073709454080 kB > 6 Committed_AS: 18446744073709320960 kB Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does not check for underflow. But NR_CPUS proportional isn't good calculation. In general, possibility of lock contention is proportional to the number of online cpus, not theorical maximum cpus (NR_CPUS). The current kernel has generic percpu-counter stuff. using it is right way. it makes code simplify and percpu_counter_read_positive() don't make underflow issue. Reported-by: NDave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [All kernel versions] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke Nishimura 提交于
Current mem_cgroup_shrink_usage() has two problems. 1. It doesn't call mem_cgroup_out_of_memory and doesn't update last_oom_jiffies, so pagefault_out_of_memory invokes global OOM. 2. Considering hierarchy, shrinking has to be done from the mem_over_limit, not from the memcg which the page would be charged to. mem_cgroup_try_charge_swapin() does all of these things properly, so we use it and call cancel_charge_swapin when it succeeded. The name of "shrink_usage" is not appropriate for this behavior, so we change it too. Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.cn> Cc: Paul Menage <menage@google.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Change page_mkwrite to allow implementations to return with the page locked, and also change it's callers (in page fault paths) to hold the lock until the page is marked dirty. This allows the filesystem to have full control of page dirtying events coming from the VM. Rather than simply hold the page locked over the page_mkwrite call, we call page_mkwrite with the page unlocked and allow callers to return with it locked, so filesystems can avoid LOR conditions with page lock. The problem with the current scheme is this: a filesystem that wants to associate some metadata with a page as long as the page is dirty, will perform this manipulation in its ->page_mkwrite. It currently then must return with the page unlocked and may not hold any other locks (according to existing page_mkwrite convention). In this window, the VM could write out the page, clearing page-dirty. The filesystem has no good way to detect that a dirty pte is about to be attached, so it will happily write out the page, at which point, the filesystem may manipulate the metadata to reflect that the page is no longer dirty. It is not always possible to perform the required metadata manipulation in ->set_page_dirty, because that function cannot block or fail. The filesystem may need to allocate some data structure, for example. And the VM cannot mark the pte dirty before page_mkwrite, because page_mkwrite is allowed to fail, so we must not allow any window where the page could be written to if page_mkwrite does fail. This solution of holding the page locked over the 3 critical operations (page_mkwrite, setting the pte dirty, and finally setting the page dirty) closes out races nicely, preventing page cleaning for writeout being initiated in that window. This provides the filesystem with a strong synchronisation against the VM here. - Sage needs this race closed for ceph filesystem. - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913). - I need it for fsblock. - I suspect other filesystems may need it too (eg. btrfs). - I have converted buffer.c to the new locking. Even simple block allocation under dirty pages might be susceptible to i_size changing under partial page at the end of file (we also have a buffer.c-side problem here, but it cannot be fixed properly without this patch). - Other filesystems (eg. NFS, maybe btrfs) will need to change their page_mkwrite functions themselves. [ This also moves page_mkwrite another step closer to fault, which should eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a filesystem calldown and page lock/unlock cycle in __do_fault. ] [akpm@linux-foundation.org: fix derefs of NULL ->mapping] Cc: Sage Weil <sage@newdream.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Daisuke Nishimura 提交于
This is a bugfix for commit 3c776e64 ("memcg: charge swapcache to proper memcg"). Used bit of swapcache is solid under page lock, but considering move_account, pc->mem_cgroup is not. We need lock_page_cgroup() anyway. Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Weiner 提交于
By the time the memory cgroup code is notified about a swapin we already hold a reference on the fault page. If the cgroup callback fails make sure to unlock AND release the page reference which was taken by lookup_swap_cach(), or we leak the reference. Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Reviewed-by: NMinchan Kim <minchan.kim@gmail.com> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 22 4月, 2009 1 次提交
-
-
由 KOSAKI Motohiro 提交于
Commit a6dc60f8 ("vmscan: rename sc.may_swap to may_unmap") removed the may_swap flag, but memcg had used it as a flag for "we need to use swap?", as the name indicate. And in the current implementation, memcg cannot reclaim mapped file caches when mem+swap hits the limit. re-introduce may_swap flag and handle it at get_scan_ratio(). This patch doesn't influence any scan_control users other than memcg. Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: NJohannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 19 4月, 2009 1 次提交
-
-
由 Rafael J. Wysocki 提交于
Commit d979677c ("mm: shrink_all_memory(): use sc.nr_reclaimed") broke the memory shrinking used by hibernation, becuse it did not update shrink_all_zones() in accordance with the other changes it made. Fix this by making shrink_all_zones() update sc->nr_reclaimed instead of overwriting its value. This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13058Reported-and-tested-by: NAlan Jenkins <alan-jenkins@tuffmail.co.uk> Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 4月, 2009 1 次提交
-
-
由 Hugh Dickins 提交于
Tetsuo Handa reports seeing the WARN_ON(current->mm == NULL) in security_vm_enough_memory(), when do_execve() is touching the target mm's stack, to set up its args and environment. Yes, a UMH_NO_WAIT or UMH_WAIT_PROC call_usermodehelper() spawns an mm-less kernel thread to do the exec. And in any case, that vm_enough_memory check when growing stack ought to be done on the target mm, not on the execer's mm (though apart from the warning, it only makes a slight tweak to OVERCOMMIT_NEVER behaviour). Reported-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: stable@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 16 4月, 2009 1 次提交
-
-
由 Chris Mason 提交于
This wasn't exported before and is useful (used by the experimental ext3 data=guarded code) Signed-off-by: NChris Mason <chris.mason@oracle.com> Acked-by: NTheodore Tso <tytso@mit.edu> Acked-by: NJan Kara <jack@suse.cz> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 14 4月, 2009 6 次提交
-
-
由 Hugh Dickins 提交于
SHMEM_MAX_BYTES was derived from the maximum size of its triple-indirect swap vector, forgetting to take the MAX_LFS_FILESIZE limit into account. Never mind 256kB pages, even 8kB pages on 32-bit kernels allowed files to grow slightly bigger than that supposed maximum. Fix this by using the min of both (at build time not run time). And it happens that this calculation is good as far as 8MB pages on 32-bit or 16MB pages on 64-bit: though SHMSWP_MAX_INDEX gets truncated before that, it's truncated to such large numbers that we don't need to care. [akpm@linux-foundation.org: it needs pagemap.h] [akpm@linux-foundation.org: fix sparc64 min() warnings] Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: Yuri Tikhonov <yur@emcraft.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yuri Tikhonov 提交于
Fix a division by zero which we have in shmem_truncate_range() and shmem_unuse_inode() when using big PAGE_SIZE values (e.g. 256kB on ppc44x). With 256kB PAGE_SIZE, the ENTRIES_PER_PAGEPAGE constant becomes too large (0x1.0000.0000) on a 32-bit kernel, so this patch just changes its type from 'unsigned long' to 'unsigned long long'. Hugh: reverted its unsigned long longs in shmem_truncate_range() and shmem_getpage(): the pagecache index cannot be more than an unsigned long, so the divisions by zero occurred in unreached code. It's a pity we need any ULL arithmetic here, but I found no pretty way to avoid it. Signed-off-by: NYuri Tikhonov <yur@emcraft.com> Signed-off-by: NHugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
mm/memcontrol.c:318: warning: `mem_cgroup_is_obsolete' defined but not used [akpm@linux-foundation.org: simplify as suggested by Balbir] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: NBalbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andy Grover 提交于
While better than get_user_pages(), the usage of gupf(), especially the return values and the fact that it can potentially only partially pin the range, warranted some documentation. Signed-off-by: NAndy Grover <andy.grover@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
Point the UNEVICTABLE_LRU config option at the documentation describing the option. Signed-off-by: NDavid Howells <dhowells@redhat.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Randy Dunlap 提交于
Fix filemap.c kernel-doc warnings: Warning(mm/filemap.c:575): No description found for parameter 'page' Warning(mm/filemap.c:575): No description found for parameter 'waiter' Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 07 4月, 2009 3 次提交
-
-
由 Peter W Morreale 提交于
Add /proc entries to give the admin the ability to control the minimum and maximum number of pdflush threads. This allows finer control of pdflush on both large and small machines. The rationale is simply one size does not fit all. Admins on large and/or small systems may want to tune the min/max pdflush thread count to best suit their needs. Right now the min/max is hardcoded to 2/8. While probably a fair estimate for smaller machines, large machines with large numbers of CPUs and large numbers of filesystems/block devices may benefit from larger numbers of threads working on different block devices. Even if the background flushing algorithm is radically changed, it is still likely that multiple threads will be involved and admins would still desire finer control on the min/max other than to have to recompile the kernel. The patch adds '/proc/sys/vm/nr_pdflush_threads_min' and '/proc/sys/vm/nr_pdflush_threads_max' with r/w permissions. The minimum value for nr_pdflush_threads_min is 1 and the maximum value is the current value of nr_pdflush_threads_max. This minimum is required since additional thread creation is performed in a pdflush thread itself. The minimum value for nr_pdflush_threads_max is the current value of nr_pdflush_threads_min and the maximum value can be 1000. Documentation/sysctl/vm.txt is also updated. [akpm@linux-foundation.org: fix comment, fix whitespace, use __read_mostly] Signed-off-by: NPeter W Morreale <pmorreale@novell.com> Reviewed-by: NRik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter W Morreale 提交于
Fix a race on creating pdflush threads. Without the patch, it is possible to create more than MAX_PDFLUSH_THREADS threads, and this has been observed in practice on IO loaded SMP machines. The fix involves moving the lock around to protect the check against the thread count and correctly dealing with thread creation failure. This fix also _mostly_ repairs a race condition on how quickly the threads are created. The original intent was to create a pdflush thread (up to the max allowed) every second. Without this patch is is possible to create NCPUS pdflush threads concurrently. The 'mostly' caveat is because an assumption is made that thread creation will be successful. If we fail to create the thread, the miss is not considered fatal. (we will try again in 1 second) Signed-off-by: NPeter W Morreale <pmorreale@novell.com> Reviewed-by: NRik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Rothwell 提交于
This eliminates a compiler warning: mm/allocpercpu.c: In function 'free_percpu': mm/allocpercpu.c:146: warning: passing argument 2 of '__percpu_depopulate_mask' discards qualifiers from pointer target type Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 06 4月, 2009 1 次提交
-
-
由 Jens Axboe 提交于
This makes sure that we never wait on async IO for sync requests, instead of doing the split on writes vs reads. Signed-off-by: NJens Axboe <jens.axboe@oracle.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 04 4月, 2009 1 次提交
-
-
由 Evgeniy Polyakov 提交于
This patch adds Kconfig and Makefile entries and exports to VFS functions to be used by POHMELFS. Signed-off-by: NEvgeniy Polyakov <zbr@ioremap.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@suse.de>
-
- 03 4月, 2009 9 次提交
-
-
由 David Howells 提交于
Add a function to install a monitor on the page lock waitqueue for a particular page, thus allowing the page being unlocked to be detected. This is used by CacheFiles to detect read completion on a page in the backing filesystem so that it can then copy the data to the waiting netfs page. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
Recruit a page flag to aid in cache management. The following extra flag is defined: (1) PG_fscache (PG_private_2) The marked page is backed by a local cache and is pinning resources in the cache driver. If PG_fscache is set, then things that checked for PG_private will now also check for that. This includes things like truncation and page invalidation. The function page_has_private() had been added to make the checks for both PG_private and PG_private_2 at the same time. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 David Howells 提交于
The attached patch causes read_cache_pages() to release page-private data on a page for which add_to_page_cache() fails. If the filler function fails, then the problematic page is left attached to the pagecache (with appropriate flags set, one presumes) and the remaining to-be-attached pages are invalidated and discarded. This permits pages with caching references associated with them to be cleaned up. The invalidatepage() address space op is called (indirectly) to do the honours. Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NSteve Dickson <steved@redhat.com> Acked-by: NTrond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NRik van Riel <riel@redhat.com> Acked-by: NAl Viro <viro@zeniv.linux.org.uk> Tested-by: NDaire Byrne <Daire.Byrne@framestore.com>
-
由 Pekka Enberg 提交于
Impact: also output kfree(NULL) entries This patch moves the trace_kfree() calls before the ZERO_OR_NULL_PTR check so that we can trace call-sites that call kfree() with NULL many times which might be an indication of a bug. Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <1237971957.30175.18.camel@penberg-laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
kmemtrace now uses tracepoints instead of markers. We no longer need to use format specifiers to pass arguments. Signed-off-by: NEduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> [ folded: Use the new TP_PROTO and TP_ARGS to fix the build. ] [ folded: fix build when CONFIG_KMEMTRACE is disabled. ] [ folded: define tracepoints when CONFIG_TRACEPOINTS is enabled. ] Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <ae61c0f37156db8ec8dc0d5778018edde60a92e3.1237813499.git.eduard.munteanu@linux360.ro> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Pekka Enberg 提交于
Impact: cleanup mm/failslab.c depends on slab.h without including it: CC mm/failslab.o mm/failslab.c: In function ‘should_failslab’: mm/failslab.c:16: error: ‘__GFP_NOFAIL’ undeclared (first use in this function) mm/failslab.c:16: error: (Each undeclared identifier is reported only once mm/failslab.c:16: error: for each function it appears in.) mm/failslab.c:19: error: ‘__GFP_WAIT’ undeclared (first use in this function) make[1]: *** [mm/failslab.o] Error 1 make: *** [mm] Error 2 It gets included implicitly currently - but this will not be the case with upcoming kmemtrace changes. Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> LKML-Reference: <1237888761.25315.69.camel@penberg-laptop> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
由 Daisuke Nishimura 提交于
Current mem_cgroup_cache_charge is a bit complicated especially in the case of shmem's swap-in. This patch cleans it up by using try_charge_swapin and commit_charge_swapin. Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
It's pointed out that swap_cgroup's message at swapon() is nonsense. Because * It can be calculated very easily if all necessary information is written in Kconfig. * It's not necessary to annoying people at every swapon(). In other view, now, memory usage per swp_entry is reduced to 2bytes from 8bytes(64bit) and I think it's reasonably small. Reported-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 KAMEZAWA Hiroyuki 提交于
Try to use CSS ID for records in swap_cgroup. By this, on 64bit machine, size of swap_cgroup goes down to 2 bytes from 8bytes. This means, when 2GB of swap is equipped, (assume the page size is 4096bytes) From size of swap_cgroup = 2G/4k * 8 = 4Mbytes. To size of swap_cgroup = 2G/4k * 2 = 1Mbytes. Reduction is large. Of course, there are trade-offs. This CSS ID will add overhead to swap-in/swap-out/swap-free. But in general, - swap is a resource which the user tend to avoid use. - If swap is never used, swap_cgroup area is not used. - Reading traditional manuals, size of swap should be proportional to size of memory. Memory size of machine is increasing now. I think reducing size of swap_cgroup makes sense. Note: - ID->CSS lookup routine has no locks, it's under RCU-Read-Side. - memcg can be obsolete at rmdir() but not freed while refcnt from swap_cgroup is available. Changelog v4->v5: - reworked on to memcg-charge-swapcache-to-proper-memcg.patch Changlog ->v4: - fixed not configured case. - deleted unnecessary comments. - fixed NULL pointer bug. - fixed message in dmesg. [nishimura@mxp.nes.nec.co.jp: css_tryget can be called twice in !PageCgroupUsed case] Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-