- 02 2月, 2006 12 次提交
-
-
由 Christoph Lameter 提交于
Migrate a page with buffers without requiring writeback This introduces a new address space operation migratepage() that may be used by a filesystem to implement its own version of page migration. A version is provided that migrates buffers attached to pages. Some filesystems (ext2, ext3, xfs) are modified to utilize this feature. The swapper address space operation are modified so that a regular migrate_page() will occur for anonymous pages without writeback (migrate_pages forces every anonymous page to have a swap entry). Signed-off-by: NMike Kravetz <kravetz@us.ibm.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Add remove_from_swap remove_from_swap() allows the restoration of the pte entries that existed before page migration occurred for anonymous pages by walking the reverse maps. This reduces swap use and establishes regular pte's without the need for page faults. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Add direct migration support with fall back to swap. Direct migration support on top of the swap based page migration facility. This allows the direct migration of anonymous pages and the migration of file backed pages by dropping the associated buffers (requires writeout). Fall back to swap out if necessary. The patch is based on lots of patches from the hotplug project but the code was restructured, documented and simplified as much as possible. Note that an additional patch that defines the migrate_page() method for filesystems is necessary in order to avoid writeback for anonymous and file backed pages. Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: NMike Kravetz <kravetz@us.ibm.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
If large amounts of zone memory are used by empty slabs then zone_reclaim becomes uneffective. This patch shakes the slab a bit. The problem with this patch is that the slab reclaim is not containable to a zone. Thus slab reclaim may affect the whole system and be extremely slow. This also means that we cannot determine how many pages were freed in this zone. Thus we need to go off node for at least one allocation. The functionality is disabled by default. We could modify the shrinkers to take a zone parameter but that would be quite invasive. Better ideas are welcome. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
In some situations one may want zone_reclaim to behave differently. For example a process writing large amounts of memory will spew unto other nodes to cache the writes if many pages in a zone become dirty. This may impact the performance of processes running on other nodes. Allowing writes during reclaim puts a stop to that behavior and throttles the process by restricting the pages to the local zone. Similarly one may want to contain processes to local memory by enabling regular swap behavior during zone_reclaim. Off node memory allocation can then be controlled through memory policies and cpusets. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Currently the zone_reclaim code has a fixed window of 30 seconds of off node allocations should a local zone have no unused pagecache pages left. Reclaim will be attempted again after this timeout period to avoid repeated useless scans for memory. This is also useful to established sufficiently large off node allocation chunks to relieve the local node. It may be beneficial to adjust that time period for some special situations. For example if memory use was exceeding node capacity one may want to give up for longer periods of time. If memory spikes intermittendly then one may want to shorten the time period to reduce the number of off node allocations. This patch allows just that.... Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Instead of scanning all the pages in a zone, imitate real swap and scan only a portion of the pages and gradually scan more if we do not free up enough pages. This avoids a zone suddenly loosing all unused pagecache pages (we may after all access some of these again so they deserve another chance) but it still frees up large chunks of memory if a zone only contains unused pagecache pages. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
zone_reclaim should leave that to the real swapper. We are only interested in evicting unmapped pages. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
- If we only reclaim nr_pages then its okay to stay on node. Switch from > to >= for the comparison. - vm_table[] entry for zone_reclaim_mode is a bit screwed up. - Add empty lines around shrink_zone to show that this is the central function to be called. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Make sc->may_writepage control the writeout behavior of shrink_list. Remove the laptop_mode trick from shrink_list and instead set may_writepage in try_to_free_pages properly. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Zone reclaim is usually only run on the local node. Headless nodes do not have any local processors. This patch checks for headless nodes and performs zone reclaim on them. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Ensure that the performance of off node pages stays the same as before. Off node pagefault tests showed an 18% drop in performance without this patch. - Increase the timeout to 30 seconds to reduce the overhead. - Move all code possible out of the off node hot path for zone reclaim (Sorry Andrew, the struct initialization had to be sacrificed). The read_page_state() bit us there. - Check first for the timeout before any other checks. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 19 1月, 2006 3 次提交
-
-
由 Christoph Lameter 提交于
Some bits for zone reclaim exists in 2.6.15 but they are not usable. This patch fixes them up, removes unused code and makes zone reclaim usable. Zone reclaim allows the reclaiming of pages from a zone if the number of free pages falls below the watermarks even if other zones still have enough pages available. Zone reclaim is of particular importance for NUMA machines. It can be more beneficial to reclaim a page than taking the performance penalties that come with allocating a page on a remote zone. Zone reclaim is enabled if the maximum distance to another node is higher than RECLAIM_DISTANCE, which may be defined by an arch. By default RECLAIM_DISTANCE is 20. 20 is the distance to another node in the same component (enclosure or motherboard) on IA64. The meaning of the NUMA distance information seems to vary by arch. If zone reclaim is not successful then no further reclaim attempts will occur for a certain time period (ZONE_RECLAIM_INTERVAL). This patch was discussed before. See http://marc.theaimsgroup.com/?l=linux-kernel&m=113519961504207&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113408418232531&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113389027420032&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=113380938612205&w=2Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Zone reclaim has a huge impact on NUMA performance (f.e. our maximum throughput with XFS is raised from 4GB to 6GB/sec / page cache contamination of numa nodes destroys locality if one just does a large copy operation which results in performance dropping for good until reboot). This patch: Resurrect may_swap in struct scan_control Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Nick Piggin 提交于
Migration code currently does not take a reference to target page properly, so between unlocking the pte and trying to take a new reference to the page with isolate_lru_page, anything could happen to it. Fix this by holding the pte lock until we get a chance to elevate the refcount. Other small cleanups while we're here. Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 09 1月, 2006 10 次提交
-
-
由 Christoph Lameter 提交于
Use -Exxx instead of numeric return codes and cleanup the code in migrate_pages() using -Exx error codes. Consolidate successful migration handling Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Extend the parameters of migrate_pages() to allow the caller control over the fate of successfully migrated or impossible to migrate pages. Swap migration and direct migration will have the same interface after this patch so that patches can be independently applied to the policy layer and the core migration code. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Drop unused pages immediately If a page is encountered that is only referenced by the migration code then there is no reason to swap or migrate the page. Release the page by calling move_to_lru(). Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Add gfp_mask to add_to_swap add_to_swap does allocations with GFP_ATOMIC in order not to interfere with swapping. During migration we may have use add_to_swap extensively which may lead to out of memory errors. This patch makes add_to_swap take a parameter that specifies the gfp mask. The page migration code can then make add_to_swap use GFP_KERNEL. Signed-off-by: NHirokazu Takahashi <taka@valinux.co.jp> Signed-off-by: NDave Hansen <haveblue@us.ibm.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Move move_to_lru, putback_lru_pages and isolate_lru in section surrounded by CONFIG_MIGRATION saving some codesize for single processor kernels. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Include page migration if the system is NUMA or having a memory model that allows distinct areas of memory (SPARSEMEM, DISCONTIGMEM). And: - Only include lru_add_drain_per_cpu if building for an SMP system. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
This adds the basic page migration function with a minimal implementation that only allows the eviction of pages to swap space. Page eviction and migration may be useful to migrate pages, to suspend programs or for remapping single pages (useful for faulty pages or pages with soft ECC failures) The process is as follows: The function wanting to migrate pages must first build a list of pages to be migrated or evicted and take them off the lru lists via isolate_lru_page(). isolate_lru_page determines that a page is freeable based on the LRU bit set. Then the actual migration or swapout can happen by calling migrate_pages(). migrate_pages does its best to migrate or swapout the pages and does multiple passes over the list. Some pages may only be swappable if they are not dirty. migrate_pages may start writing out dirty pages in the initial passes over the pages. However, migrate_pages may not be able to migrate or evict all pages for a variety of reasons. The remaining pages may be returned to the LRU lists using putback_lru_pages(). Changelog V4->V5: - Use the lru caches to return pages to the LRU Changelog V3->V4: - Restructure code so that applying patches to support full migration does require minimal changes. Rename swapout_pages() to migrate_pages(). Changelog V2->V3: - Extract common code from shrink_list() and swapout_pages() Signed-off-by: NMike Kravetz <kravetz@us.ibm.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: "Michael Kerrisk" <mtk-manpages@gmx.net> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
Add PF_SWAPWRITE to control a processes permission to write to swap. - Use PF_SWAPWRITE in may_write_to_queue() instead of checking for kswapd and pdflush - Set PF_SWAPWRITE flag for kswapd and pdflush Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Christoph Lameter 提交于
This is the start of the `swap migration' patch series. Swap migration allows the moving of the physical location of pages between nodes in a numa system while the process is running. This means that the virtual addresses that the process sees do not change. However, the system rearranges the physical location of those pages. The main intent of page migration patches here is to reduce the latency of memory access by moving pages near to the processor where the process accessing that memory is running. The patchset allows a process to manually relocate the node on which its pages are located through the MF_MOVE and MF_MOVE_ALL options while setting a new memory policy. The pages of process can also be relocated from another process using the sys_migrate_pages() function call. Requires CAP_SYS_ADMIN. The migrate_pages function call takes two sets of nodes and moves pages of a process that are located on the from nodes to the destination nodes. Manual migration is very useful if for example the scheduler has relocated a process to a processor on a distant node. A batch scheduler or an administrator can detect the situation and move the pages of the process nearer to the new processor. sys_migrate_pages() could be used on non-numa machines as well, to force all of a particualr process's pages out to swap, if someone thinks that's useful. Larger installations usually partition the system using cpusets into sections of nodes. Paul has equipped cpusets with the ability to move pages when a task is moved to another cpuset. This allows automatic control over locality of a process. If a task is moved to a new cpuset then also all its pages are moved with it so that the performance of the process does not sink dramatically (as is the case today). Swap migration works by simply evicting the page. The pages must be faulted back in. The pages are then typically reallocated by the system near the node where the process is executing. For swap migration the destination of the move is controlled by the allocation policy. Cpusets set the allocation policy before calling sys_migrate_pages() in order to move the pages as intended. No allocation policy changes are performed for sys_migrate_pages(). This means that the pages may not faulted in to the specified nodes if no allocation policy was set by other means. The pages will just end up near the node where the fault occurred. There's another patch series in the pipeline which implements "direct migration". The direct migration patchset extends the migration functionality to avoid going through swap. The destination node of the relation is controllable during the actual moving of pages. The crutch of using the allocation policy to relocate is not necessary and the pages are moved directly to the target. Its also faster since swap is not used. And sys_migrate_pages() can then move pages directly to the specified node. Implement functions to isolate pages from the LRU and put them back later. This patch: An earlier implementation was provided by Hirokazu Takahashi <taka@valinux.co.jp> and IWAMOTO Toshihiro <iwamoto@valinux.co.jp> for the memory hotplug project. From: Magnus This breaks out isolate_lru_page() and putpack_lru_page(). Needed for swap migration. Signed-off-by: NMagnus Damm <magnus.damm@gmail.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
Add /proc/sys/vm/drop_caches. When written to, this will cause the kernel to discard as much pagecache and/or reclaimable slab objects as it can. THis operation requires root permissions. It won't drop dirty data, so the user should run `sync' first. Caveats: a) Holds inode_lock for exorbitant amounts of time. b) Needs to be taught about NUMA nodes: propagate these all the way through so the discarding can be controlled on a per-node basis. This is a debugging feature: useful for getting consistent results between filesystem benchmarks. We could possibly put it under a config option, but it's less than 300 bytes. Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 07 1月, 2006 4 次提交
-
-
由 Nick Piggin 提交于
Optimise page_state manipulations by introducing interrupt unsafe accessors to page_state fields. Callers must provide their own locking (either disable interrupts or not update from interrupt context). Switch over the hot callsites that can easily be moved under interrupts off sections. Signed-off-by: NNick Piggin <npiggin@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Con Kolivas 提交于
There are numerous places we check whether a zone is populated or not. Provide a helper function to check for populated zones and convert all checks for zone->present_pages. Signed-off-by: NCon Kolivas <kernel@kolivas.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
Revert a patch which went into 2.6.8-rc1. The changelog for that patch was: The shrink_zone() logic can, under some circumstances, cause far too many pages to be reclaimed. Say, we're scanning at high priority and suddenly hit a large number of reclaimable pages on the LRU. Change things so we bale out when SWAP_CLUSTER_MAX pages have been reclaimed. Problem is, this change caused significant imbalance in inter-zone scan balancing by truncating scans of larger zones. Suppose, for example, ZONE_HIGHMEM is 10x the size of ZONE_NORMAL. The zone balancing algorithm would require that if we're scanning 100 pages of ZONE_HIGHMEM, we should scan 10 pages of ZONE_NORMAL. But this logic will cause the scanning of ZONE_HIGHMEM to bale out after only 32 pages are reclaimed. Thus effectively causing smaller zones to be scanned relatively harder than large ones. Now I need to remember what the workload was which caused me to write this patch originally, then fix it up in a different way... Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Andrew Morton 提交于
Remove the last bits of Martin's ill-fated sys_set_zone_reclaim(). Cc: Martin Hicks <mort@wildopensource.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 04 1月, 2006 1 次提交
-
-
由 Zach Brown 提交于
readpage(), prepare_write(), and commit_write() callers are updated to understand the special return code AOP_TRUNCATED_PAGE in the style of writepage() and WRITEPAGE_ACTIVATE. AOP_TRUNCATED_PAGE tells the caller that the callee has unlocked the page and that the operation should be tried again with a new page. OCFS2 uses this to detect and work around a lock inversion in its aop methods. There should be no change in behaviour for methods that don't return AOP_TRUNCATED_PAGE. WRITEPAGE_ACTIVATE is also prepended with AOP_ for consistency and they are made enums so that kerneldoc can be used to document their semantics. Signed-off-by: NZach Brown <zach.brown@oracle.com>
-
- 29 11月, 2005 2 次提交
-
-
由 Andrea Arcangeli 提交于
With Andrew Morton <akpm@osdl.org> The slab scanning code tries to balance the scanning rate of slabs versus the scanning rate of LRU pages. To do this, it retains state concerning how many slabs have been scanned - if a particular slab shrinker didn't scan enough objects, we remember that for next time, and scan more objects on the next pass. The problem with this is that with (say) a huge number of GFP_NOIO direct-reclaim attempts, the number of objects which are to be scanned when we finally get a GFP_KERNEL request can be huge. Because some shrinker handlers just bail out if !__GFP_FS. So the patch clamps the number of objects-to-be-scanned to 2* the total number of objects in the slab cache. Signed-off-by: NAndrea Arcangeli <andrea@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Rik van Riel 提交于
Some users (hi Zwane) have seen a problem when running a workload that eats nearly all of physical memory - th system does an OOM kill, even when there is still a lot of swap free. The problem appears to be a very big task that is holding the swap token, and the VM has a very hard time finding any other page in the system that is swappable. Instead of ignoring the swap token when sc->priority reaches 0, we could simply take the swap token away from the memory hog and make sure we don't give it back to the memory hog for a few seconds. This patch resolves the problem Zwane ran into. Signed-off-by: NRik van Riel <riel@redhat.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 14 11月, 2005 1 次提交
-
-
由 Rohit Seth 提交于
Clean up of __alloc_pages. Restoration of previous behaviour, plus further cleanups by introducing an 'alloc_flags', removing the last of should_reclaim_zone. Signed-off-by: NRohit Seth <rohit.seth@intel.com> Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 30 10月, 2005 2 次提交
-
-
由 Hugh Dickins 提交于
Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with a many-threaded application which concurrently initializes different parts of a large anonymous area. This patch corrects that, by using a separate spinlock per page table page, to guard the page table entries in that page, instead of using the mm's single page_table_lock. (But even then, page_table_lock is still used to guard page table allocation, and anon_vma allocation.) In this implementation, the spinlock is tucked inside the struct page of the page table page: with a BUILD_BUG_ON in case it overflows - which it would in the case of 32-bit PA-RISC with spinlock debugging enabled. Splitting the lock is not quite for free: another cacheline access. Ideally, I suppose we would use split ptlock only for multi-threaded processes on multi-cpu machines; but deciding that dynamically would have its own costs. So for now enable it by config, at some number of cpus - since the Kconfig language doesn't support inequalities, let preprocessor compare that with NR_CPUS. But I don't think it's worth being user-configurable: for good testing of both split and unsplit configs, split now at 4 cpus, and perhaps change that to 8 later. There is a benefit even for singly threaded processes: kswapd can be attacking one part of the mm while another part is busy faulting. Signed-off-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
由 Lee Schermerhorn 提交于
Martin Hicks' page cache reclaim patch added the 'may_swap' flag to the scan_control struct; and modified shrink_list() not to add anon pages to the swap cache if may_swap is not asserted. Ref: http://marc.theaimsgroup.com/?l=linux-mm&m=111461480725322&w=4 However, further down, if the page is mapped, shrink_list() calls try_to_unmap() which will call try_to_unmap_one() via try_to_unmap_anon (). try_to_unmap_one() will BUG_ON() an anon page that is NOT in the swap cache. Martin says he never encountered this path in his testing, but agrees that it might happen. This patch modifies shrink_list() to skip anon pages that are not already in the swap cache when !may_swap, rather than just not adding them to the cache. Signed-off-by: NLee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 28 10月, 2005 1 次提交
-
-
由 Al Viro 提交于
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 17 10月, 2005 1 次提交
-
-
由 Linus Torvalds 提交于
As noticed by Nick Piggin, we need to make sure that we check the page count before we check for PageDirty, since the dirty check is only valid if the count implies that we're the only possible ones holding the page. We always did do this, but the code needs a read-memory-barrier to make sure that the orderign is also honored by the CPU. (The writer side is ordered due to the atomic decrement and test on the page count, see the discussion on linux-kernel) Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 13 9月, 2005 1 次提交
-
-
由 Con Kolivas 提交于
Use the pgdat pointer we've already defined in wakeup_kswapd Signed-off-by: NCon Kolivas <kernel@kolivas.org> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 08 9月, 2005 1 次提交
-
-
由 Paul Jackson 提交于
This patch makes use of the previously underutilized cpuset flag 'mem_exclusive' to provide what amounts to another layer of memory placement resolution. With this patch, there are now the following four layers of memory placement available: 1) The whole system (interrupt and GFP_ATOMIC allocations can use this), 2) The nearest enclosing mem_exclusive cpuset (GFP_KERNEL allocations can use), 3) The current tasks cpuset (GFP_USER allocations constrained to here), and 4) Specific node placement, using mbind and set_mempolicy. These nest - each layer is a subset (same or within) of the previous. Layer (2) above is new, with this patch. The call used to check whether a zone (its node, actually) is in a cpuset (in its mems_allowed, actually) is extended to take a gfp_mask argument, and its logic is extended, in the case that __GFP_HARDWALL is not set in the flag bits, to look up the cpuset hierarchy for the nearest enclosing mem_exclusive cpuset, to determine if placement is allowed. The definition of GFP_USER, which used to be identical to GFP_KERNEL, is changed to also set the __GFP_HARDWALL bit, in the previous cpuset_gfp_hardwall_flag patch. GFP_ATOMIC and GFP_KERNEL allocations will stay within the current tasks cpuset, so long as any node therein is not too tight on memory, but will escape to the larger layer, if need be. The intended use is to allow something like a batch manager to handle several jobs, each job in its own cpuset, but using common kernel memory for caches and such. Swapper and oom_kill activity is also constrained to Layer (2). A task in or below one mem_exclusive cpuset should not cause swapping on nodes in another non-overlapping mem_exclusive cpuset, nor provoke oom_killing of a task in another such cpuset. Heavy use of kernel memory for i/o caching and such by one job should not impact the memory available to jobs in other non-overlapping mem_exclusive cpusets. This patch enables providing hardwall, inescapable cpusets for memory allocations of each job, while sharing kernel memory allocations between several jobs, in an enclosing mem_exclusive cpuset. Like Dinakar's patch earlier to enable administering sched domains using the cpu_exclusive flag, this patch also provides a useful meaning to a cpuset flag that had previously done nothing much useful other than restrict what cpuset configurations were allowed. Signed-off-by: NPaul Jackson <pj@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 05 9月, 2005 1 次提交
-
-
由 Martin Hicks 提交于
Christoph Lameter and Marcelo Tosatti asked to get rid of the atomic_inc_and_test() to cleanup the atomic ops in the zone reclaim code. Signed-off-by: NMartin Hicks <mort@sgi.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@osdl.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-