- 18 7月, 2007 10 次提交
-
-
由 Christoph Lameter 提交于
Use list_for_each_entry() instead of list_for_each(). Get rid of for_all_slabs(). It had only one user. So fold it into the callback. This also gets rid of cpu_slab_flush. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
Changes the error reporting format to loosely follow lockdep. If data corruption is detected then we generate the following lines: ============================================ BUG <slab-cache>: <problem> -------------------------------------------- INFO: <more information> [possibly multiple times] <object dump> FIX <slab-cache>: <remedial action> This also adds some more intelligence to the data corruption detection. Its now capable of figuring out the start and end. Add a comment on how to configure SLUB so that a production system may continue to operate even though occasional slab corruption occur through a misbehaving kernel component. See "Emergency operations" in Documentation/vm/slub.txt. [akpm@linux-foundation.org: build fix] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rusty Russell 提交于
I can never remember what the function to register to receive VM pressure is called. I have to trace down from __alloc_pages() to find it. It's called "set_shrinker()", and it needs Your Help. 1) Don't hide struct shrinker. It contains no magic. 2) Don't allocate "struct shrinker". It's not helpful. 3) Call them "register_shrinker" and "unregister_shrinker". 4) Call the function "shrink" not "shrinker". 5) Reduce the 17 lines of waffly comments to 13, but document it properly. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: David Chinner <dgc@sgi.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andy Whitcroft 提交于
When we are out of memory of a suitable size we enter reclaim. The current reclaim algorithm targets pages in LRU order, which is great for fairness at order-0 but highly unsuitable if you desire pages at higher orders. To get pages of higher order we must shoot down a very high proportion of memory; >95% in a lot of cases. This patch set adds a lumpy reclaim algorithm to the allocator. It targets groups of pages at the specified order anchored at the end of the active and inactive lists. This encourages groups of pages at the requested orders to move from active to inactive, and active to free lists. This behaviour is only triggered out of direct reclaim when higher order pages have been requested. This patch set is particularly effective when utilised with an anti-fragmentation scheme which groups pages of similar reclaimability together. This patch set is based on Peter Zijlstra's lumpy reclaim V2 patch which forms the foundation. Credit to Mel Gorman for sanitity checking. Mel said: The patches have an application with hugepage pool resizing. When lumpy-reclaim is used used with ZONE_MOVABLE, the hugepages pool can be resized with greater reliability. Testing on a desktop machine with 2GB of RAM showed that growing the hugepage pool with ZONE_MOVABLE on it's own was very slow as the success rate was quite low. Without lumpy-reclaim, each attempt to grow the pool by 100 pages would yield 1 or 2 hugepages. With lumpy-reclaim, getting 40 to 70 hugepages on each attempt was typical. [akpm@osdl.org: ia64 pfn_to_nid fixes and loop cleanup] [bunk@stusta.de: static declarations for internal functions] [a.p.zijlstra@chello.nl: initial lumpy V2 implementation] Signed-off-by: NAndy Whitcroft <apw@shadowen.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NMel Gorman <mel@csn.ul.ie> Cc: Bob Picco <bob.picco@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
This patch adds a new parameter for sizing ZONE_MOVABLE called movablecore=. While kernelcore= is used to specify the minimum amount of memory that must be available for all allocation types, movablecore= is used to specify the minimum amount of memory that is used for migratable allocations. The amount of memory used for migratable allocations determines how large the huge page pool could be dynamically resized to at runtime for example. How movablecore is actually handled is that the total number of pages in the system is calculated and a value is set for kernelcore that is kernelcore == totalpages - movablecore Both kernelcore= and movablecore= can be safely specified at the same time. Signed-off-by: NMel Gorman <mel@csn.ul.ie> Acked-by: NAndy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
This patch adds the kernelcore= parameter for x86. Once all patches are applied, a new command-line parameter exist and a new sysctl. This patch adds the necessary documentation. From: Yasunori Goto <y-goto@jp.fujitsu.com> When "kernelcore" boot option is specified, kernel can't boot up on ia64 because of an infinite loop. In addition, the parsing code can be handled in an architecture-independent manner. This patch uses common code to handle the kernelcore= parameter. It is only available to architectures that support arch-independent zone-sizing (i.e. define CONFIG_ARCH_POPULATES_NODE_MAP). Other architectures will ignore the boot parameter. [bunk@stusta.de: make cmdline_parse_kernelcore() static] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com> Acked-by: NAndy Whitcroft <apw@shadowen.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
Huge pages are not movable so are not allocated from ZONE_MOVABLE. However, as ZONE_MOVABLE will always have pages that can be migrated or reclaimed, it can be used to satisfy hugepage allocations even when the system has been running a long time. This allows an administrator to resize the hugepage pool at runtime depending on the size of ZONE_MOVABLE. This patch adds a new sysctl called hugepages_treat_as_movable. When a non-zero value is written to it, future allocations for the huge page pool will use ZONE_MOVABLE. Despite huge pages being non-movable, we do not introduce additional external fragmentation of note as huge pages are always the largest contiguous block we care about. [akpm@linux-foundation.org: various fixes] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
The following 8 patches against 2.6.20-mm2 create a zone called ZONE_MOVABLE that is only usable by allocations that specify both __GFP_HIGHMEM and __GFP_MOVABLE. This has the effect of keeping all non-movable pages within a single memory partition while allowing movable allocations to be satisfied from either partition. The patches may be applied with the list-based anti-fragmentation patches that groups pages together based on mobility. The size of the zone is determined by a kernelcore= parameter specified at boot-time. This specifies how much memory is usable by non-movable allocations and the remainder is used for ZONE_MOVABLE. Any range of pages within ZONE_MOVABLE can be released by migrating the pages or by reclaiming. When selecting a zone to take pages from for ZONE_MOVABLE, there are two things to consider. First, only memory from the highest populated zone is used for ZONE_MOVABLE. On the x86, this is probably going to be ZONE_HIGHMEM but it would be ZONE_DMA on ppc64 or possibly ZONE_DMA32 on x86_64. Second, the amount of memory usable by the kernel will be spread evenly throughout NUMA nodes where possible. If the nodes are not of equal size, the amount of memory usable by the kernel on some nodes may be greater than others. By default, the zone is not as useful for hugetlb allocations because they are pinned and non-migratable (currently at least). A sysctl is provided that allows huge pages to be allocated from that zone. This means that the huge page pool can be resized to the size of ZONE_MOVABLE during the lifetime of the system assuming that pages are not mlocked. Despite huge pages being non-movable, we do not introduce additional external fragmentation of note as huge pages are always the largest contiguous block we care about. Credit goes to Andy Whitcroft for catching a large variety of problems during review of the patches. This patch creates an additional zone, ZONE_MOVABLE. This zone is only usable by allocations which specify both __GFP_HIGHMEM and __GFP_MOVABLE. Hot-added memory continues to be placed in their existing destination as there is no mechanism to redirect them to a specific zone. [y-goto@jp.fujitsu.com: Fix section mismatch of memory hotplug related code] [akpm@linux-foundation.org: various fixes] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: NYasunori Goto <y-goto@jp.fujitsu.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mel Gorman 提交于
It is often known at allocation time whether a page may be migrated or not. This patch adds a flag called __GFP_MOVABLE and a new mask called GFP_HIGH_MOVABLE. Allocations using the __GFP_MOVABLE can be either migrated using the page migration mechanism or reclaimed by syncing with backing storage and discarding. An API function very similar to alloc_zeroed_user_highpage() is added for __GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable(). The flags used by alloc_zeroed_user_highpage() are not changed because it would change the semantics of an existing API. After this patch is applied there are no in-kernel users of alloc_zeroed_user_highpage() so it probably should be marked deprecated if this patch is merged. Note that this patch includes a minor cleanup to the use of __GFP_ZERO in shmem.c to keep all flag modifications to inode->mapping in the shmem_dir_alloc() helper function. This clean-up suggestion is courtesy of Hugh Dickens. Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the concept. Credit to Hugh Dickens for catching issues with shmem swap vector and ramfs allocations. [akpm@linux-foundation.org: build fix] [hugh@veritas.com: __GFP_ZERO cleanup] Signed-off-by: NMel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 NeilBrown 提交于
do_generic_mapping_read currently samples the i_size at the start and doesn't do so again unless it needs to call ->readpage to load a page. After ->readpage it has to re-sample i_size as a truncate may have caused that page to be filled with zeros, and the read() call should not see these. However there are other activities that might cause ->readpage to be called on a page between the time that do_generic_mapping_read samples i_size and when it finds that it has an uptodate page. These include at least read-ahead and possibly another thread performing a read. So do_generic_mapping_read must sample i_size *after* it has an uptodate page. Thus the current sampling at the start and after a read can be replaced with a sampling before the copy-out. The same change applied to __generic_file_splice_read. Note that this fixes any race with truncate_complete_page, but does not fix a possible race with truncate_partial_page. If a partial truncate happens after do_generic_mapping_read samples i_size and before the copy_out, the nuls that truncate_partial_page place in the page could be copied out incorrectly. I think the best fix for that is to *not* zero out parts of the page in truncate_partial_page, but rather to zero out the tail of a page when increasing i_size. Signed-off-by: NNeil Brown <neilb@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Acked-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 7月, 2007 30 次提交
-
-
由 Rusty Russell 提交于
Christian Borntraeger points out that mempool_free() doesn't noop when handed NULL. This is inconsistent with the other free-like functions in the kernel. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
congestion_wait_interruptible() is no longer used. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Acked-by: NTrond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
Repair indenting bustage. Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Akinobu Mita 提交于
Limiting smaller allocation failures by fault injection helps to find real possible bugs. Because higher order allocations are likely to fail and zero-order allocations are not likely to fail. This patch adds min-order parameter to fail_page_alloc. It specifies the minimum page allocation order to be injected failures. Signed-off-by: NAkinobu Mita <akinobu.mita@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Micah Cowan 提交于
Some users have been having problems with utilities like cp or dd dumping core when they try to copy a file that's too large for the destination filesystem (typically, > 4gb). Apparently, some defunct standards required SIGXFSZ to be sent in such circumstances, but SUS only requires/allows it for when a written file exceeds the process's resource limits. I'd like to limit SIGXFSZs to the bare minimum required by SUS. Patch sent per http://lkml.org/lkml/2007/4/10/302Signed-off-by: NMicah Cowan <micahcowan@ubuntu.com> Acked-by: NAlan Cox <alan@redhat.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Stephen Rothwell 提交于
Make some offending drivers depend on it and set CONFIG_ARCH_NO_VIRT_TO_BUS for ppc64 so that we don't build those drivers. This gets PowerPC allmodconfig and allyesconfig much closer to building. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Al Viro <viro@ftp.linux.org.uk> Acked-by: NDavid Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Greg Ungerer 提交于
Be consistent with VM mmap, implement expand_stack(). We can't actually do anything other than return an error in the no MMU case though. Signed-off-by: NGreg Ungerer <gerg@uclinux.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
This is a straightforward split of do_mmap_pgoff() into two functions: - do_mmap_pgoff() checks the parameters, and calculates the vma flags. Then it calls - mmap_region(), which does the actual mapping Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
The do_loop_readv_writev implementation of readv breaks out of the loop as soon as a single read request didn't fill it's buffer: if (nr != len) break; The generic_file_aio_read version doesn't. So if it hits EOF before the end of the list of buffers, it will try again on the next buffer. If the file was extended in the mean time, this will produce a bad result. Signed-off-by: NNeil Brown <neilb@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Herbert van den Bergh 提交于
Fix a bug in mm/mlock.c on 32-bit architectures that prevents a user from locking more than 4GB of shared memory, or allocating more than 4GB of shared memory in hugepages, when rlim[RLIMIT_MEMLOCK] is set to RLIM_INFINITY. Signed-off-by: NHerbert van den Bergh <herbert.van.den.bergh@oracle.com> Acked-by: NChris Mason <chris.mason@oracle.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mundt 提交于
Currently slob is disabled if we're using sparsemem, due to an earlier patch from Goto-san. Slob and static sparsemem work without any trouble as it is, and the only hiccup is a missing slab_is_available() in the case of sparsemem extreme. With this, we're rid of the last set of restrictions for slob usage. Signed-off-by: NPaul Mundt <lethal@linux-sh.org> Acked-by: NPekka Enberg <penberg@cs.helsinki.fi> Acked-by: NMatt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dan Aloni 提交于
Signed-off-by: NDan Aloni <da-x@monatomic.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mundt 提交于
This adds preliminary NUMA support to SLOB, primarily aimed at systems with small nodes (tested all the way down to a 128kB SRAM block), whether asymmetric or otherwise. We follow the same conventions as SLAB/SLUB, preferring current node placement for new pages, or with explicit placement, if a node has been specified. Presently on UP NUMA this has the side-effect of preferring node#0 allocations (since numa_node_id() == 0, though this could be reworked if we could hand off a pfn to determine node placement), so single-CPU NUMA systems will want to place smaller nodes further out in terms of node id. Once a page has been bound to a node (via explicit node id typing), we only do block allocations from partial free pages that have a matching node id in the page flags. The current implementation does have some scalability problems, in that all partial free pages are tracked in the global freelist (with contention due to the single spinlock). However, these are things that are being reworked for SMP scalability first, while things like per-node freelists can easily be built on top of this sort of functionality once it's been added. More background can be found in: http://marc.info/?l=linux-mm&m=118117916022379&w=2 http://marc.info/?l=linux-mm&m=118170446306199&w=2 http://marc.info/?l=linux-mm&m=118187859420048&w=2 and subsequent threads. Acked-by: NChristoph Lameter <clameter@sgi.com> Acked-by: NMatt Mackall <mpm@selenic.com> Signed-off-by: NPaul Mundt <lethal@linux-sh.org> Acked-by: NNick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jason Baron 提交于
In the new madvise_need_mmap_write() call we can avoid an extra case statement and function call as follows. Signed-off-by: NJason Baron <jbaron@redhat.com> Cc: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
start_cpu_timer() should be __cpuinit (which also matches what it's callers are). __devinit didn't cause problems, it simply wasted a few bytes of memory for the common CONFIG_HOTPLUG_CPU=n case. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mundt 提交于
Currently zone_spanned_pages_in_node() and zone_absent_pages_in_node() are non-static for ARCH_POPULATES_NODE_MAP and static otherwise. However, only the non-static versions are __meminit annotated, despite only being called from __meminit functions in either case. zone_init_free_lists() is currently non-static and not __meminit annotated either, despite only being called once in the entire tree by init_currently_empty_zone(), which too is __meminit. So make it static and properly annotated. Signed-off-by: NPaul Mundt <lethal@linux-sh.org> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Beulich 提交于
This symbol got orphaned quite a while ago. Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Jan Beulich 提交于
.. which modpost started warning about. Signed-off-by: NJan Beulich <jbeulich@novell.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mundt 提交于
Enabling debugging fails to build due to the nodemask variable in do_mbind() having changed names, and then oopses on boot due to the assumption that the nodemask can be dereferenced -- which doesn't work out so well when the policy is changed to MPOL_DEFAULT with a NULL nodemask by numa_default_policy(). This fixes it up, and switches from PDprintk() to pr_debug() while we're at it. Signed-off-by: NPaul Mundt <lethal@linux-sh.org> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ethan Solomita 提交于
get_user_pages() can try to allocate a nearly unlimited amount of memory on behalf of a user process, even if that process has been OOM killed. The OOM kill occurs upon return to user space via a SIGKILL, but get_user_pages() will try allocate all its memory before returning. Change get_user_pages() to check for TIF_MEMDIE, and if set then return immediately. Signed-off-by: NEthan Solomita <solo@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Paul Mundt 提交于
This converts the default system init memory policy to use a dynamically created node map instead of defaulting to all online nodes. Nodes of a certain size (>= 16MB) are judged to be suitable for interleave, and are added to the map. If all nodes are smaller in size, the largest one is automatically selected. Without this, tiny nodes find themselves out of memory before we even make it to userspace. Systems with large nodes will notice no change. Only the system init policy is effected by this change, the regular MPOL_DEFAULT policy is still switched to later on in the boot process as normal. Signed-off-by: NPaul Mundt <lethal@linux-sh.org> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
Add a new configuration variable CONFIG_SLUB_DEBUG_ON If set then the kernel will be booted by default with slab debugging switched on. Similar to CONFIG_SLAB_DEBUG. By default slab debugging is available but must be enabled by specifying "slub_debug" as a kernel parameter. Also add support to switch off slab debugging for a kernel that was built with CONFIG_SLUB_DEBUG_ON. This works by specifying slub_debug=- as a kernel parameter. Dave Jones wanted this feature. http://marc.info/?l=linux-kernel&m=118072189913045&w=2 [akpm@linux-foundation.org: clean up switch statement] Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrew Morton 提交于
invalidate_mapping_pages() can sometimes take a long time (millions of pages to free). Long enough for the softlockup detector to trigger. We used to have a cond_resched() in there but I took it out because the drop_caches code calls invalidate_mapping_pages() under inode_lock. The patch adds a nasty flag and puts the cond_resched() back. Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Add a bugcheck for Andrea's pagefault vs invalidate race. This is triggerable for both linear and nonlinear pages with a userspace test harness (using direct IO and truncate, respectively). Signed-off-by: NNick Piggin <npiggin@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Joe Jin 提交于
That static `nid' index needs locking. Without it we can end up calling alloc_pages_node() with an illegal node ID and the kernel crashes. Acked-by: Ngurudas pai <gurudas.pai@oracle.com> Cc: <stable@kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Anderson Briglia 提交于
Fix the shrink_list name on some files under mm/ directory. Signed-off-by: NAnderson Briglia <anderson.briglia@indt.org.br> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Remove the core slob allocator's minimum alignment restrictions, and instead introduce the alignment restrictions at the slab API layer. This lets us heed the ARCH_KMALLOC/SLAB_MINALIGN directives, and also use __alignof__ (unsigned long) for the default alignment (which should allow relaxed alignment architectures to take better advantage of SLOB's small minimum alignment). Signed-off-by: NNick Piggin <npiggin@suse.de> Acked-by: NMatt Mackall <mpm@selenic.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Remove the bigblock lists in favour of using compound pages and going directly to the page allocator. Allocation size is stored in page->private, which also makes ksize more accurate than it previously was. Saves ~.5K of code, and 12-24 bytes overhead per >= PAGE_SIZE allocation. Signed-off-by: NNick Piggin <npiggin@suse.de> Acked-by: NMatt Mackall <mpm@selenic.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Nick Piggin 提交于
Improve slob by turning the freelist into a list of pages using struct page fields, then each page has a singly linked freelist of slob blocks via a pointer in the struct page. - The first benefit is that the slob freelists can be indexed by a smaller type (2 bytes, if the PAGE_SIZE is reasonable). - Next is that freeing is much quicker because it does not have to traverse the entire freelist. Allocation can be slightly faster too, because we can skip almost-full freelist pages completely. - Slob pages are then freed immediately when they become empty, rather than having a periodic timer try to free them. This gives efficiency and memory consumption improvement. Then, we don't encode seperate size and next fields into each slob block, rather we use the sign bit to distinguish between "size" or "next". Then size 1 blocks contain a "next" offset, and others contain the "size" in the first unit and "next" in the second unit. - This allows minimum slob allocation alignment to go from 8 bytes to 2 bytes on 32-bit and 12 bytes to 2 bytes on 64-bit. In practice, it is best to align them to word size, however some architectures (eg. cris) could gain space savings from turning off this extra alignment. Then, make kmalloc use its own slob_block at the front of the allocation in order to encode allocation size, rather than rely on not overwriting slob's existing header block. - This reduces kmalloc allocation overhead similarly to alignment reductions. - Decouples kmalloc layer from the slob allocator. Then, add a page flag specific to slob pages. - This means kfree of a page aligned slob block doesn't have to traverse the bigblock list. I would get benchmarks, but my test box's network doesn't come up with slob before this patch. I think something is timing out. Anyway, things are faster after the patch. Code size goes up about 1K, however dynamic memory usage _should_ be lower even on relatively small memory systems. Future todo item is to restore the cyclic free list search, rather than to always begin at the start. Signed-off-by: NNick Piggin <npiggin@suse.de> Acked-by: NMatt Mackall <mpm@selenic.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Dumazet 提交于
alloc_large_system_hash() is called at boot time to allocate space for several large hash tables. Lately, TCP hash table was changed and its bucketsize is not a power-of-two anymore. On most setups, alloc_large_system_hash() allocates one big page (order > 0) with __get_free_pages(GFP_ATOMIC, order). This single high_order page has a power-of-two size, bigger than the needed size. We can free all pages that wont be used by the hash table. On a 1GB i386 machine, this patch saves 128 KB of LOWMEM memory. TCP established hash table entries: 32768 (order: 6, 393216 bytes) Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-