- 23 March 2006, 31 commits
-
-
Submitted by Arjan van de Ven
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
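As a minimal sketch of the pattern such scripted conversions produce (the struct here is hypothetical, not from the patch): a struct semaphore that is only ever used as a mutex becomes a struct mutex, and down()/up() become mutex_lock()/mutex_unlock().

#include <linux/mutex.h>

/* Hypothetical example structure, for illustration only. */
struct frobnicator {
	/* before: struct semaphore lock;  initialised with init_MUTEX(&f->lock) */
	struct mutex lock;		/* after: a real mutex */
	int count;
};

static void frob_init(struct frobnicator *f)
{
	mutex_init(&f->lock);		/* was: init_MUTEX(&f->lock); */
	f->count = 0;
}

static void frob_update(struct frobnicator *f)
{
	mutex_lock(&f->lock);		/* was: down(&f->lock); */
	f->count++;
	mutex_unlock(&f->lock);		/* was: up(&f->lock); */
}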
-
Submitted by Ingo Molnar
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Arjan van de Ven
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Ingo Molnar
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Ingo Molnar
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Acked-by: Robert Love <rml@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Ingo Molnar
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Kara <jack@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Arjan van de Ven
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Kyle McMartin
Seems like needless clutter having a bunch of #if defined(CONFIG_$ARCH) in include/linux/cache.h. Move the per-architecture section definition to asm/cache.h, and keep the if-not-defined dummy case in linux/cache.h to catch architectures which don't implement the section. Verified that symbols still go in .data.read_mostly on parisc, and the compile doesn't break. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
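For illustration, the split described above amounts to roughly the following (a sketch of the mechanism, not a verbatim copy of the patch):

/* asm/cache.h on architectures that support it: place the symbol in a
 * dedicated section so read-mostly data shares cache lines. */
#define __read_mostly __attribute__((__section__(".data.read_mostly")))

/* include/linux/cache.h keeps the if-not-defined dummy case, so the
 * annotation compiles away on architectures without the section. */
#ifndef __read_mostly
#define __read_mostly
#endif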
-
Submitted by Christoph Hellwig
Since early 2.4.x all cdrom drivers implement the block_device methods themselves, so they can handle additional ioctls directly instead of going through the cdrom layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Eric Dumazet
1) Reduce the size of (struct fdtable) to exactly 64 bytes on 32-bit platforms, lowering kmalloc() allocated space by 50%.
2) Reduce the size of (files_struct), using a special 32-bit (or 64-bit) embedded_fd_set instead of a 1024-bit fd_set for the close_on_exec_init and open_fds_init fields. This saves some RAM (248 bytes per task) as most tasks don't open more than 32 files. D-cache footprint for such tasks is also reduced to the minimum.
3) Reduce the size of the allocated fdset. Currently two full pages are allocated, that is 32768 bits on x86 for example, and way too much. The minimum is now L1_CACHE_BYTES.
UP and SMP should benefit from this patch, because most tasks will touch only one cache line when they open()/close() stdin/stdout/stderr (0/1/2), with next_fd, close_on_exec_init, open_fds_init and fd_array[0 .. 2] being in the same cache line. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
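A sketch of the embedded set described in point 2, assuming the layout the message implies (one machine word in place of a full fd_set):

/* One machine word of close_on_exec/open_fds bits: 32 fds on 32-bit,
 * 64 fds on 64-bit, enough for most tasks without any extra allocation. */
struct embedded_fd_set {
	unsigned long fds_bits[1];
};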
-
Submitted by Andrew Morton
Linus points out that ext3_readdir's readahead only cuts in when ext3_readdir() is operating at the very start of the directory. So for large directories we end up performing no readahead at all and we suck. So take it all out and use the core VM's page_cache_readahead(). This means that ext3 directory reads will use all of readahead's dynamic sizing goop. Note that we're using the directory's filp->f_ra to hold the readahead state, but readahead is actually being performed against the underlying blockdev's address_space. Fortunately the readahead code is all set up to handle this. Tested with printk. It works. I was struggling to find a real workload which actually cared. (The patch also exports page_cache_readahead() to GPL modules.) Cc: "Stephen C. Tweedie" <sct@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Rafael J. Wysocki
It is unsafe to suspend devices if the hardware is controlled by X. Add an extra check to prevent this from happening. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Randy Dunlap
Move externs from C source files to header files. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Rafael J. Wysocki
Introduce the low-level interface that can be used for handling the snapshot of the system memory by the in-kernel swap-writing/reading code of swsusp and the userland interface code (to be introduced shortly). Also change the way in which swsusp records the allocated swap pages and, consequently, simplify the in-kernel swap-writing/reading code (this is necessary for the userland interface too). To this end, it introduces two helper functions in mm/swapfile.c, so that the swsusp code does not refer directly to the swap internals. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Markus Gutschke
Gcc reserves %ebx when compiling position-independent code on i386. This means the _syscallX() macros in include/asm-i386/unistd.h will not compile. This patch changes the existing macros to take special care to preserve %ebx. The bug can be tracked at http://bugzilla.kernel.org/show_bug.cgi?id=6204 Signed-off-by: Markus Gutschke <markus@google.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
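A hedged sketch of the approach for the one-argument case: save and restore %ebx around the int $0x80 so gcc's PIC register survives. The real macros also run the result through the usual errno conversion, omitted here.

/* Sketch only; __NR_##name comes from asm/unistd.h. */
#define _syscall1_pic_sketch(type, name, type1, arg1)			\
type name(type1 arg1)							\
{									\
	long __res;							\
	__asm__ volatile ("push %%ebx ; movl %2,%%ebx ; int $0x80 ; pop %%ebx" \
		: "=a" (__res)						\
		: "0" (__NR_##name), "ri" ((long)(arg1))		\
		: "memory");						\
	return (type)__res;						\
}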
-
Submitted by Chuck Ebbert
_raw_spin_lock_flags() is entered with interrupts disabled. If it cannot obtain a spinlock, it checks the flags that were passed and re-enables interrupts before spinning if that's how the flags are set. When the spinlock might be available, it disables interrupts (even if they are already disabled) before trying to get the lock. Change that so interrupts are only disabled if they have been enabled. This costs nine bytes of duplicated spinloop code.
Fastpath before patch:
	jle <keep looping>	not-taken conditional jump
	cli			disable interrupts
	jmp <try for lock>	unconditional jump
Fastpath after patch, if interrupts were not enabled:
	jg <try for lock>	taken conditional branch
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
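Purely as an illustration of the control flow (the real routine is hand-written i386 assembly; the lock and irq helpers below are stand-ins, and 0x200 is the EFLAGS IF bit):

/* Userspace-compilable sketch of the logic only. */
struct sketch_lock { volatile int slock; };	/* 1 = free, 0 = held */

static int sketch_trylock(struct sketch_lock *l)
{
	return __sync_lock_test_and_set(&l->slock, 0);	/* old value; 1 means we got it */
}

static void sketch_irq_enable(void)  { /* stands in for local_irq_enable()  */ }
static void sketch_irq_disable(void) { /* stands in for local_irq_disable() */ }

static void spin_lock_flags_sketch(struct sketch_lock *l, unsigned long flags)
{
	while (!sketch_trylock(l)) {
		if (flags & 0x200) {		/* caller's EFLAGS had IF set */
			sketch_irq_enable();	/* spin with interrupts enabled */
			while (!l->slock)
				;		/* cpu_relax() in the kernel */
			sketch_irq_disable();	/* re-disable only because we enabled */
		} else {
			while (!l->slock)	/* IF was already off: leave it alone */
				;
		}
	}
}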
-
Submitted by Jesper Juhl
arch/i386/kernel/cpu/centaur.c: In function `centaur_mcr_insert':
arch/i386/kernel/cpu/centaur.c:33: warning: implicit declaration of function `mtrr_centaur_report_mcr'
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Jan Beulich
> commit 76381fee
> Author: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk>
> Date: Thu Jun 23 00:08:46 2005 -0700
>
> [PATCH] xen: x86_64: use more usermode macro
>
> Make use of the user_mode macro where it's possible. This is useful for Xen
> because it will need only to redefine only the macro to a hypervisor call.
I am of the opinion that the above changeset is incomplete, i.e. it missed converting some previous uses of user_mode to user_mode_vm. While most of them could be considered just cosmetic, at least the one in die_nmi doesn't appear to be. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Vincent Hanquez <vincent.hanquez@cl.cam.ac.uk> Cc: Zachary Amsden <zach@vmware.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Jan Beulich
Registering a callback handler through register_die_notifier() is obviously primarily intended for use by modules. However, the way these currently get called makes it basically impossible for them to actually be used by modules, as there is, on non-PAE configurations, a good chance (the larger the module, the better) for the system to crash as a result. This is because the callback gets invoked (a) in the page fault path before the top level page table propagation gets carried out (hence a fault to propagate the top level page table entry/entries mapping the module's code/data would nest infinitely) and (b) in the NMI path, where nested faults must absolutely not happen, since otherwise the IRET from the nested fault re-enables NMIs, potentially resulting in nested NMI occurrences. Besides the modular aspect, similar problems would even arise for in-kernel consumers of the API if they touched ioremap()ed or vmalloc()ed memory inside their handlers. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Stas Sergeev
The history is that -mm kernels have not worked for me for a few months now. Things started with crashes somewhere after starting init, and for the last month - no boot at all, just "Uncompressing... OK, booting kernel", and silence. The early console didn't work either. With the latest releases this degraded into an infinite stream of "Unknown interrupt or fault" messages. So today my patience ran out and I started to think about how I could collect at least some info for the bug report. Attached is the patch that allows gathering some valuable debug info on the problem by making the early console more usable. I can't properly test the patch, as the kernel still doesn't boot, so I'll explain it in detail in the hope that someone else can justify the intrusive changes.
arch_hooks.h: added prototypes for setup_early_printk() and early_printk().
setup.c: killed the wrong setup_early_printk() prototype. Moved setup_early_printk() a bit earlier, as it was not "early enough" to cover the bug I was fighting with.
early_printk.c: made it start printing from the bottom of the screen, otherwise the messages interfere with the ones of the boot loader, so you can't read them.
Signed-off-by: Stas Sergeev <stsp@aknet.ru> Cc: Andi Kleen <ak@muc.de> Cc: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Chris Wright
mp_bus_id_to_pci_bus is declared identically twice. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
ES7000 platform code cleanup for compilation errors and a warning. Ifdef'd the ACPI-related parts in the ES7000 platform code. They were causing compile errors in certain configurations (without ACPI defined). I think this approach would be best (as opposed to Kconfig changes) since it only touches the subarch... Signed-off-by: <Natalie.Protasevich@unisys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Eric W. Biederman
In some code I am developing I had occasion to change the type of a variable. This made the value put_user was putting to user space wrong. But the code continued to build cleanly without errors. Introducing a temporary fixes this problem and, at least with gcc-3.3.5, does not cause gcc any problems with optimizing out the temporary. gcc-4.x, using SSA internally, ought to be even better at optimizing out temporaries, so I don't expect a temporary to become a problem. Especially because in all correct cases the types on both sides of the assignment to the temporary are the same. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
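A simplified sketch of the idea (the real i386 put_user() is considerably more involved): assigning x to a temporary of the pointer's target type forces the conversion up front, so a type change in the caller is converted correctly instead of silently storing the wrong width.

/* Sketch only; __put_user() is the usual kernel uaccess primitive. */
#define put_user_sketch(x, ptr)					\
({								\
	__typeof__(*(ptr)) __pu_val = (x);	/* temporary pins the type */ \
	__put_user(__pu_val, (ptr));				\
})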
-
Submitted by Gerd Hoffmann
Implement SMP alternatives, i.e. switching at runtime between different code versions for UP and SMP. The code can patch both SMP->UP and UP->SMP. The UP->SMP case is useful for CPU hotplug. With CONFIG_CPU_HOTPLUG enabled the code switches to UP at boot time and when the number of CPUs goes down to 1, and switches to SMP when the number of CPUs goes up to 2. Without CONFIG_CPU_HOTPLUG or on non-SMP-capable systems the code is patched once at boot time (if needed) and the tables are released afterwards. The changes in detail:
* The current alternatives bits are moved to a separate file; the SMP alternatives code is added there.
* The patch adds some new ELF sections to the kernel:
  .smp_altinstructions: like .altinstructions, also contains a list of alt_instr structs.
  .smp_altinstr_replacement: like .altinstr_replacement, but also has some space to save the original instruction before replacing it.
  .smp_locks: list of pointers to lock prefixes which can be nop'ed out on UP.
  The first two are used to replace more complex instruction sequences such as spinlocks and semaphores. It would be possible to deal with the lock prefixes with that as well, but by handling them as a special case the table sizes become much smaller.
* The sections are page-aligned and padded up to page size, so they can be freed if they are not needed.
* Split the code to release init pages into a separate function and use it to release the ELF sections if they are unused.
Signed-off-by: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
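Two small sketches of the lock-prefix part of the mechanism, based on the description above (not verbatim from the patch): each locked instruction records the address of its lock prefix in .smp_locks, and a patching pass can overwrite those bytes with NOPs when only one CPU is online.

/* Every locked instruction emits a pointer to its lock prefix into the
 * .smp_locks section. */
#define LOCK_PREFIX_SKETCH				\
	".section .smp_locks,\"a\"\n"			\
	"  .align 4\n"					\
	"  .long 661f\n"	/* address of the prefix */ \
	".previous\n"					\
	"661:\n\tlock; "

/* Sketch of the UP patching pass: replace each recorded 0xf0 lock
 * prefix byte with a one-byte NOP (and back again when going SMP). */
static void smp_unlock_sketch(unsigned char **start, unsigned char **end)
{
	unsigned char **ptr;

	for (ptr = start; ptr < end; ptr++)
		**ptr = 0x90;	/* NOP */
}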
-
Submitted by NeilBrown
Both R1BIO_Barrier and R1BIO_Returned are 4 !!!! This means that barrier requests don't get returned (i.e. b_endio called) because it looks like they already have been. Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Dmitry Mishin
This patch moves the {ip,ip6,arp}t_entry_{match,target} definitions to x_tables.h. This move simplifies code and future compatibility fixes. Signed-off-by: Dmitry Mishin <dim@openvz.org> Acked-off-by: Kirill Korotaev <dev@openvz.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Pablo Neira Ayuso
x_tables matches and targets that require nf_conntrack_ipv[4|6] to work don't have enough information to load these modules on demand. This patch introduces the following changes to solve this issue:
o nf_ct_l3proto_try_module_get: try to load the layer 3 connection tracker module and increase the refcount.
o nf_ct_l3proto_module_put: drop the refcount of the module.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Pablo Neira Ayuso
Set the family field in the xt_[matches|targets] registered. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Submitted by Lennert Buytenhek
Patch from Lennert Buytenhek. The original version of the chip docs had the PW and SU fields in the slowport write timing control register accidentally reversed. This is mentioned in the errata (documentation change #4) and fixed in newer docs. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Submitted by Lennert Buytenhek
Patch from Lennert Buytenhek. On the IXDP2x00s, the NPU that is PCI master is always the egress (i.e. 'master') NPU. At least on the IXDP2800, both NPUs have flash, so the ixp2000_has_flash() check in ixdp2x00_master_npu() is useless. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Submitted by Lennert Buytenhek
Patch from Lennert Buytenhek. The XScale UART in the ixp2000 is basically just an 8250 UART (with some extra bits and pieces), so we can use the generic 8250 debug macros on the ixp2000. Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 22 March 2006, 9 commits
-
-
Submitted by Christoph Lameter
Centralize the page migration functions in anticipation of additional tinkering. Creates a new file mm/migrate.c.
1. Extract buffer_migrate_page() from fs/buffer.c
2. Extract central migration code from vmscan.c
3. Extract some components from mempolicy.c
4. Export pageout() and remove_from_swap() from vmscan.c
5. Make it possible to configure NUMA systems without page migration and non-NUMA systems with page migration.
I had to do some #ifdeffing in mempolicy.c that may need a cleanup. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by David Gibson
Quite a long time back, prepare_hugepage_range() replaced is_aligned_hugepage_range() as the callback from mm/mmap.c to arch code to verify if an address range is suitable for a hugepage mapping. is_aligned_hugepage_range() stuck around, but only to implement prepare_hugepage_range() on archs which didn't implement their own. Most archs (everything except ia64 and powerpc) used the same implementation of is_aligned_hugepage_range(). On powerpc, which implements its own prepare_hugepage_range(), the custom version was never used. In addition, "is_aligned_hugepage_range()" was a bad name, because it suggests it returns true iff the given range is a good hugepage range, whereas in fact it returns 0-or-error (so the sense is reversed). This patch cleans up by abolishing is_aligned_hugepage_range(). Instead prepare_hugepage_range() is defined directly. Most archs use the default version, which simply checks that the given region is aligned to the size of a hugepage. ia64 and powerpc define custom versions. The ia64 one simply checks that the range is in the correct address space region in addition to being suitably aligned. The powerpc version (just as previously) checks for suitable addresses, and if necessary performs low-level MMU frobbing to set up new areas for use by hugepages. No libhugetlbfs testsuite regressions on ppc64 (POWER5 LPAR). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
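The generic version is essentially just the alignment check; a sketch of what the common-case definition amounts to (ia64 and powerpc supply their own, and HPAGE_MASK comes from the architecture headers):

/* Default prepare_hugepage_range(): the range only has to be aligned to
 * the hugepage size. */
static inline int prepare_hugepage_range(unsigned long addr, unsigned long len)
{
	if (len & ~HPAGE_MASK)
		return -EINVAL;
	if (addr & ~HPAGE_MASK)
		return -EINVAL;
	return 0;
}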
-
Submitted by David Gibson
The optional hugepage callback, hugetlb_free_pgd_range(), is presently implemented non-trivially only on ia64 (but I plan to add one for powerpc shortly). It has its own prototype for the function in asm-ia64/pgtable.h. However, since the function is called from generic code, it makes sense for its prototype to be in the generic hugetlb.h header file, as the prototypes of the other arch callbacks already are (prepare_hugepage_range(), set_huge_pte_at(), etc.). This patch makes it so. Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by David Gibson
free_pgtables() has special logic to call hugetlb_free_pgd_range() instead of the normal free_pgd_range() on hugepage VMAs. However, the test it uses to do so is incorrect: it calls is_hugepage_only_range() on a hugepage-sized range at the start of the vma. is_hugepage_only_range() will return true if the given range has any intersection with a hugepage address region, and in this case the given region need not be hugepage-aligned. So, for example, this test can return true if called on, say, a 4k VMA immediately preceding a (nicely aligned) hugepage VMA. At present we get away with this because the powerpc version of hugetlb_free_pgd_range() is just a call to free_pgd_range(). On ia64 (the only other arch with a non-trivial is_hugepage_only_range()) we get away with it for a different reason; the hugepage area is not contiguous with the rest of the user address space, and VMAs are not permitted in between, so the test can't return a false positive there. Nonetheless this should be fixed. We do that in the patch below by replacing the is_hugepage_only_range() test with an explicit test of the VMA using is_vm_hugetlb_page(). This in turn changes behaviour for platforms where is_hugepage_only_range() always returns false (everything except powerpc and ia64). We address this by ensuring that hugetlb_free_pgd_range() is defined to be identical to free_pgd_range() (instead of a no-op) on everything except ia64. Even so, it will prevent some otherwise possible coalescing of calls down to free_pgd_range(). Since this only happens for hugepage VMAs, removing this small optimization seems unlikely to cause any trouble. This patch causes no regressions on the libhugetlbfs testsuite: ppc64 POWER5 (8-way), ppc64 G5 (2-way) and i386 Pentium M (UP). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
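A sketch of the corrected per-VMA dispatch described above (argument handling abbreviated relative to the real free_pgtables()): the decision is made with is_vm_hugetlb_page(), not by probing the address range.

static void free_vma_pgtables_sketch(struct mmu_gather **tlb,
				     struct vm_area_struct *vma,
				     unsigned long floor, unsigned long ceiling)
{
	if (is_vm_hugetlb_page(vma))
		hugetlb_free_pgd_range(tlb, vma->vm_start, vma->vm_end,
				       floor, ceiling);
	else
		free_pgd_range(tlb, vma->vm_start, vma->vm_end,
			       floor, ceiling);
}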
-
Submitted by David Gibson
Originally, mm/hugetlb.c just handled the hugepage physical allocation path and its {alloc,free}_huge_page() functions were used from the arch-specific hugepage code. These days those functions are only used within mm/hugetlb.c itself. Therefore, this patch makes them static and removes their prototypes from hugetlb.h. This requires a small rearrangement of code in mm/hugetlb.c to avoid a forward declaration. This patch causes no regressions on the libhugetlbfs testsuite (ppc64, POWER5). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by David Gibson
These days, hugepages are demand-allocated at first fault time. There's a somewhat dubious (and racy) heuristic when making a new mmap() to check if there are enough available hugepages to fully satisfy that mapping. A particularly obvious case where the heuristic breaks down is where a process maps its hugepages not as a single chunk, but as a bunch of individually mmap()ed (or shmat()ed) blocks without touching and instantiating the pages in between allocations. In this case the size of each block is compared against the total number of available hugepages. It's thus easy for the process to become overcommitted, because each block mapping will succeed, although the total number of hugepages required by all blocks exceeds the number available. In particular, this defeats a program which detects a mapping failure and adjusts its hugepage usage downward accordingly. The patch below addresses this problem, by strictly reserving a number of physical hugepages for hugepage inodes which have been mapped, but not instantiated. MAP_SHARED mappings are thus "safe" - they will fail on mmap(), not later with an OOM SIGKILL. MAP_PRIVATE mappings can still trigger an OOM. (Actually SHARED mappings can technically still OOM, but only if the sysadmin explicitly reduces the hugepage pool between mapping and instantiation.) This patch appears to address the problem at hand - it allows DB2 to start correctly, for instance, which previously suffered the failure described above. This patch causes no regressions on the libhugetlbfs testsuite, and makes a test (designed to catch this problem) pass which previously failed (ppc64, POWER5). Signed-off-by: David Gibson <dwg@au1.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
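A very rough sketch of the accounting idea, with illustrative names rather than the exact mm/hugetlb.c interface: track pages promised to mapped-but-uninstantiated shared mappings and refuse the mmap() when the pool cannot cover the new reservation.

static unsigned long free_huge_pages_sketch;	 /* pages currently in the pool */
static unsigned long reserved_huge_pages_sketch; /* promised but not yet faulted in */

static int hugetlb_reserve_pages_sketch(unsigned long npages)
{
	if (free_huge_pages_sketch < reserved_huge_pages_sketch + npages)
		return -ENOMEM;		/* fail at mmap() time, not with an OOM later */
	reserved_huge_pages_sketch += npages;
	return 0;
}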
-
Submitted by Zhang, Yanmin
2.6.16-rc3 uses hugetlb on-demand paging, but it doesn't support hugetlb mprotect. From: David Gibson <david@gibson.dropbear.id.au> Remove a test from the mprotect() path which checks that the mprotect()ed range on a hugepage VMA is hugepage-aligned (yes, really, the sense of is_aligned_hugepage_range() is the opposite of what you'd guess :-/). In fact, we don't need this test. If the given addresses match the beginning/end of a hugepage VMA they must already be suitably aligned. If they don't, then mprotect_fixup() will attempt to split the VMA. The very first test in split_vma() will check for a badly aligned address on a hugepage VMA and return -EINVAL if necessary. From: "Chen, Kenneth W" <kenneth.w.chen@intel.com> On i386 and x86-64, pte flag _PAGE_PSE collides with _PAGE_PROTNONE. The identity of a hugetlb pte is lost when changing page protection via mprotect. A page fault that occurs later will trigger a bug check in huge_pte_alloc(). The fix is to always make the new pte a hugetlb pte and also to clean up legacy code where _PAGE_PRESENT is forced on in the pre-faulting days. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Nick Piggin
Optimise the page_count() compound page test and make it consistent with similar functions. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Submitted by Nick Piggin
set_page_count usage outside mm/ is limited to setting the refcount to 1. Remove set_page_count from outside mm/, and replace those users with init_page_count() and set_page_refcounted(). This allows more debug checking, and tighter control on how code is allowed to play around with page->_count. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
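A minimal sketch of the new helper's intent: the only thing code outside mm/ needs is "set the refcount of a fresh page to one", so that becomes its own function and the raw setter can stay private to mm/.

/* Sketch: equivalent to set_page_count(page, 1), but expresses the only
 * legitimate use outside mm/. */
static inline void init_page_count(struct page *page)
{
	set_page_count(page, 1);
}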
-