- 09 5月, 2007 21 次提交
-
-
由 Andrew Morton 提交于
console.name[] is eight chars, but so is "earlyvga". So when we try to print console->name when using earlyvga it runs off the end of the string. Make it bigger. Diagnosed-by: NGerd Hoffmann <kraxel@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rusty Russell 提交于
lguest uses the convenient futex infrastructure for inter-domain I/O, so expose get_futex_key, get_key_refs (renamed get_futex_key_refs) and drop_key_refs (renamed drop_futex_key_refs). Also means we need to expose the union that these use. No code changes. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Klaus Kudielka 提交于
Switch from private uclong, etc over to standard types. Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Klaus Kudielka 提交于
At least on x86_64 the present cyclades.h is broken due to the wrong size of uclong. This affects, of course, both the kernel and the user-level utilities. The symptom is that cyzload refuses to load the firmware. I also managed to freeze the machine when unloading the module. The patch below fixes this in an architecture-independent way. I have tested it with 2.6.19 and the driver works fine again with a Cyclades-Z on an Athlon 64 X2. [akpm@linux-foundation.org: fix warnings] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
Kprobes doesn't scribble the kprobe.symbol_name field. Its only set by the module when registering the probe. Modules that exercise good hygiene using the "const" qualifier will see warnings... warning: assignment discards qualifiers from pointer target type Make struct kprobe.symbol_name const char * Signed-off-by: NAnanth N Mavinakayanahalli <ananth@in.ibm.com> Signed-off-by: NJim Keniston <jkenisto@us.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Eric Dumazet 提交于
1) Introduces a new method in 'struct dentry_operations'. This method called d_dname() might be called from d_path() to build a pathname for special filesystems. It is called without locks. Future patches (if we succeed in having one common dentry for all pipes/sockets) may need to change prototype of this method, but we now use : char *d_dname(struct dentry *dentry, char *buffer, int buflen); 2) Adds a dynamic_dname() helper function that eases d_dname() implementations 3) Defines d_dname method for sockets : No more sprintf() at socket creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... 4) Defines d_dname method for pipes : No more sprintf() at pipe creation. This is delayed up to the moment someone does an access to /proc/pid/fd/... A benchmark consisting of 1.000.000 calls to pipe()/close()/close() gives a *nice* speedup on my Pentium(M) 1.6 Ghz : 3.090 s instead of 3.450 s Signed-off-by: NEric Dumazet <dada1@cosmosbay.com> Acked-by: NChristoph Hellwig <hch@infradead.org> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
proc_lookup remove_proc_entry =========== ================= lock_kernel(); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] spin_unlock(&proc_subdir_lock); spin_lock(&proc_subdir_lock); [find PDE with refcount 0] [check refcount and free PDE] spin_unlock(&proc_subdir_lock); proc_get_inode: de_get(de); /* boom */ Signed-off-by: NAlexey Dobriyan <adobriyan@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Miklos Szeredi 提交于
There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: NMiklos Szeredi <mszeredi@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Brownell 提交于
PNP now initializes device dma masks, which prevents oopses when generic dma calls are made using pnp device nodes. This assumes PNP only uses ISA DMA, with 24 bit addresses; and that it's safe to init those masks for all devices (rather than finding out which devices have been assigned DMA channels, and handling only those). Signed-off-by: NDavid Brownell <dbrownell@users.sourceforge.net> Cc: Adam Belay <abelay@novell.com> Cc: Jaroslav Kysela <perex@suse.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Badari Pulavarty 提交于
sys_clone() and sys_unshare() both makes copies of nsproxy and its associated namespaces. But they have different code paths. This patch merges all the nsproxy and its associated namespace copy/clone handling (as much as possible). Posted on container list earlier for feedback. - Create a new nsproxy and its associated namespaces and pass it back to caller to attach it to right process. - Changed all copy_*_ns() routines to return a new copy of namespace instead of attaching it to task->nsproxy. - Moved the CAP_SYS_ADMIN checks out of copy_*_ns() routines. - Removed unnessary !ns checks from copy_*_ns() and added BUG_ON() just incase. - Get rid of all individual unshare_*_ns() routines and make use of copy_*_ns() instead. [akpm@osdl.org: cleanups, warning fix] [clg@fr.ibm.com: remove dup_namespaces() declaration] [serue@us.ibm.com: fix CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) retval] [akpm@linux-foundation.org: fix build with CONFIG_SYSVIPC=n] Signed-off-by: NBadari Pulavarty <pbadari@us.ibm.com> Signed-off-by: NSerge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <containers@lists.osdl.org> Signed-off-by: NCedric Le Goater <clg@fr.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Monakhov Dmitriy 提交于
This could help to find buggy drivers where request_irq return value wasn't checked. There's just no reason to ignore errors which can and do occur. Anyone who got warning during compilation have to realise what it is't realy safe code. Signed-off-by: NMonakhov Dmitriy <dmonakhov@openvz.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Alexey Dobriyan 提交于
This makes in-core superblock fit into one cacheline here. Before: struct dentry * xattr_root; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ struct rw_semaphore xattr_dir_sem; /* 128 12 */ int j_errno; /* 140 4 */ }; /* size: 144, cachelines: 2 */ /* sum members: 142, holes: 1, sum holes: 2 */ /* last cacheline: 16 bytes */ After: int j_errno; /* 124 4 */ /* --- cacheline 1 boundary (128 bytes) --- */ }; /* size: 128, cachelines: 1 */ /* sum members: 126, holes: 1, sum holes: 2 */ Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Schmidt 提交于
It is sometimes useful to compile individual drivers with optimization disabled for easier debugging. Currently drivers which use htonl() and similar functions don't compile with -O0. This patch fixes it. It also removes obsolete and misleading comments. This header is not for userspace, so we don't have to care about strange programs these comments mention. (akpm: -O0 probably isn't a good idea, but this code looks pretty crufty and unuseful) Signed-off-by: NMichal Schmidt <mschmidt@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Adrian Bunk 提交于
Add a proper protype for prepare_namespace() in include/linux/init.h. Signed-off-by: NAdrian Bunk <bunk@stusta.de> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Chris Snook 提交于
Add SEEK_MAX and use it to validate lseek arguments from userspace. Signed-off-by: NChris Snook <csnook@redhat.com> Acked-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Trent Piepho 提交于
Constant folding does not work for the swabXX() byte swapping functions, and the C versions optimize poorly. Attempting to initialize a global variable to swab16(0x1234) or put something like "case swab32(42):" in a switch statement will not compile. It can work, swab.h just isn't doing it correctly. This patch fixes that. Contrary to the comment in asm-i386/byteorder.h, gcc does not recognize the "C" version of swab16 and turn it into efficient code. gcc can do this, just not with the current code. The simple function: u16 foo(u16 x) { return swab16(x); } Would compile to: movzwl %ax, %eax movl %eax, %edx shrl $8, %eax sall $8, %edx orl %eax, %edx With this patch, it will compile to: rolw $8, %ax I also attempted to document the maze different macros/inline functions that are used to create the final product. Signed-off-by: NTrent Piepho <xyzzy@speakeasy.org> Cc: Francois-Rene Rideau <fare@tunes.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 William Cohen 提交于
This past week I was playing around with that pahole tool (http://oops.ghostprotocols.net:81/acme/dwarves/) and looking at the size of various struct in the kernel. I was surprised by the size of the task_struct on x86_64, approaching 4K. I looked through the fields in task_struct and found that a number of them were declared as "unsigned long" rather than "unsigned int" despite them appearing okay as 32-bit sized fields. On x86_64 "unsigned long" ends up being 8 bytes in size and forces 8 byte alignment. Is there a reason there a reason they are "unsigned long"? The patch below drops the size of the struct from 3808 bytes (60 64-byte cachelines) to 3760 bytes (59 64-byte cachelines). A couple other fields in the task struct take a signficant amount of space: struct thread_struct thread; 688 struct held_lock held_locks[30]; 1680 CONFIG_LOCKDEP is turned on in the .config [akpm@linux-foundation.org: fix printk warnings] Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Hellwig 提交于
Simplify the stacktrace code: - remove the unused task argument to save_stack_trace, it's always current - remove the all_contexts flag, it's alwasy 0 Signed-off-by: NChristoph Hellwig <hch@lst.de> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: NIngo Molnar <mingo@elte.hu> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Guillaume Chazarain 提交于
Cleanup: setting an outstanding error on a mapping was open coded too many times. Factor it out in mapping_set_error(). Signed-off-by: NGuillaume Chazarain <guichaz@yahoo.fr> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Dmitriy Monakhov 提交于
[akpm@linux-foundation.org: cleanup] Signed-off-by: NMonakhov Dmitriy <dmonakhov@openvz.org> Cc: Christoph Hellwig <hch@lst.de> Acked-by: NAnton Altaparmakov <aia21@cam.ac.uk> Acked-by: NDavid Chinner <dgc@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Woodhouse 提交于
There are two problems with the existing redzone implementation. Firstly, it's causing misalignment of structures which contain a 64-bit integer, such as netfilter's 'struct ipt_entry' -- causing netfilter modules to fail to load because of the misalignment. (In particular, the first check in net/ipv4/netfilter/ip_tables.c::check_entry_size_and_hooks()) On ppc32 and sparc32, amongst others, __alignof__(uint64_t) == 8. With slab debugging, we use 32-bit redzones. And allocated slab objects aren't sufficiently aligned to hold a structure containing a uint64_t. By _just_ setting ARCH_KMALLOC_MINALIGN to __alignof__(u64) we'd disable redzone checks on those architectures. By using 64-bit redzones we avoid that loss of debugging, and also fix the other problem while we're at it. When investigating this, I noticed that on 64-bit platforms we're using a 32-bit value of RED_ACTIVE/RED_INACTIVE in the 64-bit memory location set aside for the redzone. Which means that the four bytes immediately before or after the allocated object at 0x00,0x00,0x00,0x00 for LE and BE machines, respectively. Which is probably not the most useful choice of poison value. One way to fix both of those at once is just to switch to 64-bit redzones in all cases. Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org> Acked-by: NPekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <clameter@engr.sgi.com> Acked-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 08 5月, 2007 19 次提交
-
-
Improve checking and diagnostics for broadcast and multicast Ethernet MAC addresses, and distinguish between those cases in output; also make sure the device is assigned a MAC address valid only locally to avoid collisions. Signed-off-by: NPaolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: NJeff Dike <jdike@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rusty Russell 提交于
We can use a gcc extension to ensure that ARRAY_SIZE() is handed an array, not a pointer. This is especially important when code is changed from a fixed array to a pointer. I assume the Intel compiler doesn't support __builtin_types_compatible_p. [jdike@addtoit.com: uml: update UML definition of ARRAY_SIZE] Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NJeff Dike <jdike@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
Move the definition of PAGES_FOR_IO to kernel/power/power.h and introduce SPARE_PAGES representing the number of pages that should be freed by the swsusp's memory shrinker in addition to PAGES_FOR_IO so that device drivers can allocate some memory (up to 1 MB total) in their .suspend() routines without causing the suspend to fail. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Acked-by: NPavel Machek <pavel@ucw.cz> Cc: Nigel Cunningham <nigel@nigel.suspend2.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Johannes Berg 提交于
Remove software_suspend() and all its users since pm_suspend(PM_SUSPEND_DISK) should be equivalent and there's no point in having two interfaces for the same thing. The patch also changes the valid_state function to return 0 (false) for PM_SUSPEND_DISK when SOFTWARE_SUSPEND is not configured instead of accepting it and having the whole thing fail later. Signed-off-by: NJohannes Berg <johannes@sipsolutions.net> Acked-by: N"Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
Remove the two page flags that were previously used by swsusp and are no longer needed. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
Make swsusp use memory bitmaps instead of page flags for marking 'nosave' and free pages. This allows us to 'recycle' two page flags that can be used for other purposes. Also, the memory needed to store the bitmaps is allocated when necessary (ie. before the suspend) and freed after the resume which is more reasonable. The patch is designed to minimize the amount of changes and there are some nice simplifications and optimizations possible on top of it. I am going to implement them separately in the future. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Rafael J. Wysocki 提交于
Replace direct invocations of SetPageNosave(), SetPageNosaveFree() etc. with calls to inline functions that can be changed in subsequent patches without modifying the code calling them. Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl> Acked-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bryan Wu 提交于
This patch implements the driver necessary use the Analog Devices Blackfin processor's Serial Port. Signed-off-by: NBryan Wu <bryan.wu@analog.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Russell King <rmk+lkml@arm.linux.org.uk> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Bryan Wu 提交于
This adds support for the Analog Devices Blackfin processor architecture, and currently supports the BF533, BF532, BF531, BF537, BF536, BF534, and BF561 (Dual Core) devices, with a variety of development platforms including those avaliable from Analog Devices (BF533-EZKit, BF533-STAMP, BF537-STAMP, BF561-EZKIT), and Bluetechnix! Tinyboards. The Blackfin architecture was jointly developed by Intel and Analog Devices Inc. (ADI) as the Micro Signal Architecture (MSA) core and introduced it in December of 2000. Since then ADI has put this core into its Blackfin processor family of devices. The Blackfin core has the advantages of a clean, orthogonal,RISC-like microprocessor instruction set. It combines a dual-MAC (Multiply/Accumulate), state-of-the-art signal processing engine and single-instruction, multiple-data (SIMD) multimedia capabilities into a single instruction-set architecture. The Blackfin architecture, including the instruction set, is described by the ADSP-BF53x/BF56x Blackfin Processor Programming Reference http://blackfin.uclinux.org/gf/download/frsrelease/29/2549/Blackfin_PRM.pdf The Blackfin processor is already supported by major releases of gcc, and there are binary and source rpms/tarballs for many architectures at: http://blackfin.uclinux.org/gf/project/toolchain/frs There is complete documentation, including "getting started" guides available at: http://docs.blackfin.uclinux.org/ which provides links to the sources and patches you will need in order to set up a cross-compiling environment for bfin-linux-uclibc This patch, as well as the other patches (toolchain, distribution, uClibc) are actively supported by Analog Devices Inc, at: http://blackfin.uclinux.org/ We have tested this on LTP, and our test plan (including pass/fails) can be found at: http://docs.blackfin.uclinux.org/doku.php?id=testing_the_linux_kernel [m.kozlowski@tuxland.pl: balance parenthesis in blackfin header files] Signed-off-by: NBryan Wu <bryan.wu@analog.com> Signed-off-by: NMariusz Kozlowski <m.kozlowski@tuxland.pl> Signed-off-by: NAubrey Li <aubrey.li@analog.com> Signed-off-by: NJie Zhang <jie.zhang@analog.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
Address spaces contain an allocation flag that specifies restriction on the zone for pages placed in the mapping. I.e. some device may require pages to be allocated from a DMA zone. Block devices may not be able to use pages from HIGHMEM. Memory policies and the common use of page migration works only on the highest zone. If the address space does not allow allocation from the highest zone then the pages in the address space are not migratable simply because we can only allocate memory for a specified node if we allow allocation for the highest zone on each node. Acked-by: NHugh Dickins <hugh@veritas.com> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
There is no user remaining and I have never seen any use of that flag. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
SLAB_CTOR atomic is never used which is no surprise since I cannot imagine that one would want to do something serious in a constructor or destructor. In particular given that the slab allocators run with interrupts disabled. Actions in constructors and destructors are by their nature very limited and usually do not go beyond initializing variables and list operations. (The i386 pgd ctor and dtors do take a spinlock in constructor and destructor..... I think that is the furthest we go at this point.) There is no flag passed to the destructor so removing SLAB_CTOR_ATOMIC also establishes a certain symmetry. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
This patch provides a new macro KMEM_CACHE(<struct>, <flags>) to simplify slab creation. KMEM_CACHE creates a slab with the name of the struct, with the size of the struct and with the alignment of the struct. Additional slab flags may be specified if necessary. Example struct test_slab { int a,b,c; struct list_head; } __cacheline_aligned_in_smp; test_slab_cache = KMEM_CACHE(test_slab, SLAB_PANIC) will create a new slab named "test_slab" of the size sizeof(struct test_slab) and aligned to the alignment of test slab. If it fails then we panic. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
This patch was recently posted to lkml and acked by Pekka. The flag SLAB_MUST_HWCACHE_ALIGN is 1. Never checked by SLAB at all. 2. A duplicate of SLAB_HWCACHE_ALIGN for SLUB 3. Fulfills the role of SLAB_HWCACHE_ALIGN for SLOB. The only remaining use is in sparc64 and ppc64 and their use there reflects some earlier role that the slab flag once may have had. If its specified then SLAB_HWCACHE_ALIGN is also specified. The flag is confusing, inconsistent and has no purpose. Remove it. Acked-by: NPekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter Zijlstra 提交于
Remove duplicate work in kill_bdev(). It currently invalidates and then truncates the bdev's mapping. invalidate_mapping_pages() will opportunistically remove pages from the mapping. And truncate_inode_pages() will forcefully remove all pages. The only thing truncate doesn't do is flush the bh lrus. So do that explicitly. This avoids (very unlikely) but possible invalid lookup results if the same bdev is quickly re-issued. It also will prevent extreme kernel latencies which are observed when blockdevs which have a large amount of pagecache are unmounted, by avoiding invalidate_mapping_pages() on that path. invalidate_mapping_pages() has no cond_resched (it can be called under spinlock), whereas truncate_inode_pages() has one. [akpm@linux-foundation.org: restore nrpages==0 optimisation] Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Peter Zijlstra 提交于
Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't been used in 6 years (so akpm says). find * -name \*.[ch] | xargs grep -l invalidate_bdev | while read file; do quilt add $file; sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file; done Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
On x86_64 this cuts allocation overhead for page table pages down to a fraction (kernel compile / editing load. TSC based measurement of times spend in each function): no quicklist pte_alloc 1569048 4.3s(401ns/2.7us/179.7us) pmd_alloc 780988 2.1s(337ns/2.7us/86.1us) pud_alloc 780072 2.2s(424ns/2.8us/300.6us) pgd_alloc 260022 1s(920ns/4us/263.1us) quicklist: pte_alloc 452436 573.4ms(8ns/1.3us/121.1us) pmd_alloc 196204 174.5ms(7ns/889ns/46.1us) pud_alloc 195688 172.4ms(7ns/881ns/151.3us) pgd_alloc 65228 9.8ms(8ns/150ns/6.1us) pgd allocations are the most complex and there we see the most dramatic improvement (may be we can cut down the amount of pgds cached somewhat?). But even the pte allocations still see a doubling of performance. 1. Proven code from the IA64 arch. The method used here has been fine tuned for years and is NUMA aware. It is based on the knowledge that accesses to page table pages are sparse in nature. Taking a page off the freelists instead of allocating a zeroed pages allows a reduction of number of cachelines touched in addition to getting rid of the slab overhead. So performance improves. This is particularly useful if pgds contain standard mappings. We can save on the teardown and setup of such a page if we have some on the quicklists. This includes avoiding lists operations that are otherwise necessary on alloc and free to track pgds. 2. Light weight alternative to use slab to manage page size pages Slab overhead is significant and even page allocator use is pretty heavy weight. The use of a per cpu quicklist means that we touch only two cachelines for an allocation. There is no need to access the page_struct (unless arch code needs to fiddle around with it). So the fast past just means bringing in one cacheline at the beginning of the page. That same cacheline may then be used to store the page table entry. Or a second cacheline may be used if the page table entry is not in the first cacheline of the page. The current code will zero the page which means touching 32 cachelines (assuming 128 byte). We get down from 32 to 2 cachelines in the fast path. 3. x86_64 gets lightweight page table page management. This will allow x86_64 arch code to faster repopulate pgds and other page table entries. The list operations for pgds are reduced in the same way as for i386 to the point where a pgd is allocated from the page allocator and when it is freed back to the page allocator. A pgd can pass through the quicklists without having to be reinitialized. 64 Consolidation of code from multiple arches So far arches have their own implementation of quicklist management. This patch moves that feature into the core allowing an easier maintenance and consistent management of quicklists. Page table pages have the characteristics that they are typically zero or in a known state when they are freed. This is usually the exactly same state as needed after allocation. So it makes sense to build a list of freed page table pages and then consume the pages already in use first. Those pages have already been initialized correctly (thus no need to zero them) and are likely already cached in such a way that the MMU can use them most effectively. Page table pages are used in a sparse way so zeroing them on allocation is not too useful. Such an implementation already exits for ia64. Howver, that implementation did not support constructors and destructors as needed by i386 / x86_64. It also only supported a single quicklist. The implementation here has constructor and destructor support as well as the ability for an arch to specify how many quicklists are needed. Quicklists are defined by an arch defining CONFIG_QUICKLIST. If more than one quicklist is necessary then we can define NR_QUICK for additional lists. F.e. i386 needs two and thus has config NR_QUICK int default 2 If an arch has requested quicklist support then pages can be allocated from the quicklist (or from the page allocator if the quicklist is empty) via: quicklist_alloc(<quicklist-nr>, <gfpflags>, <constructor>) Page table pages can be freed using: quicklist_free(<quicklist-nr>, <destructor>, <page>) Pages must have a definite state after allocation and before they are freed. If no constructor is specified then pages will be zeroed on allocation and must be zeroed before they are freed. If a constructor is used then the constructor will establish a definite page state. F.e. the i386 and x86_64 pgd constructors establish certain mappings. Constructors and destructors can also be used to track the pages. i386 and x86_64 use a list of pgds in order to be able to dynamically update standard mappings. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Christoph Lameter 提交于
If slab tracking is on then build a list of full slabs so that we can verify the integrity of all slabs and are also able to built list of alloc/free callers. Signed-off-by: NChristoph Lameter <clameter@sgi.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-