- 21 9月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
These are similar to {get,put}_cpu_var() except for dynamically allocated per-cpu memory. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NTejun Heo <tj@kernel.org> LKML-Reference: <20100917093009.252867712@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 07 8月, 2010 1 次提交
-
-
由 Namhyung Kim 提交于
UP accessors didn't take care of __percpu notations leading to a lot of spurious sparse warnings on UP configurations. Fix it. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 28 6月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
This patch updates percpu allocator such that it can serve limited amount of allocation before slab comes online. This is primarily to allow slab to depend on working percpu allocator. Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how much memory space and allocation map slots are reserved. If this reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation will fail till slab comes online. The following changes are made to implement early alloc. * pcpu_mem_alloc() now checks slab_is_available() * Chunks are allocated using pcpu_mem_alloc() * Init paths make sure ai->dyn_size is at least as large as PERCPU_DYNAMIC_EARLY_SIZE. * Initial alloc maps are allocated in __initdata and copied to kmalloc'd areas once slab is online. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org>
-
由 Tejun Heo 提交于
In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was ssize_t, -1 meant auto-size, 0 forced 0 and positive meant minimum size. There's no use case for forcing 0 and the upcoming early alloc support always requires non-zero dynamic size. Make @dyn_size always mean minimum dyn_size. While at it, make pcpu_build_alloc_info() static which doesn't have any external caller as suggested by David Rientjes. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
percpu.h has always been including slab.h to get k[mz]alloc/free() for UP inline implementation. percpu.h being used by very low level headers including module.h and sched.h, this meant that a lot files unintentionally got slab.h inclusion. Lee Schermerhorn was trying to make topology.h use percpu.h and got bitten by this implicit inclusion. The right thing to do is break this ultimately unnecessary dependency. The previous patch added explicit inclusion of either gfp.h or slab.h to the source files using them. This patch updates percpu.h such that slab.h is no longer included from percpu.h. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 29 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
lockdep has custom code to check whether a pointer belongs to static percpu area which is somewhat broken. Implement proper is_kernel/module_percpu_address() and replace the custom code. On UP, percpu variables are regular static variables and can't be distinguished from them. Always return %false on UP. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
-
- 02 12月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
Commit 3b034b0d implemented per_cpu_ptr_to_phys() but forgot to add UP definition. Add UP definition which is simple wrapper around __pa(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
-
- 25 11月, 2009 1 次提交
-
-
由 Vivek Goyal 提交于
o kdump functionality reserves a per cpu area at boot time and exports the physical address of that area to user space through sys interface. This area stores some dump related information like cpu register states etc at the time of crash. o We were assuming that per cpu area always come from linearly mapped meory region and using __pa() to determine physical address. With percpu_alloc=page, per cpu area can come from vmalloc region also and __pa() breaks. o This patch implments a new function to convert per cpu address to physical address. Before the patch, crash_notes addresses looked as follows. cpu0 60fffff49800 cpu1 60fffff60800 cpu2 60fffff77800 These are bogus phsyical addresses. After the patch, address are following. cpu0 13eb44000 cpu1 13eb43000 cpu2 13eb42000 cpu3 13eb41000 These look fine. I got 4G of memory and /proc/iomem tell me following. 100000000-13fffffff : System RAM tj: * added missing asm/io.h include reported by Stephen Rothwell * repositioned per_cpu_ptr_phys() in percpu.c and added comment. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
-
- 29 10月, 2009 6 次提交
-
-
由 Tejun Heo 提交于
The previous patch made sparse warn about percpu variables being used directly without going through percpu accessors. This patch implements the other half - checking whether non percpu variable is passed into percpu accessors. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Al Viro <viro@zeniv.linux.org.uk>
-
由 Rusty Russell 提交于
We have to make __kernel "__attribute__((address_space(0)))" so we can cast to it. tj: * put_cpu_var() update. * Annotations added to dynamic allocator interface. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that per_cpu__ prefix is gone, there's no distinction between static and dynamic percpu variables. Make get_cpu_var() take dynamic percpu variables and ensure that all macros have parentheses around the parameter evaluation and evaluate the variable parameter only once such that any expression which evaluates to percpu address can be used safely. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Rusty Russell 提交于
Now that the return from alloc_percpu is compatible with the address of per-cpu vars, it makes sense to hand around the address of per-cpu variables. To make this sane, we remove the per_cpu__ prefix we used created to stop people accidentally using these vars directly. Now we have sparse, we can use that (next patch). tj: * Updated to convert stuff which were missed by or added after the original patch. * Kill per_cpu_var() macro. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
-
由 Tejun Heo 提交于
Make the following changes to remove some sparse warnings. * Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before defining it. * Annotate pcpu_extend_area_map() that it is entered with pcpu_lock held, releases it and then reacquires it. * Make percpu related macros use unique nested variable names. * While at it, add pcpu prefix to __size_call[_return]() macros as to-be-implemented sparse annotations will add percpu specific stuff to these macros. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au>
-
由 Tejun Heo 提交于
alloc_percpu() couldn't handle array types like "int [100]" due to the way return type was casted. Fix it by using typeof() instead. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
-
- 03 10月, 2009 1 次提交
-
-
由 Christoph Lameter 提交于
This patch introduces two things: First this_cpu_ptr and then per cpu atomic operations. this_cpu_ptr ------------ A common operation when dealing with cpu data is to get the instance of the cpu data associated with the currently executing processor. This can be optimized by this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id). The problem with per_cpu_ptr(x, smp_processor_id) is that it requires an array lookup to find the offset for the cpu. Processors typically have the offset for the current cpu area in some kind of (arch dependent) efficiently accessible register or memory location. We can use that instead of doing the array lookup to speed up the determination of the address of the percpu variable. This is particularly significant because these lookups occur in performance critical paths of the core kernel. this_cpu_ptr() can avoid memory accesses and this_cpu_ptr comes in two flavors. The preemption context matters since we are referring the the currently executing processor. In many cases we must insure that the processor does not change while a code segment is executed. __this_cpu_ptr -> Do not check for preemption context this_cpu_ptr -> Check preemption context The parameter to these operations is a per cpu pointer. This can be the address of a statically defined per cpu variable (&per_cpu_var(xxx)) or the address of a per cpu variable allocated with the per cpu allocator. per cpu atomic operations: this_cpu_*(var, val) ----------------------------------------------- this_cpu_* operations (like this_cpu_add(struct->y, value) operate on abitrary scalars that are members of structures allocated with the new per cpu allocator. They can also operate on static per_cpu variables if they are passed to per_cpu_var() (See patch to use this_cpu_* operations for vm statistics). These operations are guaranteed to be atomic vs preemption when modifying the scalar. The calculation of the per cpu offset is also guaranteed to be atomic at the same time. This means that a this_cpu_* operation can be safely used to modify a per cpu variable in a context where interrupts are enabled and preemption is allowed. Many architectures can perform such a per cpu atomic operation with a single instruction. Note that the atomicity here is different from regular atomic operations. Atomicity is only guaranteed for data accessed from the currently executing processor. Modifications from other processors are still possible. There must be other guarantees that the per cpu data is not modified from another processor when using these instruction. The per cpu atomicity is created by the fact that the processor either executes and instruction or not. Embedded in the instruction is the relocation of the per cpu address to the are reserved for the current processor and the RMW action. Therefore interrupts or preemption cannot occur in the mids of this processing. Generic fallback functions are used if an arch does not define optimized this_cpu operations. The functions come also come in the two flavors used for this_cpu_ptr(). The firstparameter is a scalar that is a member of a structure allocated through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The operations are similar to what percpu_add() and friends do. this_cpu_read(scalar) this_cpu_write(scalar, value) this_cpu_add(scale, value) this_cpu_sub(scalar, value) this_cpu_inc(scalar) this_cpu_dec(scalar) this_cpu_and(scalar, value) this_cpu_or(scalar, value) this_cpu_xor(scalar, value) Arch code can override the generic functions and provide optimized atomic per cpu operations. These atomic operations must provide both the relocation (x86 does it through a segment override) and the operation on the data in a single instruction. Otherwise preempt needs to be disabled and there is no gain from providing arch implementations. A third variant is provided prefixed by irqsafe_. These variants are safe against hardware interrupts on the *same* processor (all per cpu atomic primitives are *always* *only* providing safety for code running on the *same* processor!). The increment needs to be implemented by the hardware in such a way that it is a single RMW instruction that is either processed before or after an interrupt. cc: David Howells <dhowells@redhat.com> cc: Ingo Molnar <mingo@elte.hu> cc: Rusty Russell <rusty@rustcorp.com.au> cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: NChristoph Lameter <cl@linux-foundation.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 02 10月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
With ia64 converted, there's no arch left which still uses legacy percpu allocator. Kill it. Signed-off-by: NTejun Heo <tj@kernel.org> Delightedly-acked-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org>
-
- 14 8月, 2009 11 次提交
-
-
由 Tejun Heo 提交于
With x86 converted to embedding allocator, lpage doesn't have any user left. Kill it along with cpa handling code. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com>
-
由 Tejun Heo 提交于
Now that percpu core can handle very sparse units, given that vmalloc space is large enough, embedding first chunk allocator can use any memory to build the first chunk. This patch teaches pcpu_embed_first_chunk() about distances between cpus and to use alloc/free callbacks to allocate node specific areas for each group and use them for the first chunk. This brings the benefits of embedding allocator to NUMA configurations - no extra TLB pressure with the flexibility of unified dynamic allocator and no need to restructure arch code to build memory layout suitable for percpu. With units put into atom_size aligned groups according to cpu distances, using large page for dynamic chunks is also easily possible with falling back to reuglar pages if large allocation fails. Embedding allocator users are converted to specify NULL cpu_distance_fn, so this patch doesn't cause any visible behavior difference. Following patches will convert them. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Currently units are mapped sequentially into address space. This patch adds pcpu_unit_offsets[] which allows units to be mapped to arbitrary offsets from the chunk base address. This is necessary to allow sparse embedding which might would need to allocate address ranges and memory areas which aren't aligned to unit size but allocation atom size (page or large page size). This also simplifies things a bit by removing the need to calculate offset from unit number. With this change, there's no need for the arch code to know pcpu_unit_size. Update pcpu_setup_first_chunk() and first chunk allocators to return regular 0 or -errno return code instead of unit size or -errno. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David S. Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Till now, non-linear cpu->unit map was expressed using an integer array which maps each cpu to a unit and used only by lpage allocator. Although how many units have been placed in a single contiguos area (group) is known while building unit_map, the information is lost when the result is recorded into the unit_map array. For lpage allocator, as all allocations are done by lpages and whether two adjacent lpages are in the same group or not is irrelevant, this didn't cause any problem. Non-linear cpu->unit mapping will be used for sparse embedding and this grouping information is necessary for that. This patch introduces pcpu_alloc_info which contains all the information necessary for initializing percpu allocator. pcpu_alloc_info contains array of pcpu_group_info which describes how units are grouped and mapped to cpus. pcpu_group_info also has base_offset field to specify its offset from the chunk's base address. pcpu_build_alloc_info() initializes this field as if all groups are allocated back-to-back as is currently done but this will be used to sparsely place groups. pcpu_alloc_info is a rather complex data structure which contains a flexible array which in turn points to nested cpu_map arrays. * pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to help dealing with pcpu_alloc_info. * pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info, generalized and renamed to pcpu_build_alloc_info(). @cpu_distance_fn may be NULL indicating that all cpus are of LOCAL_DISTANCE. * pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info, generalized and renamed to pcpu_dump_alloc_info(). It now also prints which group each alloc unit belongs to. * pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the separate parameters. All first chunk allocators are updated to use pcpu_build_alloc_info() to build alloc_info and call pcpu_setup_first_chunk() with it. This has the side effect of packing units for sparse possible cpus. ie. if cpus 0, 2 and 4 are possible, they'll be assigned unit 0, 1 and 2 instead of 0, 2 and 4. * x86 setup_pcpu_lpage() is updated to deal with alloc_info. * sparc64 setup_per_cpu_areas() is updated to build alloc_info. Although the changes made by this patch are pretty pervasive, it doesn't cause any behavior difference other than packing of sparse cpus. It mostly changes how information is passed among initialization functions and makes room for more flexibility. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Unit map handling will be generalized and extended and used for embedding sparse first chunk and other purposes. Relocate two unit_map related functions upward in preparation. This patch just moves the code without any actual change. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
pcpu_fc_alloc_fn_t is about to see more interesting usage, add @align parameter. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that all actual first chunk allocation and copying happen in the first chunk allocators and helpers, there's no reason for pcpu_setup_first_chunk() to try to determine @dyn_size automatically. The only left user is page first chunk allocator. Make it determine dyn_size like other allocators and make @dyn_size mandatory for pcpu_setup_first_chunk(). Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
First chunk allocators assume percpu areas have been linked using one of PERCPU_*() macros and depend on __per_cpu_load symbol defined by those macros, so there isn't much point in passing in static area size explicitly when it can be easily calculated from __per_cpu_start and __per_cpu_end. Drop @static_size from all percpu first chunk allocators and helpers. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that all first chunk allocators are in mm/percpu.c, it makes sense to make generalize percpu_alloc kernel parameter. Define PCPU_FC_* and set pcpu_chosen_fc using early_param() in mm/percpu.c. Arch code can use the set value to determine which first chunk allocator to use. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
There's no need to build unused first chunk allocators in. Define CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them selectively. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Page size isn't always 4k depending on arch and configuration. Rename 4k first chunk allocator to page. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Howells <dhowells@redhat.com>
-
- 15 7月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
!CONFIG_SMP was missing pcpu_lpage_remapped() definition causing build failure. Add dummy implementation. This was discovered by linux-next testing. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
-
- 04 7月, 2009 7 次提交
-
-
由 Tejun Heo 提交于
Large page first chunk allocator is primarily used for NUMA machines; however, its NUMA handling is extremely simplistic. Regardless of their proximity, each cpu is put into separate large page just to return most of the allocated space back wasting large amount of vmalloc space and increasing cache footprint. This patch teachs NUMA details to large page allocator. Given processor proximity information, pcpu_lpage_build_unit_map() will find fitting cpu -> unit mapping in which cpus in LOCAL_DISTANCE share the same large page and not too much virtual address space is wasted. This greatly reduces the unit and thus chunk size and wastes much less address space for the first chunk. For example, on 4/4 NUMA machine, the original code occupied 16MB of virtual space for the first chunk while the new code only uses 4MB - one 2MB page for each node. [ Impact: much better space efficiency on NUMA machines ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Currently cpu and unit are always identity mapped. To allow more efficient large page support on NUMA and lazy allocation for possible but offline cpus, cpu -> unit mapping needs to be non-linear and/or sparse. This can be easily implemented by adding a cpu -> unit mapping array and using it whenever looking up the matching unit for a cpu. The only unusal conversion is in pcpu_chunk_addr_search(). The passed in address is unit0 based and unit0 might not be in use so it needs to be converted to address of an in-use unit. This is easily done by adding the unit offset for the current processor. [ Impact: allows non-linear/sparse cpu -> unit mapping, no visible change yet ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
percpu core doesn't need to tack all the allocated pages. It needs to know whether certain pages are populated and a way to reverse map address to page when freeing. This patch drops pcpu_chunk->page[] and use populated bitmap and vmalloc_to_page() lookup instead. Using vmalloc_to_page() exclusively is also possible but complicates first chunk handling, inflates cache footprint and prevents non-standard memory allocation for percpu memory. pcpu_chunk->page[] was used to track each page's allocation and allowed asymmetric population which happens during failure path; however, with single bitmap for all units, this is no longer possible. Bite the bullet and rewrite (de)populate functions so that things are done in clearly separated steps such that asymmetric population doesn't happen. This makes the (de)population process much more modular and will also ease implementing non-standard memory usage in the future (e.g. large pages). This makes @get_page_fn parameter to pcpu_setup_first_chunk() unnecessary. The parameter is dropped and all first chunk helpers are updated accordingly. Please note that despite the volume most changes to first chunk helpers are symbol renames for variables which don't need to be referenced outside of the helper anymore. This change reduces memory usage and cache footprint of pcpu_chunk. Now only #unit_pages bits are necessary per chunk. [ Impact: reduced memory usage and cache footprint for bookkeeping ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Now that all first chunk allocator helpers allocate and map the first chunk themselves, there's no need to have optional default alloc/map in pcpu_setup_first_chunk(). Drop @populate_pte_fn and only leave @dyn_size optional and make all other params mandatory. This makes it much easier to follow what pcpu_setup_first_chunk() is doing and what actual differences tweaking each parameter results in. [ Impact: drop unused code path ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Generalize and move x86 setup_pcpu_lpage() into pcpu_lpage_first_chunk(). setup_pcpu_lpage() now is a simple wrapper around the generalized version. Other than taking size parameters and using arch supplied callbacks to allocate/free/map memory, pcpu_lpage_first_chunk() is identical to the original implementation. This simplifies arch code and will help converting more archs to dynamic percpu allocator. While at it, factor out pcpu_calc_fc_sizes() which is common to pcpu_embed_first_chunk() and pcpu_lpage_first_chunk(). [ Impact: code reorganization and generalization ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Generalize and move x86 setup_pcpu_4k() into pcpu_4k_first_chunk(). setup_pcpu_4k() now is a simple wrapper around the generalized version. Other than taking size parameters and using arch supplied callbacks to allocate/free memory, pcpu_4k_first_chunk() is identical to the original implementation. This simplifies arch code and will help converting more archs to dynamic percpu allocator. While at it, s/pcpu_populate_pte_fn_t/pcpu_fc_populate_pte_fn_t/ for consistency. [ Impact: code reorganization and generalization ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
The only extra feature @unit_size provides is making dead space at the end of the first chunk which doesn't have any valid usecase. Drop the parameter. This will increase consistency with generalized 4k allocator. James Bottomley spotted missing conversion for the default setup_per_cpu_areas() which caused build breakage on all arcsh which use it. [ Impact: drop unused code path ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Ingo Molnar <mingo@elte.hu>
-
- 24 6月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use dynamic percpu allocator. The first chunk is allocated using embedding helper and 8k is reserved for modules. This ensures that the new allocator behaves almost identically to the original allocator as long as static percpu variables are concerned, so it shouldn't introduce much breakage. s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing range limit the addressing model imposes. Unfortunately, this breaks if the address is specified using a variable, so for now, the two archs aren't converted. The following architectures are affected by this change. * sh * arm * cris * mips * sparc(32) * blackfin * avr32 * parisc (broken, under investigation) * m32r * powerpc(32) As this change makes the dynamic allocator the default one, CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert - CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted archs. These archs implement their own setup_per_cpu_areas() and the conversion is not trivial. * powerpc(64) * sparc(64) * ia64 * alpha * s390 Boot and batch alloc/free tests on x86_32 with debug code (x86_32 doesn't use default first chunk initialization). Compile tested on sparc(32), powerpc(32), arm and alpha. Kyle McMartin reported that this change breaks parisc. The problem is still under investigation and he is okay with pushing this patch forward and fixing parisc later. [ Impact: use dynamic allocator for most archs w/o custom percpu setup ] Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NRusty Russell <rusty@rustcorp.com.au> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NMartin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: NChristoph Lameter <cl@linux.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bryan Wu <cooloney@kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu>
-
- 12 6月, 2009 1 次提交
-
-
由 Catalin Marinas 提交于
There are allocations for which the main pointer cannot be found but they are not memory leaks. This patch fixes some of them. For more information on false positives, see Documentation/kmemleak.txt. Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>
-
- 22 4月, 2009 2 次提交
-
-
由 David Howells 提交于
Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so that they're in one place, and give them descriptive comments, particularly the SHARED_ALIGNED variant. It would be nice to collect these in linux/percpu.h, but that's not possible without sorting out the severe #include recursion between the x86 arch headers and the general headers (and possibly other arches too). Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 David Howells 提交于
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU() does not agree with that specified by DEFINE_PER_CPU(). This means that architectures that have a small data section references relative to a base register may throw up linkage errors due to too great a displacement between where the base register points and the per-CPU variable. On FRV, the .h declaration says that the variable is in the .sdata section, but the .c definition says it's actually in the .data section. The linker throws up the following errors: kernel/built-in.o: In function `release_task': kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o To fix this, DECLARE_PER_CPU() should simply apply the same section attribute as does DEFINE_PER_CPU(). However, this is made slightly more complex by virtue of the fact that there are several variants on DEFINE, so these need to be matched by variants on DECLARE. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 11 4月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
For the time being, move the generic percpu_*() accessors to linux/percpu.h. asm-generic/percpu.h is meant to carry generic stuff for low level stuff - declarations, definitions and pointer offset calculation and so on but not for generic interface. Signed-off-by: NIngo Molnar <mingo@elte.hu>
-