- 17 12月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
- include/linux/percpu.h: this_cpu_add_return() and friends were located next to __this_cpu_add_return(). However, the overall organization is to first group by preemption safeness. Relocate this_cpu_add_return() and friends to preemption-safe area. - arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after other more basic operations. Relocate [__]this_cpu_add_return_8() so that they're first grouped by preemption safeness. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com>
-
由 Christoph Lameter 提交于
Introduce generic support for this_cpu_add_return etc. The fallback is to realize these operations with simpler __this_cpu_ops. tj: - Reformatted __cpu_size_call_return2() to make it more consistent with its neighbors. - Dropped unnecessary temp variable ret__ from __this_cpu_generic_add_return(). Reviewed-by: NTejun Heo <tj@kernel.org> Reviewed-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: NH. Peter Anvin <hpa@zytor.com> Signed-off-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 02 10月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
On UP, percpu allocations were redirected to kmalloc. This has the following problems. * For certain amount of allocations (determined by PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu allocator can be used before the usual kernel memory allocator is brought online. On SMP, this is used to initialize the kernel memory allocator. * percpu allocator honors alignment upto PAGE_SIZE but kmalloc() doesn't. For example, workqueue makes use of larger alignments for cpu_workqueues. Currently, users of percpu allocators need to handle UP differently, which is somewhat fragile and ugly. Other than small amount of memory, there isn't much to lose by enabling percpu allocator on UP. It can simply use kernel memory based chunk allocation which was added for SMP archs w/o MMUs. This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and makes UP build use percpu-km. As percpu addresses and kernel addresses are always identity mapped and static percpu variables don't need any special treatment, nothing is arch dependent and mm/percpu.c implements generic setup_per_cpu_areas() for UP. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi>
-
由 Tejun Heo 提交于
In preparation of enabling percpu allocator for UP, reduce PCPU_MIN_UNIT_SIZE to 32k. On UP, the first chunk doesn't have to include static percpu variables and chunk size can be smaller which is important as UP percpu allocator will use contiguous kernel memory to populate chunks. PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation size but 32k should still be enough. Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 21 9月, 2010 1 次提交
-
-
由 Peter Zijlstra 提交于
These are similar to {get,put}_cpu_var() except for dynamically allocated per-cpu memory. Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: NTejun Heo <tj@kernel.org> LKML-Reference: <20100917093009.252867712@chello.nl> Signed-off-by: NIngo Molnar <mingo@elte.hu>
-
- 08 9月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
On UP, percpu allocations were redirected to kmalloc. This has the following problems. * For certain amount of allocations (determined by PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu allocator can be used before the usual kernel memory allocator is brought online. On SMP, this is used to initialize the kernel memory allocator. * percpu allocator honors alignment upto PAGE_SIZE but kmalloc() doesn't. For example, workqueue makes use of larger alignments for cpu_workqueues. Currently, users of percpu allocators need to handle UP differently, which is somewhat fragile and ugly. Other than small amount of memory, there isn't much to lose by enabling percpu allocator on UP. It can simply use kernel memory based chunk allocation which was added for SMP archs w/o MMUs. This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and makes UP build use percpu-km. As percpu addresses and kernel addresses are always identity mapped and static percpu variables don't need any special treatment, nothing is arch dependent and mm/percpu.c implements generic setup_per_cpu_areas() for UP. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Acked-by: NPekka Enberg <penberg@cs.helsinki.fi>
-
由 Tejun Heo 提交于
In preparation of enabling percpu allocator for UP, reduce PCPU_MIN_UNIT_SIZE to 32k. On UP, the first chunk doesn't have to include static percpu variables and chunk size can be smaller which is important as UP percpu allocator will use contiguous kernel memory to populate chunks. PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation size but 32k should still be enough. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux.com>
-
- 07 8月, 2010 1 次提交
-
-
由 Namhyung Kim 提交于
UP accessors didn't take care of __percpu notations leading to a lot of spurious sparse warnings on UP configurations. Fix it. Signed-off-by: NNamhyung Kim <namhyung@gmail.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 28 6月, 2010 2 次提交
-
-
由 Tejun Heo 提交于
This patch updates percpu allocator such that it can serve limited amount of allocation before slab comes online. This is primarily to allow slab to depend on working percpu allocator. Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how much memory space and allocation map slots are reserved. If this reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation will fail till slab comes online. The following changes are made to implement early alloc. * pcpu_mem_alloc() now checks slab_is_available() * Chunks are allocated using pcpu_mem_alloc() * Init paths make sure ai->dyn_size is at least as large as PERCPU_DYNAMIC_EARLY_SIZE. * Initial alloc maps are allocated in __initdata and copied to kmalloc'd areas once slab is online. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org>
-
由 Tejun Heo 提交于
In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was ssize_t, -1 meant auto-size, 0 forced 0 and positive meant minimum size. There's no use case for forcing 0 and the upcoming early alloc support always requires non-zero dynamic size. Make @dyn_size always mean minimum dyn_size. While at it, make pcpu_build_alloc_info() static which doesn't have any external caller as suggested by David Rientjes. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com>
-
- 30 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
percpu.h has always been including slab.h to get k[mz]alloc/free() for UP inline implementation. percpu.h being used by very low level headers including module.h and sched.h, this meant that a lot files unintentionally got slab.h inclusion. Lee Schermerhorn was trying to make topology.h use percpu.h and got bitten by this implicit inclusion. The right thing to do is break this ultimately unnecessary dependency. The previous patch added explicit inclusion of either gfp.h or slab.h to the source files using them. This patch updates percpu.h such that slab.h is no longer included from percpu.h. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
-
- 29 3月, 2010 1 次提交
-
-
由 Tejun Heo 提交于
lockdep has custom code to check whether a pointer belongs to static percpu area which is somewhat broken. Implement proper is_kernel/module_percpu_address() and replace the custom code. On UP, percpu variables are regular static variables and can't be distinguished from them. Always return %false on UP. Signed-off-by: NTejun Heo <tj@kernel.org> Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
-
- 02 12月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
Commit 3b034b0d implemented per_cpu_ptr_to_phys() but forgot to add UP definition. Add UP definition which is simple wrapper around __pa(). Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Reported-by: NRandy Dunlap <randy.dunlap@oracle.com>
-
- 25 11月, 2009 1 次提交
-
-
由 Vivek Goyal 提交于
o kdump functionality reserves a per cpu area at boot time and exports the physical address of that area to user space through sys interface. This area stores some dump related information like cpu register states etc at the time of crash. o We were assuming that per cpu area always come from linearly mapped meory region and using __pa() to determine physical address. With percpu_alloc=page, per cpu area can come from vmalloc region also and __pa() breaks. o This patch implments a new function to convert per cpu address to physical address. Before the patch, crash_notes addresses looked as follows. cpu0 60fffff49800 cpu1 60fffff60800 cpu2 60fffff77800 These are bogus phsyical addresses. After the patch, address are following. cpu0 13eb44000 cpu1 13eb43000 cpu2 13eb42000 cpu3 13eb41000 These look fine. I got 4G of memory and /proc/iomem tell me following. 100000000-13fffffff : System RAM tj: * added missing asm/io.h include reported by Stephen Rothwell * repositioned per_cpu_ptr_phys() in percpu.c and added comment. Signed-off-by: NVivek Goyal <vgoyal@redhat.com> Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
-
- 29 10月, 2009 6 次提交
-
-
由 Tejun Heo 提交于
The previous patch made sparse warn about percpu variables being used directly without going through percpu accessors. This patch implements the other half - checking whether non percpu variable is passed into percpu accessors. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Al Viro <viro@zeniv.linux.org.uk>
-
由 Rusty Russell 提交于
We have to make __kernel "__attribute__((address_space(0)))" so we can cast to it. tj: * put_cpu_var() update. * Annotations added to dynamic allocator interface. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that per_cpu__ prefix is gone, there's no distinction between static and dynamic percpu variables. Make get_cpu_var() take dynamic percpu variables and ensure that all macros have parentheses around the parameter evaluation and evaluate the variable parameter only once such that any expression which evaluates to percpu address can be used safely. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Rusty Russell 提交于
Now that the return from alloc_percpu is compatible with the address of per-cpu vars, it makes sense to hand around the address of per-cpu variables. To make this sane, we remove the per_cpu__ prefix we used created to stop people accidentally using these vars directly. Now we have sparse, we can use that (next patch). tj: * Updated to convert stuff which were missed by or added after the original patch. * Kill per_cpu_var() macro. Signed-off-by: NRusty Russell <rusty@rustcorp.com.au> Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
-
由 Tejun Heo 提交于
Make the following changes to remove some sparse warnings. * Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before defining it. * Annotate pcpu_extend_area_map() that it is entered with pcpu_lock held, releases it and then reacquires it. * Make percpu related macros use unique nested variable names. * While at it, add pcpu prefix to __size_call[_return]() macros as to-be-implemented sparse annotations will add percpu specific stuff to these macros. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au>
-
由 Tejun Heo 提交于
alloc_percpu() couldn't handle array types like "int [100]" due to the way return type was casted. Fix it by using typeof() instead. Signed-off-by: NTejun Heo <tj@kernel.org> Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com> Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
-
- 03 10月, 2009 1 次提交
-
-
由 Christoph Lameter 提交于
This patch introduces two things: First this_cpu_ptr and then per cpu atomic operations. this_cpu_ptr ------------ A common operation when dealing with cpu data is to get the instance of the cpu data associated with the currently executing processor. This can be optimized by this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id). The problem with per_cpu_ptr(x, smp_processor_id) is that it requires an array lookup to find the offset for the cpu. Processors typically have the offset for the current cpu area in some kind of (arch dependent) efficiently accessible register or memory location. We can use that instead of doing the array lookup to speed up the determination of the address of the percpu variable. This is particularly significant because these lookups occur in performance critical paths of the core kernel. this_cpu_ptr() can avoid memory accesses and this_cpu_ptr comes in two flavors. The preemption context matters since we are referring the the currently executing processor. In many cases we must insure that the processor does not change while a code segment is executed. __this_cpu_ptr -> Do not check for preemption context this_cpu_ptr -> Check preemption context The parameter to these operations is a per cpu pointer. This can be the address of a statically defined per cpu variable (&per_cpu_var(xxx)) or the address of a per cpu variable allocated with the per cpu allocator. per cpu atomic operations: this_cpu_*(var, val) ----------------------------------------------- this_cpu_* operations (like this_cpu_add(struct->y, value) operate on abitrary scalars that are members of structures allocated with the new per cpu allocator. They can also operate on static per_cpu variables if they are passed to per_cpu_var() (See patch to use this_cpu_* operations for vm statistics). These operations are guaranteed to be atomic vs preemption when modifying the scalar. The calculation of the per cpu offset is also guaranteed to be atomic at the same time. This means that a this_cpu_* operation can be safely used to modify a per cpu variable in a context where interrupts are enabled and preemption is allowed. Many architectures can perform such a per cpu atomic operation with a single instruction. Note that the atomicity here is different from regular atomic operations. Atomicity is only guaranteed for data accessed from the currently executing processor. Modifications from other processors are still possible. There must be other guarantees that the per cpu data is not modified from another processor when using these instruction. The per cpu atomicity is created by the fact that the processor either executes and instruction or not. Embedded in the instruction is the relocation of the per cpu address to the are reserved for the current processor and the RMW action. Therefore interrupts or preemption cannot occur in the mids of this processing. Generic fallback functions are used if an arch does not define optimized this_cpu operations. The functions come also come in the two flavors used for this_cpu_ptr(). The firstparameter is a scalar that is a member of a structure allocated through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The operations are similar to what percpu_add() and friends do. this_cpu_read(scalar) this_cpu_write(scalar, value) this_cpu_add(scale, value) this_cpu_sub(scalar, value) this_cpu_inc(scalar) this_cpu_dec(scalar) this_cpu_and(scalar, value) this_cpu_or(scalar, value) this_cpu_xor(scalar, value) Arch code can override the generic functions and provide optimized atomic per cpu operations. These atomic operations must provide both the relocation (x86 does it through a segment override) and the operation on the data in a single instruction. Otherwise preempt needs to be disabled and there is no gain from providing arch implementations. A third variant is provided prefixed by irqsafe_. These variants are safe against hardware interrupts on the *same* processor (all per cpu atomic primitives are *always* *only* providing safety for code running on the *same* processor!). The increment needs to be implemented by the hardware in such a way that it is a single RMW instruction that is either processed before or after an interrupt. cc: David Howells <dhowells@redhat.com> cc: Ingo Molnar <mingo@elte.hu> cc: Rusty Russell <rusty@rustcorp.com.au> cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: NChristoph Lameter <cl@linux-foundation.org> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 02 10月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
With ia64 converted, there's no arch left which still uses legacy percpu allocator. Kill it. Signed-off-by: NTejun Heo <tj@kernel.org> Delightedly-acked-by: NRusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org>
-
- 14 8月, 2009 11 次提交
-
-
由 Tejun Heo 提交于
With x86 converted to embedding allocator, lpage doesn't have any user left. Kill it along with cpa handling code. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Jan Beulich <JBeulich@novell.com>
-
由 Tejun Heo 提交于
Now that percpu core can handle very sparse units, given that vmalloc space is large enough, embedding first chunk allocator can use any memory to build the first chunk. This patch teaches pcpu_embed_first_chunk() about distances between cpus and to use alloc/free callbacks to allocate node specific areas for each group and use them for the first chunk. This brings the benefits of embedding allocator to NUMA configurations - no extra TLB pressure with the flexibility of unified dynamic allocator and no need to restructure arch code to build memory layout suitable for percpu. With units put into atom_size aligned groups according to cpu distances, using large page for dynamic chunks is also easily possible with falling back to reuglar pages if large allocation fails. Embedding allocator users are converted to specify NULL cpu_distance_fn, so this patch doesn't cause any visible behavior difference. Following patches will convert them. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Currently units are mapped sequentially into address space. This patch adds pcpu_unit_offsets[] which allows units to be mapped to arbitrary offsets from the chunk base address. This is necessary to allow sparse embedding which might would need to allocate address ranges and memory areas which aren't aligned to unit size but allocation atom size (page or large page size). This also simplifies things a bit by removing the need to calculate offset from unit number. With this change, there's no need for the arch code to know pcpu_unit_size. Update pcpu_setup_first_chunk() and first chunk allocators to return regular 0 or -errno return code instead of unit size or -errno. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David S. Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Till now, non-linear cpu->unit map was expressed using an integer array which maps each cpu to a unit and used only by lpage allocator. Although how many units have been placed in a single contiguos area (group) is known while building unit_map, the information is lost when the result is recorded into the unit_map array. For lpage allocator, as all allocations are done by lpages and whether two adjacent lpages are in the same group or not is irrelevant, this didn't cause any problem. Non-linear cpu->unit mapping will be used for sparse embedding and this grouping information is necessary for that. This patch introduces pcpu_alloc_info which contains all the information necessary for initializing percpu allocator. pcpu_alloc_info contains array of pcpu_group_info which describes how units are grouped and mapped to cpus. pcpu_group_info also has base_offset field to specify its offset from the chunk's base address. pcpu_build_alloc_info() initializes this field as if all groups are allocated back-to-back as is currently done but this will be used to sparsely place groups. pcpu_alloc_info is a rather complex data structure which contains a flexible array which in turn points to nested cpu_map arrays. * pcpu_alloc_alloc_info() and pcpu_free_alloc_info() are provided to help dealing with pcpu_alloc_info. * pcpu_lpage_build_unit_map() is updated to build pcpu_alloc_info, generalized and renamed to pcpu_build_alloc_info(). @cpu_distance_fn may be NULL indicating that all cpus are of LOCAL_DISTANCE. * pcpul_lpage_dump_cfg() is updated to process pcpu_alloc_info, generalized and renamed to pcpu_dump_alloc_info(). It now also prints which group each alloc unit belongs to. * pcpu_setup_first_chunk() now takes pcpu_alloc_info instead of the separate parameters. All first chunk allocators are updated to use pcpu_build_alloc_info() to build alloc_info and call pcpu_setup_first_chunk() with it. This has the side effect of packing units for sparse possible cpus. ie. if cpus 0, 2 and 4 are possible, they'll be assigned unit 0, 1 and 2 instead of 0, 2 and 4. * x86 setup_pcpu_lpage() is updated to deal with alloc_info. * sparc64 setup_per_cpu_areas() is updated to build alloc_info. Although the changes made by this patch are pretty pervasive, it doesn't cause any behavior difference other than packing of sparse cpus. It mostly changes how information is passed among initialization functions and makes room for more flexibility. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Unit map handling will be generalized and extended and used for embedding sparse first chunk and other purposes. Relocate two unit_map related functions upward in preparation. This patch just moves the code without any actual change. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
pcpu_fc_alloc_fn_t is about to see more interesting usage, add @align parameter. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that all actual first chunk allocation and copying happen in the first chunk allocators and helpers, there's no reason for pcpu_setup_first_chunk() to try to determine @dyn_size automatically. The only left user is page first chunk allocator. Make it determine dyn_size like other allocators and make @dyn_size mandatory for pcpu_setup_first_chunk(). Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
First chunk allocators assume percpu areas have been linked using one of PERCPU_*() macros and depend on __per_cpu_load symbol defined by those macros, so there isn't much point in passing in static area size explicitly when it can be easily calculated from __per_cpu_start and __per_cpu_end. Drop @static_size from all percpu first chunk allocators and helpers. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Now that all first chunk allocators are in mm/percpu.c, it makes sense to make generalize percpu_alloc kernel parameter. Define PCPU_FC_* and set pcpu_chosen_fc using early_param() in mm/percpu.c. Arch code can use the set value to determine which first chunk allocator to use. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
There's no need to build unused first chunk allocators in. Define CONFIG_NEED_PER_CPU_*_FIRST_CHUNK and let archs enable them selectively. Signed-off-by: NTejun Heo <tj@kernel.org>
-
由 Tejun Heo 提交于
Page size isn't always 4k depending on arch and configuration. Rename 4k first chunk allocator to page. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: David Howells <dhowells@redhat.com>
-
- 15 7月, 2009 1 次提交
-
-
由 Tejun Heo 提交于
!CONFIG_SMP was missing pcpu_lpage_remapped() definition causing build failure. Add dummy implementation. This was discovered by linux-next testing. Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
-
- 04 7月, 2009 6 次提交
-
-
由 Tejun Heo 提交于
Large page first chunk allocator is primarily used for NUMA machines; however, its NUMA handling is extremely simplistic. Regardless of their proximity, each cpu is put into separate large page just to return most of the allocated space back wasting large amount of vmalloc space and increasing cache footprint. This patch teachs NUMA details to large page allocator. Given processor proximity information, pcpu_lpage_build_unit_map() will find fitting cpu -> unit mapping in which cpus in LOCAL_DISTANCE share the same large page and not too much virtual address space is wasted. This greatly reduces the unit and thus chunk size and wastes much less address space for the first chunk. For example, on 4/4 NUMA machine, the original code occupied 16MB of virtual space for the first chunk while the new code only uses 4MB - one 2MB page for each node. [ Impact: much better space efficiency on NUMA machines ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jan Beulich <JBeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Currently cpu and unit are always identity mapped. To allow more efficient large page support on NUMA and lazy allocation for possible but offline cpus, cpu -> unit mapping needs to be non-linear and/or sparse. This can be easily implemented by adding a cpu -> unit mapping array and using it whenever looking up the matching unit for a cpu. The only unusal conversion is in pcpu_chunk_addr_search(). The passed in address is unit0 based and unit0 might not be in use so it needs to be converted to address of an in-use unit. This is easily done by adding the unit offset for the current processor. [ Impact: allows non-linear/sparse cpu -> unit mapping, no visible change yet ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
percpu core doesn't need to tack all the allocated pages. It needs to know whether certain pages are populated and a way to reverse map address to page when freeing. This patch drops pcpu_chunk->page[] and use populated bitmap and vmalloc_to_page() lookup instead. Using vmalloc_to_page() exclusively is also possible but complicates first chunk handling, inflates cache footprint and prevents non-standard memory allocation for percpu memory. pcpu_chunk->page[] was used to track each page's allocation and allowed asymmetric population which happens during failure path; however, with single bitmap for all units, this is no longer possible. Bite the bullet and rewrite (de)populate functions so that things are done in clearly separated steps such that asymmetric population doesn't happen. This makes the (de)population process much more modular and will also ease implementing non-standard memory usage in the future (e.g. large pages). This makes @get_page_fn parameter to pcpu_setup_first_chunk() unnecessary. The parameter is dropped and all first chunk helpers are updated accordingly. Please note that despite the volume most changes to first chunk helpers are symbol renames for variables which don't need to be referenced outside of the helper anymore. This change reduces memory usage and cache footprint of pcpu_chunk. Now only #unit_pages bits are necessary per chunk. [ Impact: reduced memory usage and cache footprint for bookkeeping ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net>
-
由 Tejun Heo 提交于
Now that all first chunk allocator helpers allocate and map the first chunk themselves, there's no need to have optional default alloc/map in pcpu_setup_first_chunk(). Drop @populate_pte_fn and only leave @dyn_size optional and make all other params mandatory. This makes it much easier to follow what pcpu_setup_first_chunk() is doing and what actual differences tweaking each parameter results in. [ Impact: drop unused code path ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Generalize and move x86 setup_pcpu_lpage() into pcpu_lpage_first_chunk(). setup_pcpu_lpage() now is a simple wrapper around the generalized version. Other than taking size parameters and using arch supplied callbacks to allocate/free/map memory, pcpu_lpage_first_chunk() is identical to the original implementation. This simplifies arch code and will help converting more archs to dynamic percpu allocator. While at it, factor out pcpu_calc_fc_sizes() which is common to pcpu_embed_first_chunk() and pcpu_lpage_first_chunk(). [ Impact: code reorganization and generalization ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-
由 Tejun Heo 提交于
Generalize and move x86 setup_pcpu_4k() into pcpu_4k_first_chunk(). setup_pcpu_4k() now is a simple wrapper around the generalized version. Other than taking size parameters and using arch supplied callbacks to allocate/free memory, pcpu_4k_first_chunk() is identical to the original implementation. This simplifies arch code and will help converting more archs to dynamic percpu allocator. While at it, s/pcpu_populate_pte_fn_t/pcpu_fc_populate_pte_fn_t/ for consistency. [ Impact: code reorganization and generalization ] Signed-off-by: NTejun Heo <tj@kernel.org> Cc: Ingo Molnar <mingo@elte.hu>
-