- 02 12月, 2014 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
Rename invalidate_old_hpte to flush_hash_hugepage and use that in other places. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 03 11月, 2014 1 次提交
-
-
由 Christoph Lameter 提交于
This still has not been merged and now powerpc is the only arch that does not have this change. Sorry about missing linuxppc-dev before. V2->V2 - Fix up to work against 3.18-rc1 __get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: NChristoph Lameter <cl@linux.com> [mpe: Fix build errors caused by set/or_softirq_pending(), and rework assignment in __set_breakpoint() to use memcpy().] Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
- 27 8月, 2014 2 次提交
-
-
由 Tejun Heo 提交于
This reverts commit 5828f666 due to build failure after merging with pending powerpc changes. Link: http://lkml.kernel.org/g/20140827142243.6277eaff@canb.auug.org.auSigned-off-by: NTejun Heo <tj@kernel.org> Reported-by: NStephen Rothwell <sfr@canb.auug.org.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Christoph Lameter 提交于
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset. Other use cases are for storing and retrieving data from the current processors percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment. __get_cpu_var() is defined as : #define __get_cpu_var(var) (*this_cpu_ptr(&(var))) __get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation. this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables. This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and less registers are used when code is generated. At the end of the patch set all uses of __get_cpu_var have been removed so the macro is removed too. The patch set includes passes over all arches as well. Once these operations are used throughout then specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by f.e. using a global register that may be set to the per cpu base. Transformations done to __get_cpu_var() 1. Determine the address of the percpu instance of the current processor. DEFINE_PER_CPU(int, y); int *x = &__get_cpu_var(y); Converts to int *x = this_cpu_ptr(&y); 2. Same as #1 but this time an array structure is involved. DEFINE_PER_CPU(int, y[20]); int *x = __get_cpu_var(y); Converts to int *x = this_cpu_ptr(y); 3. Retrieve the content of the current processors instance of a per cpu variable. DEFINE_PER_CPU(int, y); int x = __get_cpu_var(y) Converts to int x = __this_cpu_read(y); 4. Retrieve the content of a percpu struct DEFINE_PER_CPU(struct mystruct, y); struct mystruct x = __get_cpu_var(y); Converts to memcpy(&x, this_cpu_ptr(&y), sizeof(x)); 5. Assignment to a per cpu variable DEFINE_PER_CPU(int, y) __get_cpu_var(y) = x; Converts to __this_cpu_write(y, x); 6. Increment/Decrement etc of a per cpu variable DEFINE_PER_CPU(int, y); __get_cpu_var(y)++ Converts to __this_cpu_inc(y) tj: Folded a fix patch. http://lkml.kernel.org/g/alpine.DEB.2.11.1408172143020.9652@gentwo.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: Paul Mackerras <paulus@samba.org> Signed-off-by: NChristoph Lameter <cl@linux.com> Signed-off-by: NTejun Heo <tj@kernel.org>
-
- 21 6月, 2013 1 次提交
-
-
由 Aneesh Kumar K.V 提交于
We now have pmd entries covering 16MB range and the PMD table double its original size. We use the second half of the PMD table to deposit the pgtable (PTE page). The depoisted PTE page is further used to track the HPTE information. The information include [ secondary group | 3 bit hidx | valid ]. We use one byte per each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and with 4K HPTE we need 4096 entries. Both will fit in a 4K PTE page. On hugepage invalidate we need to walk the PTE page and invalidate all valid HPTEs. This patch implements necessary arch specific functions for THP support and also hugepage invalidate logic. These PMD related functions are intentionally kept similar to their PTE counter-part. Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 17 9月, 2012 2 次提交
-
-
由 Aneesh Kumar K.V 提交于
slice array size and slice mask size depend on PGTABLE_RANGE. Reviewed-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
由 Aneesh Kumar K.V 提交于
This patch convert different functions to take virtual page number instead of virtual address. Virtual page number is virtual address shifted right by VPN_SHIFT (12) bits. This enable us to have an address range of upto 76 bits. Reviewed-by: NPaul Mackerras <paulus@samba.org> Signed-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 4月, 2011 1 次提交
-
-
由 Michael Ellerman 提交于
Use MMU_NO_CONTEXT as the initialiser for mm_context.id on nohash and hash64. Signed-off-by: NMichael Ellerman <michael@ellerman.id.au> Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 20 8月, 2009 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
We need to pass down whether the page is direct or indirect and we'll need to pass the page size to _tlbil_va and _tlbivax_bcast We also add a new low level _tlbil_pid_noind() which does a TLB flush by PID but avoids flushing indirect entries if possible This implements those new prototypes but defines them with inlines or macros so that no additional arguments are actually passed on current processors. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 21 12月, 2008 3 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Currently, the various forms of low level TLB invalidations are all implemented in misc_32.S for 32-bit processors, in a fairly scary mess of #ifdef's and with interesting duplication such as a whole bunch of code for FSL _tlbie and _tlbia which are no longer used. This moves things around such that _tlbie is now defined in hash_low_32.S and is only used by the 32-bit hash code, and all nohash CPUs use the various _tlbil_* forms that are now moved to a new file, tlb_nohash_low.S. I moved all the definitions for that stuff out of include/asm/tlbflush.h as they are really internal mm stuff, into mm/mmu_decl.h The code should have no functional changes. I kept some variants inline for trivial forms on things like 40x and 8xx. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
This commit moves the whole no-hash TLB handling out of line into a new tlb_nohash.c file, and implements some basic SMP support using IPIs and/or broadcast tlbivax instructions. Note that I'm using local invalidations for D->I cache coherency. At worst, if another processor is trying to execute the same and has the old entry in its TLB, it will just take a fault and re-do the TLB flush locally (it won't re-do the cache flush in any case). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
由 Benjamin Herrenschmidt 提交于
This reworks the context management code used by 4xx,8xx and freescale BookE. It adds support for SMP by implementing a concept of stale context map to lazily flush the TLB on processors where a context may have been invalidated. This also contains the ground work for generalizing such lazy TLB flushing by just picking up a new PID and marking the old one stale. This will be implemented later. This is a first implementation that uses a global spinlock. Ideally, we should try to get at least the fast path (context ID already assigned) lockless or limited to a per context lock, but for now this will do. I tried to keep the UP case reasonably simple to avoid adding too much overhead to 8xx which does a lot of context stealing since it effectively has only 16 PIDs available. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 16 12月, 2008 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
This adds a local_flush_tlb_mm() call as a pre-requisite for some SMP work for BookE processors. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 03 12月, 2008 1 次提交
-
-
由 Kumar Gala 提交于
The tlb invalidates in kmap_atomic/kunmap_atomic can be called from IRQ context, however they are only local invalidates (on the processor that the kmap was called on). In the future we want to use IPIs to do tlb invalidates this causes issue since flush_tlb_page() is considered a broadcast invalidate. Add local_flush_tlb_page() as a non-broadcast invalidate and use it in kmap_atomic() since we don't have enough information in the flush_tlb_page() call to determine its local. Signed-off-by: NKumar Gala <galak@kernel.crashing.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 25 9月, 2008 1 次提交
-
-
由 Kumar Gala 提交于
Introduced a new set of low level tlb invalidate functions that do not broadcast invalidates on the bus: _tlbil_all - invalidate all _tlbil_pid - invalidate based on process id (or mm context) _tlbil_va - invalidate based on virtual address (ea + pid) On non-SMP configs _tlbil_all should be functionally equivalent to _tlbia and _tlbil_va should be functionally equivalent to _tlbie. The intent of this change is to handle SMP based invalidates via IPIs instead of broadcasts as the mechanism scales better for larger number of cores. On e500 (fsl-booke mmu) based cores move to using MMUCSR for invalidate alls and tlbsx/tlbwe for invalidate virtual address. Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 04 8月, 2008 1 次提交
-
-
由 Stephen Rothwell 提交于
from include/asm-powerpc. This is the result of a mkdir arch/powerpc/include/asm git mv include/asm-powerpc/* arch/powerpc/include/asm Followed by a few documentation/comment fixups and a couple of places where <asm-powepc/...> was being used explicitly. Of the latter only one was outside the arch code and it is a driver only built for powerpc. Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 25 7月, 2008 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
where it belongs. This fixes some build problems on some configs Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
-
- 09 11月, 2007 1 次提交
-
-
由 Kumar Gala 提交于
kmap_atomic calls flush_tlb_page with a NULL VMA and thus we end up dereferencing a NULL pointer to try and get the context.id. If the VMA is null use the global pid value of 0. Signed-off-by: NKumar Gala <galak@kernel.crashing.org>
-
- 01 11月, 2007 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
On 4xx CPUs, the current implementation of flush_tlb_page() uses a low level _tlbie() assembly function that only works for the current PID. Thus, invalidations caused by, for example, a COW fault triggered by get_user_pages() from a different context will not work properly, causing among other things, gdb breakpoints to fail. This patch adds a "pid" argument to _tlbie() on 4xx processors, and uses it to flush entries in the right context. FSL BookE also gets the argument but it seems they don't need it (their tlbivax form ignores the PID when invalidating according to the document I have). Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: NKumar Gala <galak@kernel.crashing.org> Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
-
- 20 10月, 2007 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Nobody uses flush_tlb_pgtables anymore, this patch removes all remaining traces of it from all archs. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 12 10月, 2007 1 次提交
-
-
由 Paul Mackerras 提交于
This makes the kernel use 1TB segments for all kernel mappings and for user addresses of 1TB and above, on machines which support them (currently POWER5+, POWER6 and PA6T). We detect that the machine supports 1TB segments by looking at the ibm,processor-segment-sizes property in the device tree. We don't currently use 1TB segments for user addresses < 1T, since that would effectively prevent 32-bit processes from using huge pages unless we also had a way to revert to using 256MB segments. That would be possible but would involve extra complications (such as keeping track of which segment size was used when HPTEs were inserted) and is not addressed here. Parts of this patch were originally written by Ben Herrenschmidt. Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 14 6月, 2007 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
This rewrites pretty much from scratch the handling of MMIO and PIO space allocations on powerpc64. The main goals are: - Get rid of imalloc and use more common code where possible - Simplify the current mess so that PIO space is allocated and mapped in a single place for PCI bridges - Handle allocation constraints of PIO for all bridges including hot plugged ones within the 2GB space reserved for IO ports, so that devices on hotplugged busses will now work with drivers that assume IO ports fit in an int. - Cleanup and separate tracking of the ISA space in the reserved low 64K of IO space. No ISA -> Nothing mapped there. I booted a cell blade with IDE on PIO and MMIO and a dual G5 so far, that's it :-) With this patch, all allocations are done using the code in mm/vmalloc.c, though we use the low level __get_vm_area with explicit start/stop constraints in order to manage separate areas for vmalloc/vmap, ioremap, and PCI IOs. This greatly simplifies a lot of things, as you can see in the diffstat of that patch :-) A new pair of functions pcibios_map/unmap_io_space() now replace all of the previous code that used to manipulate PCI IOs space. The allocation is done at mapping time, which is now called from scan_phb's, just before the devices are probed (instead of after, which is by itself a bug fix). The only other caller is the PCI hotplug code for hot adding PCI-PCI bridges (slots). imalloc is gone, as is the "sub-allocation" thing, but I do beleive that hotplug should still work in the sense that the space allocation is always done by the PHB, but if you unmap a child bus of this PHB (which seems to be possible), then the code should properly tear down all the HPTE mappings for that area of the PHB allocated IO space. I now always reserve the first 64K of IO space for the bridge with the ISA bus on it. I have moved the code for tracking ISA in a separate file which should also make it smarter if we ever are capable of hot unplugging or re-plugging an ISA bridge. This should have a side effect on platforms like powermac where VGA IOs will no longer work. This is done on purpose though as they would have worked semi-randomly before. The idea at this point is to isolate drivers that might need to access those and fix them by providing a proper function to obtain an offset to the legacy IOs of a given bus. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 24 4月, 2007 1 次提交
-
-
由 David Gibson 提交于
BenH's commit a741e679 in powerpc.git, although (AFAICT) only intended to affect ppc64, also has side-effects which break 44x. I think 40x, 8xx and Freescale Book E are also affected, though I haven't tested them. The problem lies in unconditionally removing flush_tlb_pending() from the versions of flush_tlb_mm(), flush_tlb_range() and flush_tlb_kernel_range() used on ppc64 - which are also used the embedded platforms mentioned above. The patch below cleans up the convoluted #ifdef logic in tlbflush.h, in the process restoring the necessary flushes for the software TLB platforms. There are three sets of definitions for the flushing hooks: the software TLB versions (revised to avoid using names which appear to related to TLB batching), the 32-bit hash based versions (external functions) amd the 64-bit hash based versions (which implement batching). It also moves the declaration of update_mmu_cache() to always be in tlbflush.h (previously it was in tlbflush.h except for PPC64, where it was in pgtable.h). Booted on Ebony (440GP) and compiled for 64-bit and 32-bit multiplatform. Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 13 4月, 2007 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
The current tlb flush code on powerpc 64 bits has a subtle race since we lost the page table lock due to the possible faulting in of new PTEs after a previous one has been removed but before the corresponding hash entry has been evicted, which can leads to all sort of fatal problems. This patch reworks the batch code completely. It doesn't use the mmu_gather stuff anymore. Instead, we use the lazy mmu hooks that were added by the paravirt code. They have the nice property that the enter/leave lazy mmu mode pair is always fully contained by the PTE lock for a given range of PTEs. Thus we can guarantee that all batches are flushed on a given CPU before it drops that lock. We also generalize batching for any PTE update that require a flush. Batching is now enabled on a CPU by arch_enter_lazy_mmu_mode() and disabled by arch_leave_lazy_mmu_mode(). The code epects that this is always contained within a PTE lock section so no preemption can happen and no PTE insertion in that range from another CPU. When batching is enabled on a CPU, every PTE updates that need a hash flush will use the batch for that flush. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NPaul Mackerras <paulus@samba.org>
-
- 26 4月, 2006 1 次提交
-
-
由 David Woodhouse 提交于
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
-
- 07 11月, 2005 1 次提交
-
-
由 Benjamin Herrenschmidt 提交于
Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel base page size to 64K. The resulting kernel still boots on any hardware. On current machines with 4K pages support only, the kernel will maintain 16 "subpages" for each 64K page transparently. Note that while real 64K capable HW has been tested, the current patch will not enable it yet as such hardware is not released yet, and I'm still verifying with the firmware architects the proper to get the information from the newer hypervisors. Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
-
- 04 11月, 2005 1 次提交
-
-
由 Stephen Rothwell 提交于
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
-