- 11 Jan, 2006 1 commit
-
-
Committed by Anton Blanchard
The current ppc64 per cpu data implementation is quite slow, e.g.:

        lhz 11,18(13)            /* smp_processor_id() */
        ld 9,.LC63-.LCTOC1(30)   /* per_cpu__variable_name */
        ld 8,.LC61-.LCTOC1(30)   /* __per_cpu_offset */
        sldi 11,11,3             /* form index into __per_cpu_offset */
        mr 10,9
        ldx 9,11,8               /* __per_cpu_offset[smp_processor_id()] */
        ldx 0,10,9               /* load per cpu data */

5 loads for something that is supposed to be fast, pretty awful. One reason for the large number of loads is that we have to synthesize 2 64bit constants (per_cpu__variable_name and __per_cpu_offset).

By putting __per_cpu_offset into the paca we can avoid the 2 loads associated with it:

        ld 11,56(13)             /* paca->data_offset */
        ld 9,.LC59-.LCTOC1(30)   /* per_cpu__variable_name */
        ldx 0,9,11               /* load per cpu data */

Longer term we should be able to do even better than 3 loads. If per_cpu__variable_name wasn't a 64bit constant and paca->data_offset was in a register we could cut it down to one load. A suggestion from Rusty is to use gcc's __thread extension here. In order to do this we would need to free up r13 (the __thread register and where the paca currently is). So far I've had a few unsuccessful attempts at doing that :)

The patch also allocates per cpu memory node local on NUMA machines. This patch from Rusty has been sitting in my queue _forever_ but stalled when I hit the compiler bug. Sorry about that.

Finally I also only allocate per cpu data for possible cpus, which comes straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128) and 4 possible cpus we see some nice gains:

              total       used       free     shared    buffers     cached
Mem:        4012228     212860    3799368          0          0     162424

              total       used       free     shared    buffers     cached
Mem:        4016200     212984    3803216          0          0     162424

A saving of 3.75MB. Quite nice for smaller machines. Note: we now have to be careful of per cpu users that touch data for !possible cpus.

At this stage it might be worth making the NUMA and possible cpu optimisations generic, but per cpu init is done so early we have to be careful that all architectures have their possible map set up correctly.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
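The difference between the two lookup schemes described above can be modelled in user-space C. This is only a sketch under stated assumptions: every name (`pc_paca`, `pc_offset_table`, `pc_lookup_old`, `pc_lookup_new`) and every size is hypothetical, and a flat array of `long` slots stands in for the kernel's per cpu areas. It shows why caching the offset in the per-CPU control block removes the global table access, not how the kernel patch is actually written.

```c
#include <assert.h>
#include <stddef.h>

/* All names and sizes here are hypothetical; this is a user-space model. */
enum { PC_NR_CPUS = 4, PC_SLOTS = 8 };

/* Slot group 0 models the link-time template; groups 1..N are per cpu copies. */
static long pc_memory[(PC_NR_CPUS + 1) * PC_SLOTS];

/* Old scheme: a global table indexed by CPU number (like __per_cpu_offset). */
static size_t pc_offset_table[PC_NR_CPUS];

/* New scheme: each CPU's control block caches its own offset
 * (like paca->data_offset). */
struct pc_paca {
    unsigned short cpu;
    size_t data_offset;
};
static struct pc_paca pc_pacas[PC_NR_CPUS];

/* Fill in both the global table and the cached per-CPU offsets. */
static void pc_setup(void)
{
    for (unsigned cpu = 0; cpu < PC_NR_CPUS; cpu++) {
        size_t off = (size_t)(cpu + 1) * PC_SLOTS;
        pc_offset_table[cpu] = off;
        pc_pacas[cpu].cpu = (unsigned short)cpu;
        pc_pacas[cpu].data_offset = off;
    }
}

/* Old lookup: read the CPU id, index the global table, then reach the variable. */
static long *pc_lookup_old(const struct pc_paca *paca, size_t var)
{
    assert(var < PC_SLOTS);
    return &pc_memory[var + pc_offset_table[paca->cpu]];
}

/* New lookup: the offset comes straight from the control block. */
static long *pc_lookup_new(const struct pc_paca *paca, size_t var)
{
    assert(var < PC_SLOTS);
    return &pc_memory[var + paca->data_offset];
}
```

Both functions return the same address for a given CPU and variable; the new one simply skips the table address load and the indexed table load.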
-
- 09 Jan, 2006 3 commits
-
-
Committed by Arnd Bergmann
include/asm-ppc/ had #ifdef __KERNEL__ in all header files that are not meant for use by user space; include/asm-powerpc does not have this yet. This patch gets us a lot closer there. There are a few cases where I was not sure, so I left them out.

I have verified that no CONFIG_* symbols are used outside of __KERNEL__ any more and that there are no obvious compile errors when including any of the headers in user space libraries.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by David Gibson
This patch removes several unnecessary fields from the paca:
  - next_jiffy_update_tb was simply unused. Remove trivially.
  - The exdsi exception save area was not used. There were plans to use it, but they never seem to have gone anywhere. If they ever do, we can put it back. Remove from the paca, and from asm-offsets.c
  - The default_decr field was used from asm, but was only ever assigned the value of tb_ticks_per_jiffy. Just access tb_ticks_per_jiffy from asm directly instead.

Built and booted on POWER5 LPAR and iSeries RS64.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
Committed by David Gibson
On iSeries, the paca contains, amongst other things, an ItLpRegSave structure used by the hypervisor to save registers. The hypervisor locates this area through a pointer at the beginning of the paca, so the structure itself can be located elsewhere. This patch moves the reg_save area out into its own array. This reduces the amount of iSeries specific gunk which is visible to general powerpc code via paca.h.

Built and booted on POWER5 LPAR and iSeries RS64.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
- 10 Nov, 2005 1 commit
-
-
Committed by David Gibson
This patch moves a bunch of files from arch/ppc64 and include/asm-ppc64 which have no equivalents in ppc32 code into arch/powerpc and include/asm-powerpc. The files affected are:
        abs_addr.h
        compat.h
        lppaca.h
        paca.h
        tce.h
        cpu_setup_power4.S
        ioctl32.c
        firmware.c
        pacaData.c

The only changes apart from the move and corresponding Makefile changes are:
  - #ifndef/#define in includes updated to _ASM_POWERPC_ form
  - trailing whitespace removed
  - comments giving full paths removed
  - pacaData.c renamed paca.c to remove studlyCaps
  - Misplaced { moved in lppaca.h

Built and booted on POWER5 LPAR (ARCH=powerpc and ARCH=ppc64), built for 32-bit powermac (ARCH=powerpc).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
-
- 07 Nov, 2005 1 commit
-
-
Committed by Benjamin Herrenschmidt
Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel base page size to 64K. The resulting kernel still boots on any hardware. On current machines with 4K pages support only, the kernel will maintain 16 "subpages" for each 64K page transparently.

Note that while real 64K capable HW has been tested, the current patch will not enable it yet as such hardware is not released yet, and I'm still verifying with the firmware architects the proper way to get the information from the newer hypervisors.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
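The figure of 16 subpages follows from the two page sizes: 64K / 4K = 2^16 / 2^12 = 16. A small sketch of that arithmetic is below. The macro and function names (`SP_PAGE_SHIFT`, `sp_subpage_index`, and so on) are hypothetical and are not taken from the patch; the code only illustrates how an address splits into a 64K page number and a 4K subpage index.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the 64K and 4K sizes come from the commit message. */
#define SP_PAGE_SHIFT     16   /* 64K kernel base page */
#define SP_HW_PAGE_SHIFT  12   /* 4K hardware page */
#define SP_SUBPAGES       (1u << (SP_PAGE_SHIFT - SP_HW_PAGE_SHIFT))

/* Compile-time check of the arithmetic: 64K / 4K = 16 subpages. */
static_assert(SP_SUBPAGES == 16, "a 64K page holds sixteen 4K subpages");

/* Number of the 64K page that contains addr. */
static uint64_t sp_page_number(uint64_t addr)
{
    return addr >> SP_PAGE_SHIFT;
}

/* Index (0..15) of the 4K subpage that contains addr within its 64K page. */
static unsigned sp_subpage_index(uint64_t addr)
{
    return (unsigned)((addr >> SP_HW_PAGE_SHIFT) & (SP_SUBPAGES - 1));
}
```

For example, address 0x12345 lies in 64K page 1, and within that page it falls in subpage 2.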
-
- 02 Nov, 2005 1 commit
-
-
Committed by Kelly Daly
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
-
- 30 Jun, 2005 2 commits
-
-
Committed by Michael Ellerman
Currently there's a per-cpu count of lpevents processed, a per-queue (i.e. global) total count, and a count by event type.

Replace all that with a count by event for each cpu. We only need to add it up in the proc code.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
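The counting scheme described above can be sketched as a single two-dimensional table, with all totals derived on demand. The names and sizes below (`ev_counts`, `EV_NR_CPUS`, `EV_NUM_TYPES`, and the helper functions) are hypothetical, chosen for illustration rather than copied from the iSeries code.

```c
#include <assert.h>

/* Hypothetical sizes; the real CPU and event type counts are not in the message. */
enum { EV_NR_CPUS = 4, EV_NUM_TYPES = 9 };

/* The only state kept: one counter per (cpu, event type) pair. */
static unsigned long ev_counts[EV_NR_CPUS][EV_NUM_TYPES];

/* Hot path: a CPU bumps its own counter for the event type it just handled. */
static void ev_record(unsigned cpu, unsigned type)
{
    assert(cpu < EV_NR_CPUS && type < EV_NUM_TYPES);
    ev_counts[cpu][type]++;
}

/* Reporting path: events of one type, summed over all CPUs. */
static unsigned long ev_total_for_type(unsigned type)
{
    unsigned long sum = 0;
    for (unsigned cpu = 0; cpu < EV_NR_CPUS; cpu++)
        sum += ev_counts[cpu][type];
    return sum;
}

/* Reporting path: events handled by one CPU, summed over all types. */
static unsigned long ev_total_for_cpu(unsigned cpu)
{
    unsigned long sum = 0;
    for (unsigned type = 0; type < EV_NUM_TYPES; type++)
        sum += ev_counts[cpu][type];
    return sum;
}

/* Reporting path: the former global total, now derived rather than stored. */
static unsigned long ev_total(void)
{
    unsigned long sum = 0;
    for (unsigned cpu = 0; cpu < EV_NR_CPUS; cpu++)
        sum += ev_total_for_cpu(cpu);
    return sum;
}
```

The design trade is that the hot path touches one counter instead of three, while the rarely used reporting path pays for the summation.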
-
Committed by Michael Ellerman
The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But all these pointers end up pointing to the one place, i.e. xItLpQueue.

So remove the pointer from the paca struct and just refer to xItLpQueue directly where needed.

The only complication is that the spread_lpevents logic was implemented by having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to process events. Instead we just compare the spread_lpevents value to the processor id to get the same behaviour.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
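A rough model of the before and after checks is sketched below. It rests on an assumption the commit message does not spell out: that CPUs with an id below the spread_lpevents value are the ones that process events. All identifiers (`sl_queue`, `sl_old_setup`, `sl_new_should_process`, and so on) are hypothetical stand-ins, with `sl_global_queue` playing the role of xItLpQueue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SL_NR_CPUS = 8 };           /* hypothetical CPU count */

struct sl_queue { int unused; };   /* stands in for struct ItLpQueue */
static struct sl_queue sl_global_queue;   /* stands in for xItLpQueue */

/* Old scheme: one pointer per CPU, NULL on CPUs that must not process events. */
static struct sl_queue *sl_old_lpqueue[SL_NR_CPUS];

/* ASSUMPTION: CPUs numbered below 'spread' are the ones that process events. */
static void sl_old_setup(unsigned spread)
{
    for (unsigned cpu = 0; cpu < SL_NR_CPUS; cpu++)
        sl_old_lpqueue[cpu] = (cpu < spread) ? &sl_global_queue : NULL;
}

/* Old check: does this CPU have a queue pointer at all? */
static bool sl_old_should_process(unsigned cpu)
{
    assert(cpu < SL_NR_CPUS);
    return sl_old_lpqueue[cpu] != NULL;
}

/* New check: no per-CPU pointer, just compare the CPU id with the spread value. */
static bool sl_new_should_process(unsigned cpu, unsigned spread)
{
    assert(cpu < SL_NR_CPUS);
    return cpu < spread;
}
```

Under that assumption the two checks agree for every CPU, which is the "same behaviour" the message refers to, and the per-CPU pointer becomes redundant.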
-
- 22 Jun, 2005 1 commit
-
-
Committed by Stephen Rothwell
This patch removes some unused bits from HvCall.h and some unneeded #includes from other files. Also includes ItLpQueue.h in paca.h in preference to a stub declaration of struct ItLpQueue.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
- 17 Apr, 2005 1 commit
-
-
Committed by Linus Torvalds
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it.

Let it rip!
-