提交 · d0f13e3c20b6fb73ccb467bdca97fa7cf5a574cd · openeuler / Kernel

09 5月, 2007 1 次提交

[POWERPC] Introduce address space "slices" · d0f13e3c

由 Benjamin Herrenschmidt 提交于 5月 08, 2007

The basic issue is to be able to do what hugetlbfs does but with
different page sizes for some other special filesystems; more
specifically, my need is:

 - Huge pages

 - SPE local store mappings using 64K pages on a 4K base page size
kernel on Cell

 - Some special 4K segments in 64K-page kernels for mapping a dodgy
type of powerpc-specific infiniband hardware that requires 4K MMU
mappings for various reasons I won't explain here.

The main issues are:

 - To maintain/keep track of the page size per "segment" (as we can
only have one page size per segment on powerpc, which are 256MB
divisions of the address space).

 - To make sure special mappings stay within their allotted
"segments" (including MAP_FIXED crap)

 - To make sure everybody else doesn't mmap/brk/grow_stack into a
"segment" that is used for a special mapping

Some of the necessary mechanisms to handle that were present in the
hugetlbfs code, but mostly in ways not suitable for anything else.

The patch relies on some changes to the generic get_unmapped_area()
that just got merged.  It still hijacks hugetlb callbacks here or
there as the generic code hasn't been entirely cleaned up yet but
that shouldn't be a problem.

So what is a slice ?  Well, I re-used the mechanism used formerly by our
hugetlbfs implementation which divides the address space in
"meta-segments" which I called "slices".  The division is done using
256MB slices below 4G, and 1T slices above.  Thus the address space is
divided currently into 16 "low" slices and 16 "high" slices.  (Special
case: high slice 0 is the area between 4G and 1T).

Doing so simplifies significantly the tracking of segments and avoids
having to keep track of all the 256MB segments in the address space.

While I used the "concepts" of hugetlbfs, I mostly re-implemented
everything in a more generic way and "ported" hugetlbfs to it.

Slices can have an associated page size, which is encoded in the mmu
context and used by the SLB miss handler to set the segment sizes.  The
hash code currently doesn't care, it has a specific check for hugepages,
though I might add a mechanism to provide per-slice hash mapping
functions in the future.

The slice code provide a pair of "generic" get_unmapped_area() (bottomup
and topdown) functions that should work with any slice size.  There is
some trickiness here so I would appreciate people to have a look at the
implementation of these and let me know if I got something wrong.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

d0f13e3c

24 4月, 2007 1 次提交

[POWERPC] Save trap number in bad_stack · 68730401

由 Olof Johansson 提交于 4月 24, 2007

Save the trap number in the case of getting a bad stack in an exception
handler. It is sometimes useful to know what exception it was that caused
this to happen. Without this, no trap number is reported.
Signed-off-by: NOlof Johansson <olof@lixom.net>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

68730401

21 3月, 2007 1 次提交

[POWERPC] Minor paca optimisation · e91948fd

由 Stephen Rothwell 提交于 3月 16, 2007

Move the slb_shadow_ptr field into the first cache line since it is
(like everything there) read-only after boot.  It is in fact statically
initialised and thereafter only read.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

e91948fd

16 10月, 2006 1 次提交

[POWERPC] Lazy interrupt disabling for 64-bit machines · d04c56f7

由 Paul Mackerras 提交于 10月 04, 2006

This implements a lazy strategy for disabling interrupts. This means
that local_irq_disable() et al. just clear the 'interrupts are
enabled' flag in the paca. If an interrupt comes along, the interrupt
entry code notices that interrupts are supposed to be disabled, and
clears the EE bit in SRR1, clears the 'interrupts are hard-enabled'
flag in the paca, and returns. This means that interrupts only
actually get disabled in the processor when an interrupt comes along.

When interrupts are enabled by local_irq_enable() et al., the code
sets the interrupts-enabled flag in the paca, and then checks whether
interrupts got hard-disabled. If so, it also sets the EE bit in the
MSR to hard-enable the interrupts.

This has the potential to improve performance, and also makes it
easier to make a kernel that can boot on iSeries and on other 64-bit
machines, since this lazy-disable strategy is very similar to the
soft-disable strategy that iSeries already uses.

This version renames paca->proc_enabled to paca->soft_enabled, and
changes a couple of soft-disables in the kexec code to hard-disables,
which should fix the crash that Michael Ellerman saw. This doesn't
yet use a reserved CR field for the soft_enabled and hard_enabled
flags. This applies on top of Stephen Rothwell's patches to make it
possible to build a combined iSeries/other kernel.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

d04c56f7

13 9月, 2006 1 次提交

[POWERPC] Fix MMIO ops to provide expected barrier behaviour · f007cacf

由 Paul Mackerras 提交于 9月 13, 2006

This changes the writeX family of functions to have a sync instruction
before the MMIO store rather than after, because the generally expected
behaviour is that the device receiving the MMIO store can be guaranteed
to see the effects of any preceding writes to normal memory.

To preserve ordering between writeX and readX, and to preserve ordering
between preceding stores and the readX, the readX family of functions
have had an sync added before the load.

Although writeX followed by spin_unlock is not officially guaranteed
to keep the writeX inside the spin-locked region unless an mmiowb()
is used, there are currently drivers that depend on the previous
behaviour on powerpc, which was that the mmiowb wasn't actually required.
Therefore we have a per-cpu flag that is set by writeX, cleared by
__raw_spin_lock and mmiowb, and tested by __raw_spin_unlock.  If it is
set, __raw_spin_unlock does a sync and clears it.

This changes both 32-bit and 64-bit readX/writeX.  32-bit already has a
sync in __raw_spin_unlock (since lwsync doesn't exist on 32-bit), and thus
doesn't need the per-cpu flag.

Tested on G5 (PPC970) and POWER5.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

f007cacf

08 8月, 2006 1 次提交

[POWERPC] Implement SLB shadow buffer · 2f6093c8

由 Michael Neuling 提交于 8月 07, 2006

This adds a shadow buffer for the SLBs and regsiters it with PHYP.
Only the bolted SLB entries (top 3) are shadowed.

The SLB shadow buffer tells the hypervisor what the kernel needs to
have in the SLB for the kernel to be able to function.  The hypervisor
can use this information to speed up partition context switches.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

2f6093c8

15 6月, 2006 1 次提交

powerpc: Use 64k pages without needing cache-inhibited large pages · bf72aeba

由 Paul Mackerras 提交于 6月 15, 2006

Some POWER5+ machines can do 64k hardware pages for normal memory but
not for cache-inhibited pages. This patch lets us use 64k hardware
pages for most user processes on such machines (assuming the kernel
has been configured with CONFIG_PPC_64K_PAGES=y). User processes
start out using 64k pages and get switched to 4k pages if they use any
non-cacheable mappings.

With this, we use 64k pages for the vmalloc region and 4k pages for
the imalloc region. If anything creates a non-cacheable mapping in
the vmalloc region, the vmalloc region will get switched to 4k pages.
I don't know of any driver other than the DRM that would do this,
though, and these machines don't have AGP.

When a region gets switched from 64k pages to 4k pages, we do not have
to clear out all the 64k HPTEs from the hash table immediately. We
use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
was hashed in as a 64k page or a set of 4k pages. If hash_page is
trying to insert a 4k page for a Linux PTE and it sees that it has
already been inserted as a 64k page, it first invalidates the 64k HPTE
before inserting the 4k HPTE. The hash invalidation routines also use
the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
set of 4k HPTEs to remove. With those two changes, we can tolerate a
mix of 4k and 64k HPTEs in the hash table, and they will all get
removed when the address space is torn down.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

bf72aeba

12 6月, 2006 1 次提交

powerpc: Remove unused paca->pgdir field · 43064431

由 Paul Mackerras 提交于 6月 12, 2006

The pgdir field in the paca was a leftover from the dynamic VSIDs
patch, and is not used in the current kernel code.  This removes it.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

43064431

26 4月, 2006 1 次提交
- D
  Don't include linux/config.h from anywhere else in include/ · 62c4f0a2
  由 David Woodhouse 提交于 4月 26, 2006
```
Signed-off-by: NDavid Woodhouse <dwmw2@infradead.org>
```
  62c4f0a2
27 3月, 2006 1 次提交

[PATCH] powerpc: Allow non zero boot cpuids · 4df20460

由 Anton Blanchard 提交于 3月 25, 2006

We currently have a hack to flip the boot cpu and its secondary thread
to logical cpuid 0 and 1. This means the logical - physical mapping will
differ depending on which cpu is boot cpu. This is most apparent on
kexec, where we might kexec on any cpu and therefore change the mapping
from boot to boot.

The patch below does a first pass early on to work out the logical cpuid
of the boot thread. We then fix up some paca structures to match.

Ive also removed the boot_cpuid_phys variable for ppc64, to be
consistent we use get_hard_smp_processor_id(boot_cpuid) everywhere.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

4df20460

24 2月, 2006 1 次提交

powerpc: Implement accurate task and CPU time accounting · c6622f63

由 Paul Mackerras 提交于 2月 24, 2006

This implements accurate task and cpu time accounting for 64-bit
powerpc kernels.  Instead of accounting a whole jiffy of time to a
task on a timer interrupt because that task happened to be running at
the time, we now account time in units of timebase ticks according to
the actual time spent by the task in user mode and kernel mode.  We
also count the time spent processing hardware and software interrupts
accurately.  This is conditional on CONFIG_VIRT_CPU_ACCOUNTING.  If
that is not set, we do tick-based approximate accounting as before.

To get this accurate information, we read either the PURR (processor
utilization of resources register) on POWER5 machines, or the timebase
on other machines on

* each entry to the kernel from usermode
* each exit to usermode
* transitions between process context, hard irq context and soft irq
  context in kernel mode
* context switches.

On POWER5 systems with shared-processor logical partitioning we also
read both the PURR and the timebase at each timer interrupt and
context switch in order to determine how much time has been taken by
the hypervisor to run other partitions ("steal" time).  Unfortunately,
since we need values of the PURR on both threads at the same time to
accurately calculate the steal time, and since we can only calculate
steal time on a per-core basis, the apportioning of the steal time
between idle time (time which we ceded to the hypervisor in the idle
loop) and actual stolen time is somewhat approximate at the moment.

This is all based quite heavily on what s390 does, and it uses the
generic interfaces that were added by the s390 developers,
i.e. account_system_time(), account_user_time(), etc.

This patch doesn't add any new interfaces between the kernel and
userspace, and doesn't change the units in which time is reported to
userspace by things such as /proc/stat, /proc/<pid>/stat, getrusage(),
times(), etc.  Internally the various task and cpu times are stored in
timebase units, but they are converted to USER_HZ units (1/100th of a
second) when reported to userspace.  Some precision is therefore lost
but there should not be any accumulating error, since the internal
accumulation is at full precision.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

c6622f63

10 2月, 2006 1 次提交

[PATCH] powerpc: trivial: modify comments to refer to new location of files · 2ef9481e

由 Jon Mason 提交于 1月 23, 2006

This patch removes all self references and fixes references to files
in the now defunct arch/ppc64 tree.  I think this accomplises
everything wanted, though there might be a few references I missed.
Signed-off-by: NJon Mason <jdmason@us.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

2ef9481e

13 1月, 2006 1 次提交

[PATCH] powerpc: Remove lppaca structure from the PACA · 3356bb9f

由 David Gibson 提交于 1月 13, 2006

At present the lppaca - the structure shared with the iSeries
hypervisor and phyp - is contained within the PACA, our own low-level
per-cpu structure.  This doesn't have to be so, the patch below
removes it, making a separate array of lppaca structures.

This saves approximately 500*NR_CPUS bytes of image size and kernel
memory, because we don't need aligning gap between the Linux and
hypervisor portions of every PACA.  On the other hand it means an
extra level of dereference in many accesses to the lppaca.

The patch also gets rid of several places where we assign the paca
address to a local variable for no particular reason.
Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

3356bb9f

11 1月, 2006 1 次提交

[PATCH] powerpc/64: per cpu data optimisations · 7a0268fa

由 Anton Blanchard 提交于 1月 11, 2006

The current ppc64 per cpu data implementation is quite slow. eg:

lhz 11,18(13) /* smp_processor_id() */
ld 9,.LC63-.LCTOC1(30) /* per_cpu__variable_name */
ld 8,.LC61-.LCTOC1(30) /* __per_cpu_offset */
sldi 11,11,3 /* form index into __per_cpu_offset */
mr 10,9
ldx 9,11,8 /* __per_cpu_offset[smp_processor_id()] */
ldx 0,10,9 /* load per cpu data */

5 loads for something that is supposed to be fast, pretty awful. One
reason for the large number of loads is that we have to synthesize 2
64bit constants (per_cpu__variable_name and __per_cpu_offset).

By putting __per_cpu_offset into the paca we can avoid the 2 loads
associated with it:

ld 11,56(13) /* paca->data_offset */
ld 9,.LC59-.LCTOC1(30) /* per_cpu__variable_name */
ldx 0,9,11 /* load per cpu data

Longer term we can should be able to do even better than 3 loads.
If per_cpu__variable_name wasnt a 64bit constant and paca->data_offset
was in a register we could cut it down to one load. A suggestion from
Rusty is to use gcc's __thread extension here. In order to do this we
would need to free up r13 (the __thread register and where the paca
currently is). So far Ive had a few unsuccessful attempts at doing that :)

The patch also allocates per cpu memory node local on NUMA machines.
This patch from Rusty has been sitting in my queue _forever_ but stalled
when I hit the compiler bug. Sorry about that.

Finally I also only allocate per cpu data for possible cpus, which comes
straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128)
and 4 possible cpus we see some nice gains:

total used free shared buffers cached
Mem: 4012228 212860 3799368 0 0 162424

total used free shared buffers cached
Mem: 4016200 212984 3803216 0 0 162424

A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
to be careful of per cpu users that touch data for !possible cpus.

At this stage it might be worth making the NUMA and possible cpu
optimisations generic, but per cpu init is done so early we have to be
careful that all architectures have their possible map setup correctly.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

7a0268fa

09 1月, 2006 3 次提交

[PATCH] powerpc: sanitize header files for user space includes · 88ced031

由 Arnd Bergmann 提交于 12月 16, 2005

include/asm-ppc/ had #ifdef __KERNEL__ in all header files that
are not meant for use by user space, include/asm-powerpc does
not have this yet.

This patch gets us a lot closer there. There are a few cases
where I was not sure, so I left them out. I have verified
that no CONFIG_* symbols are used outside of __KERNEL__
any more and that there are no obvious compile errors when
including any of the headers in user space libraries.
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

88ced031

[PATCH] powerpc: Remove some unneeded fields from the paca · 404849bb

由 David Gibson 提交于 11月 24, 2005

This patch removes several unnecessary fields from the paca:

- next_jiffy_update_tb was simply unused.  Remove trivially.

- The exdsi exception save area was not used.  There were plans to use
  it, but they never seem to have gone anywhere.  If they ever do, we
  can put it back.  Remove from the paca, and from asm-offsets.c

- The default_decr field was used from asm, but was only ever assigned
  the value of tb_ticks_per_jiffy.  Just access tb_ticks_per_jiffy from
  asm directly instead.

Built and booted on POWER5 LPAR and iSeries RS64.
Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

404849bb

[PATCH] powerpc: Remove ItLpRegSave area from the paca · 1888e7b5

由 David Gibson 提交于 11月 24, 2005

On iSeries, the paca contains, amongst other things an ItLpRegSave
structure used by the hypervisor to save registers.  The hypervisor
locates this area through a pointer at the beginning of the paca, so
the structure itself can be located elsewhere.  This patch moves the
reg_save area out into its own array.  This reduces the amount of
iSeries specific gunk which is visible to general powerpc code via
paca.h

Built and booted on POWER5 LPAR and iSeries RS64.
Signed-off-by: NDavid Gibson <dwg@au1.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

1888e7b5

10 11月, 2005 1 次提交

[PATCH] powerpc: Move various ppc64 files with no ppc32 equivalent to powerpc · 8882a4da

由 David Gibson 提交于 11月 09, 2005

This patch moves a bunch of files from arch/ppc64 and
include/asm-ppc64 which have no equivalents in ppc32 code into
arch/powerpc and include/asm-powerpc.  The file affected are:
	abs_addr.h
	compat.h
	lppaca.h
	paca.h
	tce.h
	cpu_setup_power4.S
	ioctl32.c
	firmware.c
	pacaData.c

The only changes apart from the move and corresponding Makefile
changes are:
	- #ifndef/#define in includes updated to _ASM_POWERPC_ form
	- trailing whitespace removed
	- comments giving full paths removed
	- pacaData.c renamed paca.c to remove studlyCaps
	- Misplaced { moved in lppaca.h

Built and booted on POWER5 LPAR (ARCH=powerpc and ARCH=ppc64), built
for 32-bit powermac (ARCH=powerpc).
Signed-off-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

8882a4da

07 11月, 2005 1 次提交

[PATCH] ppc64: support 64k pages · 3c726f8d

由 Benjamin Herrenschmidt 提交于 11月 07, 2005

Adds a new CONFIG_PPC_64K_PAGES which, when enabled, changes the kernel
base page size to 64K. The resulting kernel still boots on any
hardware. On current machines with 4K pages support only, the kernel
will maintain 16 "subpages" for each 64K page transparently.

Note that while real 64K capable HW has been tested, the current patch
will not enable it yet as such hardware is not released yet, and I'm
still verifying with the firmware architects the proper to get the
information from the newer hypervisors.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

3c726f8d

02 11月, 2005 1 次提交
- K
  merge filename and modify references to iseries/it_lp_reg_save.h · 59ce20bb
  由 Kelly Daly 提交于 11月 02, 2005
```
Signed-off-by: NKelly Daly <kelly@au.ibm.com>
```
  59ce20bb
30 6月, 2005 2 次提交

[PATCH] ppc64: Simplify counting of lpevents, remove lpevent_count from paca · ed094150

由 Michael Ellerman 提交于 6月 30, 2005

Currently there's a per-cpu count of lpevents processed, a per-queue (ie.
global) total count, and a count by event type.

Replace all that with a count by event for each cpu. We only need to add
it up int the proc code.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Acked-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

ed094150

[PATCH] ppc64: Remove lpqueue pointer from the paca on iSeries · bea248fb

由 Michael Ellerman 提交于 6月 30, 2005

The iSeries code keeps a pointer to the ItLpQueue in its paca struct. But
all these pointers end up pointing to the one place, ie. xItLpQueue.

So remove the pointer from the paca struct and just refer to xItLpQueue
directly where needed.

The only complication is that the spread_lpevents logic was implemented by
having a NULL lpqueue pointer in the paca on CPUs that weren't supposed to
process events. Instead we just compare the spread_lpevents value to the
processor id to get the same behaviour.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Acked-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

bea248fb

22 6月, 2005 1 次提交

[PATCH] ppc64 iSeries: tidy up some includes and HvCall.h · 4a5304f5

由 Stephen Rothwell 提交于 6月 21, 2005

This patch removes some unused bits from HvCall.h and some unneeded #includes
from other files.  Also includes ItLpQueue.h in paca.h in preference to a stub
declaration of struct ItLpQueue.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@osdl.org>
Signed-off-by: NLinus Torvalds <torvalds@osdl.org>

4a5304f5

17 4月, 2005 1 次提交

Linux-2.6.12-rc2 · 1da177e4

由 Linus Torvalds 提交于 4月 16, 2005

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

1da177e4

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功