提交 · fc53b4202e61c7e9008c241933ae282aab8a6082 · openanolis / cloud-kernel

31 7月, 2010 1 次提交

powerpc/kexec: Switch to a static PACA on the way out · fc53b420

由 Matt Evans 提交于 7月 07, 2010

With dynamic PACAs, the kexecing CPU's PACA won't lie within the kernel
static data and there is a chance that something may stomp it when preparing
to kexec.  This patch switches this final CPU to a static PACA just before
we pull the switch.
Signed-off-by: NMatt Evans <matt@ozlabs.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

fc53b420

14 7月, 2010 2 次提交

powerpc/book3e: Adjust the page sizes list based on MMU config · f2b26c92

由 Benjamin Herrenschmidt 提交于 7月 09, 2010

Use the MMU config registers to scan for available direct and
indirect page sizes and print out the result. Will be needed
for future hugetlbfs implementation.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f2b26c92

powerpc/book3e: Add generic 64-bit idle powersave support · 34d97e07

由 Benjamin Herrenschmidt 提交于 7月 14, 2010

We use a similar technique to ppc32: We set a thread local flag
to indicate that we are about to enter or have entered the stop
state, and have fixup code in the async interrupt entry code that
reacts to this flag to make us return to a different location
(sets NIP to LINK in our case).
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
--
v2. Fix lockdep bug
    Re-mask interrupts when coming back from idle

34d97e07

09 7月, 2010 12 次提交

powerpc/book3e: Resend doorbell exceptions to ourself · 850f22d5

由 Michael Ellerman 提交于 7月 09, 2010

If we are soft disabled and receive a doorbell exception we don't process
it immediately. This means we need to check on the way out of irq restore
if there are any doorbell exceptions to process.

The problem is at that point we don't know what our regs are, and that
in turn makes xmon unhappy. To workaround the problem, instead of checking
for and processing doorbells, we check for any doorbells and if there were
any we send ourselves another.
Signed-off-by: NMichael Ellerman <michael@ellerman.id.au>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

850f22d5

powerpc/book3e: More doorbell cleanups. Sample the PIR register · b9f1cd71

由 Benjamin Herrenschmidt 提交于 7月 09, 2010

The doorbells use the content of the PIR register to match messages
from other CPUs. This may or may not be the same as our linux CPU
number, so using that as the "target" is no right.

Instead, we sample the PIR register at boot on every processor
and use that value subsequently when sending IPIs.

We also use a per-cpu message mask rather than a global array which
should limit cache line contention.

Note: We could use the CPU number in the device-tree instead of
the PIR register, as they are supposed to be equivalent. This
might prove useful if doorbells are to be used to kick CPUs out
of FW at boot time, thus before we can sample the PIR. This is
however not the case now and using the PIR just works.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b9f1cd71

powerpc/book3e: Hack to get gdb moving along on Book3E 64-bit · a2e19811

由 Benjamin Herrenschmidt 提交于 7月 09, 2010

Our handling of debug interrupts on Book3E 64-bit is not quite
the way it should be just yet. This is a workaround to let gdb
work at least for now. We ensure that when context switching,
we set the appropriate DBCR0 value for the new task. We also
make sure that we turn off MSR[DE] within the kernel, and set
it as part of the bits that get set when going back to userspace.

In the long run, we will probably set the userspace DBCR0 on the
exception exit code path and ensure we have some proper kernel
value to set on the way into the kernel, a bit like ppc32 does,
but that will take more work.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

a2e19811

B
powerpc/book3e: mtmsr should not be mtmsrd on book3e 64-bit · 0866eb99
由 Benjamin Herrenschmidt 提交于 7月 09, 2010
```
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
```
0866eb99

powerpc/numa: Use form 1 affinity to setup node distance · 41eab6f8

由 Anton Blanchard 提交于 5月 16, 2010

Form 1 affinity allows multiple entries in ibm,associativity-reference-points
which represent affinity domains in decreasing order of importance. The
Linux concept of a node is always the first entry, but using the other
values as an input to node_distance() allows the memory allocator to make
better decisions on which node to go first when local memory has been
exhausted.

We keep things simple and create an array indexed by NUMA node, capped at
4 entries. Each time we lookup an associativity property we initialise
the array which is overkill, but since we should only hit this path during
boot it didn't seem worth adding a per node valid bit.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

41eab6f8

powerpc/pseries: Rename RAS_VECTOR_OFFSET to RTAS_VECTOR_EXTERNAL_INTERRUPT and move to rtas.h · b08e281b

由 Mark Nelson 提交于 5月 26, 2010

The RAS code has a #define, RAS_VECTOR_OFFSET, that's used in the
check-exception RTAS call for the vector offset of the exception.

We'll be using this same vector offset for the upcoming IO Event interrupts
code (0x500) so let's move it to include/asm/rtas.h and call it
RTAS_VECTOR_EXTERNAL_INTERRUPT.
Signed-off-by: NMark Nelson <markn@au1.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b08e281b

powerpc: Optimise per cpu accesses on 64bit · ae01f84b

由 Anton Blanchard 提交于 5月 31, 2010

Now we dynamically allocate the paca array, it takes an extra load
whenever we want to access another cpu's paca. One place we do that a lot
is per cpu variables. A simple example:

DEFINE_PER_CPU(unsigned long, vara);
unsigned long test4(int cpu)
{
	return per_cpu(vara, cpu);
}

This takes 4 loads, 5 if you include the actual load of the per cpu variable:

    ld r11,-32760(r30)  # load address of paca pointer
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,9       # get offset into paca (each entry is 512 bytes)
    ld r0,0(r11)        # load paca pointer
    add r3,r0,r3        # paca + offset
    ld r11,64(r3)       # load paca[cpu].data_offset

    ldx r3,r9,r11       # load per cpu variable

If we remove the ppc64 specific per_cpu_offset(), we get the generic one
which indexes into a statically allocated array. This removes one load and
one add:

    ld r11,-32760(r30)  # load address of __per_cpu_offset
    ld r9,-32768(r30)   # load link address of percpu variable
    sldi r3,r29,3       # get offset into __per_cpu_offset (each entry 8 bytes)
    ldx r11,r11,r3      # load __per_cpu_offset[cpu]

    ldx r3,r9,r11       # load per cpu variable

Having all the offsets in one array also helps when iterating over a per cpu
variable across a number of cpus, such as in the scheduler. Before we would
need to load one paca cacheline when calculating each per cpu offset. Now we
have 16 (128 / sizeof(long)) per cpu offsets in each cacheline.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

ae01f84b

powerpc/iseries: Fix constant warning · 51c7fdba

由 Denis Kirjanov 提交于 6月 16, 2010

Fix smatch warning: constant 0x8000000000000000 is so big it is unsigned long
Signed-off-by: NDenis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

51c7fdba

powerpc/pseries: Partition hibernation support · 32d8ad4e

由 Brian King 提交于 7月 07, 2010

Enables support for HMC initiated partition hibernation. This is
a firmware assisted hibernation, since the firmware handles writing
the memory out to disk, along with other partition information,
so we just mimic suspend to ram.
Signed-off-by: NBrian King <brking@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

32d8ad4e

powerpc/pseries: Migration code reorganization / hibernation prep · 8fe93f8d

由 Brian King 提交于 7月 07, 2010

Partition hibernation will use some of the same code as is
currently used for Live Partition Migration. This function
further abstracts this code such that code outside of rtas.c
can utilize it. It also changes the error field in the suspend
me data structure to be an atomic type, since it is set and
checked on different cpus without any barriers or locking.
Signed-off-by: NBrian King <brking@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8fe93f8d

powerpc: Clean up obsolete code relating to decrementer and timebase · c1aa687d

由 Paul Mackerras 提交于 6月 20, 2010

Since the decrementer and timekeeping code was moved over to using
the generic clockevents and timekeeping infrastructure, several
variables and functions have been obsolete and effectively unused.
This deletes them.

In particular, wakeup_decrementer() is no longer needed since the
generic code reprograms the decrementer as part of the process of
resuming the timekeeping code, which happens during sysdev resume.
Thus the wakeup_decrementer calls in the suspend_enter methods for
52xx platforms have been removed.  The call in the powermac cpu
frequency change code has been replaced by set_dec(1), which will
cause a timer interrupt as soon as interrupts are enabled, and the
generic code will then reprogram the decrementer with the correct
value.

This also simplifies the generic_suspend_en/disable_irqs functions
and makes them static since they are not referenced outside time.c.
The preempt_enable/disable calls are removed because the generic
code has disabled all but the boot cpu at the point where these
functions are called, so we can't be moved to another cpu.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c1aa687d

powerpc: Rework VDSO gettimeofday to prevent time going backwards · 8fd63a9e

由 Paul Mackerras 提交于 6月 20, 2010

Currently it is possible for userspace to see the result of
gettimeofday() going backwards by 1 microsecond, assuming that
userspace is using the gettimeofday() in the VDSO.  The VDSO
gettimeofday() algorithm computes the time in "xsecs", which are
units of 2^-20 seconds, or approximately 0.954 microseconds,
using the algorithm

	now = (timebase - tb_orig_stamp) * tb_to_xs + stamp_xsec

and then converts the time in xsecs to seconds and microseconds.

The kernel updates the tb_orig_stamp and stamp_xsec values every
tick in update_vsyscall().  If the length of the tick is not an
integer number of xsecs, then some precision is lost in converting
the current time to xsecs.  For example, with CONFIG_HZ=1000, the
tick is 1ms long, which is 1048.576 xsecs.  That means that
stamp_xsec will advance by either 1048 or 1049 on each tick.
With the right conditions, it is possible for userspace to get
(timebase - tb_orig_stamp) * tb_to_xs being 1049 if the kernel is
slightly late in updating the vdso_datapage, and then for stamp_xsec
to advance by 1048 when the kernel does update it, and for userspace
to then see (timebase - tb_orig_stamp) * tb_to_xs being zero due to
integer truncation.  The result is that time appears to go backwards
by 1 microsecond.

To fix this we change the VDSO gettimeofday to use a new field in the
VDSO datapage which stores the nanoseconds part of the time as a
fractional number of seconds in a 0.32 binary fraction format.
(Or put another way, as a 32-bit number in units of 0.23283 ns.)
This is convenient because we can use the mulhwu instruction to
convert it to either microseconds or nanoseconds.

Since it turns out that computing the time of day using this new field
is simpler than either using stamp_xsec (as gettimeofday does) or
stamp_xtime.tv_nsec (as clock_gettime does), this converts both
gettimeofday and clock_gettime to use the new field.  The existing
__do_get_tspec function is converted to use the new field and take
a parameter in r7 that indicates the desired resolution, 1,000,000
for microseconds or 1,000,000,000 for nanoseconds.  The __do_get_xsec
function is then unused and is deleted.

The new algorithm is

	now = ((timebase - tb_orig_stamp) << 12) * tb_to_xs
		+ (stamp_xtime_seconds << 32) + stamp_sec_fraction

with 'now' in units of 2^-32 seconds.  That is then converted to
seconds and either microseconds or nanoseconds with

	seconds = now >> 32
	partseconds = ((now & 0xffffffff) * resolution) >> 32

The 32-bit VDSO code also makes a further simplification: it ignores
the bottom 32 bits of the tb_to_xs value, which is a 0.64 format binary
fraction.  Doing so gets rid of 4 multiply instructions.  Assuming
a timebase frequency of 1GHz or less and an update interval of no
more than 10ms, the upper 32 bits of tb_to_xs will be at least
4503599, so the error from ignoring the low 32 bits will be at most
2.2ns, which is more than an order of magnitude less than the time
taken to do gettimeofday or clock_gettime on our fastest processors,
so there is no possibility of seeing inconsistent values due to this.

This also moves update_gtod() down next to its only caller, and makes
update_vsyscall use the time passed in via the wall_time argument rather
than accessing xtime directly.  At present, wall_time always points to
xtime, but that could change in future.
Signed-off-by: NPaul Mackerras <paulus@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

8fd63a9e

08 7月, 2010 1 次提交

powerpc: Fix userspace build of ptrace.h · bf23690b

由 Sam Ravnborg 提交于 5月 09, 2010

Build of ptrace.h failed for assembly because it
pulls in stdint.h.
Use exportable types (__u32, __u64) to avoid the dependency
on stdint.h.
Signed-off-by: NSam Ravnborg <sam@ravnborg.org>
Cc: Andrey Volkov <avolkov@varma-el.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: NDave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

bf23690b

30 6月, 2010 1 次提交

powerpc, hw_breakpoint: Tell generic code we have no instruction breakpoints · d09ec738

由 Paul Mackerras 提交于 6月 29, 2010

At present, hw_breakpoint_slots() returns 1 regardless of what
type of breakpoint is specified in the type argument.  Since we
don't define CONFIG_HAVE_MIXED_BREAKPOINTS_REGS, there are
separate values for TYPE_INST and TYPE_DATA, and hw_breakpoint_slots()
returns 1 for both, effectively advertising instruction breakpoint
support which doesn't exist.

This fixes it by making hw_breakpoint_slots return 1 for TYPE_DATA
and 0 for TYPE_INST.  This moves hw_breakpoint_slots() from the
powerpc hw_breakpoint.h to hw_breakpoint.c because the definitions
of TYPE_INST and TYPE_DATA aren't available in <asm/hw_breakpoint.h>.
They are defined in <linux/hw_breakpoint.h> but we can't include
that header in <asm/hw_breakpoint.h>, and nor can we rely on
<linux/hw_breakpoint.h> being included before <asm/hw_breakpoint.h>.
Since hw_breakpoint_slots() is only called at boot time, there is
no performance impact from making it a real function rather than
a static inline.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

d09ec738

22 6月, 2010 4 次提交

powerpc, hw_breakpoint: Discard extraneous interrupt due to accesses outside symbol length · e3e94084

由 K.Prasad 提交于 6月 15, 2010

Many a times, the requested breakpoint length can be less than the
fixed breakpoint length i.e. 8 bytes supported by PowerPC 64-bit
server (Book III S) processors.  This could lead to extraneous
interrupts resulting in false breakpoint notifications.  This
detects and discards such interrupts for non-ptrace requests.
We don't change ptrace behaviour to avoid breaking compatability.

[Suggestion from Paul Mackerras <paulus@samba.org> to add a new flag in
'struct arch_hw_breakpoint' to identify extraneous interrupts]
Signed-off-by: NK.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

e3e94084

powerpc, hw_breakpoint: Enable hw-breakpoints while handling intervening signals · 06532a67

由 K.Prasad 提交于 6月 15, 2010

A signal delivered between a hw_breakpoint_handler() and the
single_step_dabr_instruction() will not have the breakpoint active
while the signal handler is running -- the signal delivery will
set up a new MSR value which will not have MSR_SE set, so we
won't get the signal step interrupt until and unless the signal
handler returns (which it may never do).

To fix this, we restore the breakpoint when delivering a signal --
we clear the MSR_SE bit and set the DABR again.  If the signal
handler returns, the DABR interrupt will occur again when the
instruction that we were originally trying to single-step gets
re-executed.

[Paul Mackerras <paulus@samba.org> pointed out the need to do this.]
Signed-off-by: NK.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

06532a67

powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors · 5aae8a53

由 K.Prasad 提交于 6月 15, 2010

Implement perf-events based hw-breakpoint interfaces for PowerPC
64-bit server (Book III S) processors.  This allows access to a
given location to be used as an event that can be counted or
profiled by the perf_events subsystem.

This is done using the DABR (data breakpoint register), which can
also be used for process debugging via ptrace.  When perf_event
hw_breakpoint support is configured in, the perf_event subsystem
manages the DABR and arbitrates access to it, and ptrace then
creates a perf_event when it is requested to set a data breakpoint.

[Adopted suggestions from Paul Mackerras <paulus@samba.org> to
- emulate_step() all system-wide breakpoints and single-step only the
  per-task breakpoints
- perform arch-specific cleanup before unregistration through
  arch_unregister_hw_breakpoint()
]
Signed-off-by: NK.Prasad <prasad@linux.vnet.ibm.com>
Signed-off-by: NPaul Mackerras <paulus@samba.org>

5aae8a53

powerpc: Emulate most Book I instructions in emulate_step() · 0016a4cf

由 Paul Mackerras 提交于 6月 15, 2010

This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.
Signed-off-by: NPaul Mackerras <paulus@samba.org>

0016a4cf

15 6月, 2010 2 次提交

powerpc: Unconditionally enabled irq stacks · f1ba9a5b

由 Christoph Hellwig 提交于 6月 02, 2010

Irq stacks provide an essential protection from stack overflows through
external interrupts, at the cost of two additionals stacks per CPU.

Enable them unconditionally to simplify the kernel build and prevent
people from accidentally disabling them.
Signed-off-by: NChristoph Hellwig <hch@lst.de>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f1ba9a5b

powerpc: Move kdump default base address to 64MB on 64bit · b5416ca9

由 Anton Blanchard 提交于 6月 07, 2010

We are seeing boot fails on some System p machines when using the kdump
crashkernel= boot option. The default kdump base address is 32MB, so if we
reserve 256MB for kdump then we reserve all of the RMO except the first 32MB.

We really want kdump to reserve some memory in the RMO and most of it
elsewhere but that will require more significant changes. For now we can shift
the default base address to 64MB when CONFIG_PPC64 and CONFIG_RELOCATABLE are
set. This isn't quite correct since what we really care about is the kdump
kernel is relocatable, but we already make the assumption that base kernel
and kdump kernel have the same CONFIG_RELOCATABLE setting, eg:

#ifndef CONFIG_RELOCATABLE
        if (crashk_res.start != KDUMP_KERNELBASE)
                printk("Crash kernel location must be 0x%x\n",
                                KDUMP_KERNELBASE);
...

RTAS is instantiated towards the top of our RMO, so if we were to go any
higher we risk not having enough RMO memory for the kdump kernel on boxes
with a 128MB RMO.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

b5416ca9

02 6月, 2010 1 次提交

powerpc/macio: Fix probing of macio devices by using the right of match table · c2cdf6ab

由 Benjamin Herrenschmidt 提交于 6月 02, 2010

Grant patches added an of mach table to struct device_driver. However,
while he changed the macio device code to use that, he left the match
table pointer in struct macio_driver and didn't update drivers to use
the "new" one, thus breaking the probing.

This completes the change by moving all drivers to setup the "new"
one, removing all traces of the old one, and while at it (since it
changes the exact same locations), I also remove two other duplicates
from struct driver which are the name and owner fields.
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

c2cdf6ab

28 5月, 2010 2 次提交

asm-generic: remove ARCH_HAS_SG_CHAIN in scatterlist.h · 1ef04370

由 FUJITA Tomonori 提交于 5月 26, 2010

There are more architectures that don't support ARCH_HAS_SG_CHAIN than
those that support it. This removes removes ARCH_HAS_SG_CHAIN in
asm-generic/scatterlist.h and lets arhictectures to define it.

It's clearer than defining ARCH_HAS_SG_CHAIN asm-generic/scatterlist.h and
undefing it in arhictectures that don't support it.
Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

1ef04370

powerpc: use asm-generic/scatterlist.h · e32205eb

由 FUJITA Tomonori 提交于 5月 26, 2010

Signed-off-by: NFUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e32205eb

26 5月, 2010 1 次提交

Revert "endian: #define __BYTE_ORDER" · 13da9e20

由 Linus Torvalds 提交于 5月 26, 2010

This reverts commit b3b77c8c, which was
also totally broken (see commit 0d2daf5c that reverted the crc32
version of it).  As reported by Stephen Rothwell, it causes problems on
big-endian machines:

> In file included from fs/jfs/jfs_types.h:33,
>                  from fs/jfs/jfs_incore.h:26,
>                  from fs/jfs/file.c:22:
> fs/jfs/endian24.h:36:101: warning: "__LITTLE_ENDIAN" is not defined

The kernel has never had that crazy "__BYTE_ORDER == __LITTLE_ENDIAN"
model.  It's not how we do things, and it isn't how we _should_ do
things.  So don't go there.
Requested-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

13da9e20

25 5月, 2010 3 次提交

endian: #define __BYTE_ORDER · b3b77c8c

由 Joakim Tjernlund 提交于 5月 24, 2010

Linux does not define __BYTE_ORDER in its endian header files which makes
some header files bend backwards to get at the current endian.  Lets
#define __BYTE_ORDER in big_endian.h/litte_endian.h to make it easier for
header files that are used in user space too.

In userspace the convention is that

  1. _both_ __LITTLE_ENDIAN and __BIG_ENDIAN are defined,
  2. you have to test for e.g. __BYTE_ORDER == __BIG_ENDIAN.
Signed-off-by: NJoakim Tjernlund <Joakim.Tjernlund@transmode.se>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b3b77c8c

spi/mpc5121: Add SPI master driver for MPC5121 PSC · 6e27388f

由 Anatolij Gustschin 提交于 4月 30, 2010

Signed-off-by: NJohn Rigby <jcrigby@gmail.com>
Signed-off-by: NAnatolij Gustschin <agust@denx.de>
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

6e27388f

powerpc/kexec: Add support for FSL-BookE · b3df895a

由 Sebastian Andrzej Siewior 提交于 4月 04, 2010

This adds support kexec on FSL-BookE where the MMU can not be simply
switched off. The code borrows the initial MMU-setup code to create the
identical mapping mapping. The only difference to the original boot code
is the size of the mapping(s) and the executeable address.
The kexec code maps the first 2 GiB of memory in 256 MiB steps. This
should work also on e500v1 boxes.
SMP support is still not available.

(Kumar: Added minor change to build to ifdef CONFIG_PPC_STD_MMU_64 some
code that was PPC64 specific)
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

b3df895a

22 5月, 2010 1 次提交

arch/powerpc: Move dma_mask from of_device into pdev_archdata · cb6dc512

由 Grant Likely 提交于 4月 13, 2010

By moving dma_mask into pdev_archdata, and adding archdata to
struct of_device, it makes it possible to substitute of_device
with struct platform_device, which is a stepping stone to
removing the of_platform bus entirely.
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

cb6dc512

21 5月, 2010 5 次提交

powerpc/e500mc: Implement machine check handler. · fe04b112

由 Scott Wood 提交于 4月 08, 2010

Most of the MSCR bit assigments are different in e500mc versus
e500, and they are now write-one-to-clear.

Some e500mc machine check conditions are made recoverable (as long as
they aren't stuck on), most notably L1 instruction cache parity errors.
Signed-off-by: NScott Wood <scottwood@freescale.com>
Signed-off-by: NKumar Gala <galak@kernel.crashing.org>

fe04b112

powerpc/numa: Set a smaller value for RECLAIM_DISTANCE to enable zone reclaim · 56608209

由 Anton Blanchard 提交于 5月 16, 2010

I noticed /proc/sys/vm/zone_reclaim_mode was 0 on a ppc64 NUMA box. It gets
enabled via this:

        /*
         * If another node is sufficiently far away then it is better
         * to reclaim pages in a zone before going off node.
         */
        if (distance > RECLAIM_DISTANCE)
                zone_reclaim_mode = 1;

Since we use the default value of 20 for REMOTE_DISTANCE and 20 for
RECLAIM_DISTANCE it never kicks in.

The local to remote bandwidth ratios can be quite large on System p
machines so it makes sense for us to reclaim clean pagecache locally before
going off node.

The patch below sets a smaller value for RECLAIM_DISTANCE and thus enables
zone reclaim.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

56608209

powerpc/kexec: Fix race in kexec shutdown · 1fc711f7

由 Michael Neuling 提交于 5月 13, 2010

In kexec_prepare_cpus, the primary CPU IPIs the secondary CPUs to
kexec_smp_down().  kexec_smp_down() calls kexec_smp_wait() which sets
the hw_cpu_id() to -1.  The primary does this while leaving IRQs on
which means the primary can take a timer interrupt which can lead to
the IPIing one of the secondary CPUs (say, for a scheduler re-balance)
but since the secondary CPU now has a hw_cpu_id = -1, we IPI CPU
-1... Kaboom!

We are hitting this case regularly on POWER7 machines.

There is also a second race, where the primary will tear down the MMU
mappings before knowing the secondaries have entered real mode.

Also, the secondaries are clearing out any pending IPIs before
guaranteeing that no more will be received.

This changes kexec_prepare_cpus() so that we turn off IRQs in the
primary CPU much earlier.  It adds a paca flag to say that the
secondaries have entered the kexec_smp_down() IPI and turned off IRQs,
rather than overloading hw_cpu_id with -1.  This new paca flag is
again used to in indicate when the secondaries has entered real mode.

It also ensures that all CPUs have their IRQs off before we clear out
any pending IPI requests (in kexec_cpu_down()) to ensure there are no
trailing IPIs left unacknowledged.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

1fc711f7

powerpc/pseries: Add hcall to read 4 ptes at a time in real mode · f90ece28

由 Michael Neuling 提交于 5月 10, 2010

This adds plpar_pte_read_4_raw() which can be used read 4 PTEs from
PHYP at a time, while in real mode.

It also creates a new hcall9 which can be used in real mode.  It's the
same as plpar_hcall9 but minus the tracing hcall statistics which may
require variables outside the RMO.
Signed-off-by: NMichael Neuling <mikey@neuling.org>
Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>

f90ece28

kdb: core for kgdb back end (2 of 2) · 67fc4e0c

由 Jason Wessel 提交于 5月 20, 2010

This patch contains the hooks and instrumentation into kernel which
live outside the kernel/debug directory, which the kdb core
will call to run commands like lsmod, dmesg, bt etc...

CC: linux-arch@vger.kernel.org
Signed-off-by: NJason Wessel <jason.wessel@windriver.com>
Signed-off-by: NMartin Hicks <mort@sgi.com>

67fc4e0c

19 5月, 2010 3 次提交

panic: Allow warnings to set different taint flags · b2be0527

由 Ben Hutchings 提交于 4月 03, 2010

WARN() is used in some places to report firmware or hardware bugs that
are then worked-around.  These bugs do not affect the stability of the
kernel and should not set the flag for TAINT_WARN.  To allow for this,
add WARN_TAINT() and WARN_TAINT_ONCE() macros that take a taint number
as argument.

Architectures that implement warnings using trap instructions instead
of calls to warn_slowpath_*() now implement __WARN_TAINT(taint)
instead of __WARN().
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Acked-by: NHelge Deller <deller@gmx.de>
Tested-by: NPaul Mundt <lethal@linux-sh.org>
Signed-off-by: NDavid Woodhouse <David.Woodhouse@intel.com>

b2be0527

of: eliminate of_device->node and dev_archdata->{of,prom}_node · 58f9b0b0

由 Grant Likely 提交于 4月 13, 2010

This patch eliminates the node pointer from struct of_device and the
of_node (or prom_node) pointer from struct dev_archdata since the node
pointer is now part of struct device proper when CONFIG_OF is set, and
all users of the old pointer locations have already been converted over
to use device->of_node.

Also remove dev_archdata_{get,set}_node() as it is no longer used by
anything.
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

58f9b0b0

of: Always use 'struct device.of_node' to get device node pointer. · 61c7a080

由 Grant Likely 提交于 4月 13, 2010

The following structure elements duplicate the information in
'struct device.of_node' and so are being eliminated.  This patch
makes all readers of these elements use device.of_node instead.

(struct of_device *)->node
(struct dev_archdata *)->prom_node (sparc)
(struct dev_archdata *)->of_node (powerpc & microblaze)
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

61c7a080

17 5月, 2010 1 次提交

KVM: PPC: Enable native paired singles · b83d4a9c

由 Alexander Graf 提交于 4月 20, 2010

When we're on a paired single capable host, we can just always enable
paired singles and expose them to the guest directly.

This approach breaks when multiple VMs run and access PS concurrently,
but this should suffice until we get a proper framework for it in Linux.
Signed-off-by: NAlexander Graf <agraf@suse.de>
Signed-off-by: NAvi Kivity <avi@redhat.com>

b83d4a9c

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功