1. 20 March 2006, 23 commits
    • [SPARC64]: Fix bugs in SUN4V cpu mondo dispatch. · b830ab66
      David S. Miller committed
      There were several bugs in the SUN4V cpu mondo dispatch code.
      
      In fact, if we ever got an EWOULDBLOCK or other error from
      the hypervisor call, we'd potentially send a cpu mondo multiple
      times to the same cpu and, even worse, we could loop until the
      timeout resending the same mondo over and over to such cpus.
      
      So let's bulletproof this thing as follows:
      
      1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
         in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h
      
      2) Don't build and update the cpulist using inline functions, this
         was causing the cpu mask to not get updated in the caller.
      
      3) Disable interrupts during the entire mondo send, otherwise our
         cpu list and/or mondo block could get overwritten if we take
         an interrupt and do a cpu mondo send on the current cpu.
      
      4) Check for all possible error return types from the cpu_mondo_send()
         hypervisor call.  In particular:
      
         HV_EOK) Our work is done, all cpus have received the mondo.
         HV_CPUERROR) One or more of the cpus in the cpu list we passed
                      to the hypervisor are in error state.  Use cpu_state()
                      calls over the entries in the cpu list to see which
                      ones.  Record them in "error_mask" and report this
                      after we are done sending the mondo to cpus which are
                      not in error state.
         HV_EWOULDBLOCK) We need to keep trying.
      
         Any other error we consider fatal, we report the event and exit
         immediately.
      
      5) We only timeout if forward progress is not made.  Forward progress
         is defined as having at least one cpu get the mondo successfully
         in a given cpu_mondo_send() call.  Otherwise we bump a counter
         and delay a little.  If the counter hits a limit, we signal an
         error and report the event.
      
      Also, smp_call_function_mask() error handling reports the number
      of cpus incorrectly.
      Signed-off-by: David S. Miller <davem@davemloft.net>
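      The retry policy above can be modeled in a few lines of C.  This is a
      user-space sketch with stubbed hypervisor calls and made-up error
      constants; it only mirrors the error-mask and forward-progress logic,
      and omits the partial-delivery accounting the real code also does.

        /* Stand-ins for asm/hypervisor.h values; numbers are illustrative only. */
        #define HV_EOK          0
        #define HV_EWOULDBLOCK  1
        #define HV_ECPUERROR    2

        #define RETRY_LIMIT     10000

        static int cpu_mondo_send(int *cpu_list, int cnt) { (void)cpu_list; (void)cnt; return HV_EOK; } /* stub */
        static int cpu_in_error(int cpu)                  { (void)cpu; return 0; }                      /* stub */

        /* Returns 0 on success, -1 on fatal error or lack of forward progress. */
        static int send_cpu_mondo(int *cpu_list, int cnt, unsigned long *error_mask)
        {
                int retries = 0, prev_cnt = cnt;

                while (cnt > 0) {
                        int status = cpu_mondo_send(cpu_list, cnt);

                        if (status == HV_EOK)
                                break;                  /* everyone got the mondo */

                        if (status == HV_ECPUERROR) {
                                /* Drop cpus reported in error; remember them for
                                 * the caller (toy mask; assumes cpu ids < 64). */
                                int i, keep = 0;
                                for (i = 0; i < cnt; i++) {
                                        if (cpu_in_error(cpu_list[i]))
                                                *error_mask |= 1UL << cpu_list[i];
                                        else
                                                cpu_list[keep++] = cpu_list[i];
                                }
                                cnt = keep;
                        } else if (status != HV_EWOULDBLOCK) {
                                return -1;              /* anything else is fatal */
                        }

                        /* Forward progress here = the remaining list shrank. */
                        if (cnt < prev_cnt) {
                                prev_cnt = cnt;
                                retries = 0;
                        } else if (++retries > RETRY_LIMIT) {
                                return -1;              /* stuck: give up and report */
                        }
                        /* the real code also delays briefly before retrying */
                }
                return 0;
        }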
    • [SPARC64]: Fix bugs in SMP TLB context version expiration handling. · aac0aadf
      David S. Miller committed
      1) We must flush the TLB, duh.
      
      2) Even if the sw context was seen to be valid, the local cpu's
         hw context can be out of date, so reload it unconditionally.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Report mondo error correctly in hypervisor_xcall_deliver(). · 6cc80cfa
      David S. Miller committed
      It's in "arg0" not "func".
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Fix TLB context allocation with SMT style shared TLBs. · a0663a79
      David S. Miller committed
      The context allocation scheme we use depends upon there being a 1<-->1
      mapping from cpu to physical TLB for correctness.  Chips like Niagara
      break this assumption.
      
      So what we do is notify all cpus with a cross call when the context
      version number changes, and if necessary this makes them allocate
      a valid context for the address space they are running in at the time.
      
      Stress tested with make -j1024, make -j2048, and make -j4096 kernel
      builds on a 32-strand, 8-core T2000 with 16GB of RAM.
      Signed-off-by: David S. Miller <davem@davemloft.net>
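      A toy model of the version-tagged context scheme: contexts carry a
      generation number, and the cross-call handler simply compares its
      active mm's generation against the global one.  The bit split and
      helper names are assumptions for illustration, not the kernel's.

        #include <stdio.h>

        #define CTX_NR_BITS 13                                  /* assumed split */
        #define CTX_VERSION(c) ((c) >> CTX_NR_BITS)

        struct mm { unsigned long context; };

        static unsigned long tlb_context_version = 2UL << CTX_NR_BITS; /* current generation */
        static unsigned long next_ctx_nr = 1;

        /* Runs on every cpu via cross call after the version number changes. */
        static void check_and_switch_context(struct mm *active_mm)
        {
                if (CTX_VERSION(active_mm->context) != CTX_VERSION(tlb_context_version)) {
                        /* Stale generation: allocate a context in the new
                         * generation; the real handler would then reload the
                         * hw context register and flush the local TLB. */
                        active_mm->context = tlb_context_version | next_ctx_nr++;
                }
        }

        int main(void)
        {
                struct mm m = { .context = (1UL << CTX_NR_BITS) | 7 }; /* old generation */
                check_and_switch_context(&m);
                printf("context now %#lx (generation %lu)\n",
                       m.context, CTX_VERSION(m.context));
                return 0;
        }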
    • [SPARC64]: Kill cpudata->idle_volume. · 1bd0cd74
      David S. Miller committed
      Set, but never used.
      
      We used to use this for dynamic IRQ retargetting, but that
      code died a long time ago.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Get SUN4V SMP working. · 72aff53f
      David S. Miller committed
      The sibling cpu bringup is extremely fragile.  We can only
      perform the most basic calls until we take over the trap
      table from the firmware/hypervisor on the new cpu.
      
      This means no accesses to %g4, %g5, %g6 since those can't be
      TLB translated without our trap handlers.
      
      In order to achieve this:
      
      1) Change sun4v_init_mondo_queues() so that it can operate in
         several modes.
      
         It can allocate the queues, or install them in the current
         processor, or both.
      
         The boot cpu does both in its call early on.
      
         Later, the boot cpu allocates the sibling cpu queue, starts
         the sibling cpu, then the sibling cpu loads them in.
      
      2) init_cur_cpu_trap() is changed to take the current_thread_info()
         as an argument instead of reading %g6 directly on the current
         cpu.
      
      3) Create a trampoline stack for the sibling cpus.  We do our basic
         kernel calls using this stack, which is locked into the kernel
         image, then go to our proper thread stack after taking over the
         trap table.
      
      4) While we are in this delicate startup state, we put 0xdeadbeef
         into %g4/%g5/%g6 in order to catch accidental accesses.
      
      5) On the final prom_set_trap_table*() call, we put &init_thread_union
         into %g6.  This is a hack to make prom_world(0) work.  All that
         wants to do is restore the %asi register using
         get_thread_current_ds().
      
      Longer term we should just do the OBP calls to set the trap table by
      hand just like we do for everything else.  This would avoid that silly
      prom_world(0) issue, then we can remove the init_thread_union hack.
      Signed-off-by: David S. Miller <davem@davemloft.net>
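      The two-phase queue setup described in point 1 can be pictured with a
      pair of flags: one phase allocates the queue memory, the other
      registers it with the hypervisor on the cpu that will use it.  Names,
      sizes, and the register step below are illustrative stand-ins.

        #include <stdbool.h>
        #include <stdlib.h>

        #define NCPUS      64          /* assumed for the sketch */
        #define QUEUE_SIZE 8192        /* assumed queue page size */

        struct mondo_queues { void *cpu_mondo, *dev_mondo, *resum_err, *nonresum_err; };

        static struct mondo_queues queues[NCPUS];

        static void register_queues_with_hv(int cpu) { (void)cpu; /* hypervisor calls elided */ }

        /* alloc: build the queues for 'cpu'; load: register them on the calling cpu. */
        static void init_mondo_queues(int cpu, bool alloc, bool load)
        {
                if (alloc) {
                        queues[cpu].cpu_mondo    = malloc(QUEUE_SIZE);
                        queues[cpu].dev_mondo    = malloc(QUEUE_SIZE);
                        queues[cpu].resum_err    = malloc(QUEUE_SIZE);
                        queues[cpu].nonresum_err = malloc(QUEUE_SIZE);
                }
                if (load)
                        register_queues_with_hv(cpu);
        }

        /*
         * Boot cpu, early:    init_mondo_queues(boot_cpu, true, true);
         * Bringing up cpu N:  boot cpu calls init_mondo_queues(N, true, false),
         *                     then cpu N itself calls init_mondo_queues(N, false, true).
         */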
    • [SPARC64]: Add prom_{start,stop}cpu_cpuid(). · 7890f794
      David S. Miller committed
      Use prom_startcpu_cpuid() on SUN4V instead of prom_startcpu().
      
      We should really test for "SUNW,start-cpu-by-cpuid" presence
      and use it if present even on SUN4U.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • f03b8a54
    • [SPARC64]: Kill sun4v_register_fault_status() on SMP. · 4a07e646
      David S. Miller committed
      That now gets done as a side effect of taking over the
      trap table from OBP.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Do not try to synchronize %stick registers on SUN4V. · 02fead75
      David S. Miller committed
      Writes by privileged code are not allowed.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Fix mondo queue allocations. · b5a37e96
      David S. Miller committed
      We have to use bootmem during init_IRQ and the page allocator
      for sibling cpu calls.
      
      Also, fix incorrect hypervisor call return value
      checks in the hypervisor SMP cpu mondo send code.
      Signed-off-by: David S. Miller <davem@davemloft.net>
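      The allocation rule above amounts to choosing the allocator by boot
      phase.  A sketch with user-space stand-ins (not the kernel bootmem or
      page-allocator APIs):

        #include <stdbool.h>
        #include <stdlib.h>

        static bool page_allocator_up;                   /* false while init_IRQ runs */

        static void *bootmem_alloc(size_t sz) { return malloc(sz); }  /* stand-in */
        static void *page_alloc(size_t sz)    { return malloc(sz); }  /* stand-in */

        /* Queue pages requested at init_IRQ time must come from bootmem; queues
         * for a sibling cpu are allocated later, once the page allocator is up. */
        static void *alloc_queue_page(size_t sz)
        {
                return page_allocator_up ? page_alloc(sz) : bootmem_alloc(sz);
        }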
    • [SPARC64]: Register kernel TSB with hypervisor. · 490384e7
      David S. Miller committed
      We do this right after we take over the trap table from OBP.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Fix hypervisor call arg passing. · 164c220f
      David S. Miller committed
      The function number goes in %o5, args go in %o0 --> %o4.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Sun4v cross-call sending support. · 1d2f1f90
      David S. Miller committed
      Technically the hypervisor call supports sending in a list
      of all cpus to get the cross-call, but I only pass in one
      cpu at a time for now.
      
      The multi-cpu support is there, just ifdef'd out so it's easy to
      enable or delete it later.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Add some hypervisor tlb_type checks. · a43fe0e7
      David S. Miller committed
      And more consistently check cheetah{,_plus} instead
      of assuming anything not spitfire is cheetah{,_plus}.
      Signed-off-by: David S. Miller <davem@davemloft.net>
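      The pattern the commit moves toward is an explicit three-way check on
      the chip/TLB type rather than "not spitfire implies cheetah".  A
      compact illustration, with the enum reduced to a sketch:

        enum chip_type { type_spitfire, type_cheetah, type_cheetah_plus, type_hypervisor };

        static enum chip_type tlb_type = type_hypervisor;

        static void flush_dispatch(void)
        {
                if (tlb_type == type_hypervisor) {
                        /* sun4v path: go through hypervisor services */
                } else if (tlb_type == type_cheetah || tlb_type == type_cheetah_plus) {
                        /* cheetah / cheetah+ MMU path */
                } else {
                        /* spitfire path */
                }
        }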
    • [SPARC64]: Refine code sequences to get the cpu id. · 92704a1c
      David S. Miller committed
      On uniprocessor, it's always zero, so optimize for that.
      
      On SMP, the jmpl to the stub kills the return address stack in the cpu
      branch prediction logic, so expand the code sequence inline and use a
      code patching section to fix things up.  This also allows better and
      explicit register selection, which will be taken advantage of in a
      future changeset.
      
      The hard_smp_processor_id() function is big, so do not inline it.
      
      Fix up tests for Jalapeno to also test for Serrano chips.  These
      tests want "jbus Ultra-IIIi" cases to match, so that is what we should
      test for.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Kill PROM locked TLB entry preservation code. · 3487d1d4
      David S. Miller committed
      It is totally unnecessary complexity.  After we take over
      the trap table, we handle all PROM tlb misses fully.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Kill {save,restore}_alternate_globals() · 96c6e0d8
      David S. Miller committed
      No longer needed now that we no longer have hard-coded
      alternate global register usage.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64]: Dynamically grow TSB in response to RSS growth. · bd40791e
      David S. Miller committed
      As the RSS grows, grow the TSB in order to reduce the likelihood
      of hash collisions and thus poor hit rates in the TSB.
      
      This definitely needs some serious tuning.
      Signed-off-by: David S. Miller <davem@davemloft.net>
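      A back-of-the-envelope sizing rule: an 8KB TSB holds 512 16-byte
      entries, so it covers roughly 512 * 8KB = 4MB of resident pages before
      collisions dominate.  The sketch below grows the TSB by powers of two
      once the RSS outruns the entry count; the cap and the policy are
      placeholders, matching the commit's note that tuning is still needed.

        #include <stdio.h>

        #define TSB_ENTRY_BYTES  16UL
        #define BASE_TSB_BYTES   8192UL
        #define MAX_TSB_BYTES    (1UL << 20)     /* assumed cap for the sketch */
        #define PAGE_BYTES       8192UL          /* 8KB base pages */

        static unsigned long tsb_bytes_for_rss(unsigned long rss_pages)
        {
                unsigned long bytes = BASE_TSB_BYTES;

                /* Grow while there are more resident pages than TSB entries. */
                while (bytes < MAX_TSB_BYTES && rss_pages > bytes / TSB_ENTRY_BYTES)
                        bytes <<= 1;
                return bytes;
        }

        int main(void)
        {
                unsigned long rss_pages = (16UL << 20) / PAGE_BYTES;   /* 16MB RSS */
                printf("16MB RSS -> %lu-byte TSB\n", tsb_bytes_for_rss(rss_pages));
                return 0;
        }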
    • [SPARC64]: Eliminate all usage of hard-coded trap globals. · 56fb4df6
      David S. Miller committed
      UltraSPARC has special sets of global registers which are switched to
      for certain trap types.  There is one set for MMU related traps, one
      set for Interrupt Vector processing, and another set (called the
      Alternate globals) for all other trap types.
      
      For what seems like forever we've hard coded the values in some of
      these trap registers.  Some examples include:
      
      1) Interrupt Vector global %g6 holds the current processor's interrupt
         work struct where received interrupts are managed for IRQ handler
         dispatch.
      
      2) MMU global %g7 holds the base of the page tables of the currently
         active address space.
      
      3) Alternate global %g6 held the current_thread_info() value.
      
      Such hardcoding has resulted in some serious issues in many areas.
      There are some code sequences where having another register available
      would help clean up the implementation.  Taking traps such as
      cross-calls from the OBP firmware requires some tricky code sequences
      wherein we have to save away and restore all of the special sets of
      global registers when we enter/exit OBP.
      
      We were also using the IMMU TSB register on SMP to hold the per-cpu
      area base address, which doesn't work any longer now that we actually
      use the TSB facility of the cpu.
      
      The implementation is pretty straightforward.  One tricky bit is
      getting the current processor ID as that is different on different cpu
      variants.  We use a stub with a fancy calling convention which we
      patch at boot time.  The calling convention is that the stub is
      branched to and the (PC - 4) to return to is in register %g1.  The cpu
      number is left in %g6.  This stub can be invoked by using the
      __GET_CPUID macro.
      
      We use an array of per-cpu trap state to store the current thread and
      physical address of the current address space's page tables.  The
      TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this
      table, it uses __GET_CPUID and also clobbers %g1.
      
      TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load
      the current processor's IRQ software state into %g6.  It also uses
      __GET_CPUID and clobbers %g1.
      
      Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the
      current address space's page tables into %g7, it clobbers %g1 and uses
      __GET_CPUID.
      
      Many refinements are possible, as well as some tuning, with this stuff
      in place.
      Signed-off-by: David S. Miller <davem@davemloft.net>
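      In C terms, the per-cpu trap state described above is just an array
      indexed by cpu id; the assembly macros each load one field into the
      fixed global register.  Field and array names below are a sketch, not
      the kernel's actual definitions.

        #define NR_CPUS_SKETCH 64

        struct trap_percpu_state {
                unsigned long thread;    /* current_thread_info(); TRAP_LOAD_THREAD_REG -> %g6 */
                unsigned long pgd_paddr; /* physical pgd base;      TRAP_LOAD_PGD_PHYS   -> %g7 */
                unsigned long irq_work;  /* per-cpu IRQ work state; TRAP_LOAD_IRQ_WORK   -> %g6 */
        };

        static struct trap_percpu_state trap_state[NR_CPUS_SKETCH];

        /* The macro sequences then reduce to loads like this one, with the cpu id
         * obtained via the patched __GET_CPUID stub and %g1 used as scratch. */
        static unsigned long trap_load_thread(int cpu)
        {
                return trap_state[cpu].thread;
        }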
    • [SPARC64]: Move away from virtual page tables, part 1. · 74bf4312
      David S. Miller committed
      We now use the TSB hardware assist features of the UltraSPARC
      MMUs.
      
      SMP is currently knowingly broken, we need to find another place
      to store the per-cpu base pointers.  We hid them away in the TSB
      base register, and that obviously will not work any more :-)
      
      Another known broken case is non-8KB base page size.
      
      Also noticed that flush_tlb_all() is not referenced anywhere, only
      the internal __flush_tlb_all() (local cpu only) is used by the
      sparc64 port, so we can get rid of flush_tlb_all().
      
      The kernel gets its own 8KB TSB (swapper_tsb) and each address space
      gets its own private 8KB TSB.  Later we can add code to dynamically
      increase the size of per-process TSB as the RSS grows.  An 8KB TSB is
      good enough for up to about a 4MB RSS, after which the TSB starts to
      incur many capacity and conflict misses.
      
      We even accumulate OBP translations into the kernel TSB.
      
      Another area for refinement is large page size support.  We could use
      a secondary address space TSB to handle those.
      Signed-off-by: David S. Miller <davem@davemloft.net>
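      A toy, software-only model of what the TSB hardware assist buys: a
      direct-mapped table of (tag, data) pairs that is probed before falling
      back to a full page-table walk.  The entry layout, hash, and sizes are
      simplified placeholders, not the real TTE format.

        #include <stdint.h>
        #include <stdio.h>

        #define PAGE_SHIFT   13                /* 8KB pages */
        #define TSB_ENTRIES  512               /* 8KB TSB / 16-byte entries */

        struct tsb_entry { uint64_t tag; uint64_t data; };

        static struct tsb_entry tsb[TSB_ENTRIES];

        /* Returns 1 on a TSB hit and fills *pa; 0 means walk the page tables. */
        static int tsb_lookup(uint64_t va, uint64_t *pa)
        {
                uint64_t vpn = va >> PAGE_SHIFT;
                struct tsb_entry *e = &tsb[vpn % TSB_ENTRIES];  /* direct-mapped index */

                if (e->tag != vpn)
                        return 0;
                *pa = e->data + (va & ((1UL << PAGE_SHIFT) - 1));
                return 1;
        }

        int main(void)
        {
                uint64_t va = 0x12340000UL, pa = 0;

                tsb[(va >> PAGE_SHIFT) % TSB_ENTRIES] =
                        (struct tsb_entry){ .tag = va >> PAGE_SHIFT, .data = 0xdead0000UL };
                printf("hit=%d pa=%#llx\n", tsb_lookup(va, &pa), (unsigned long long)pa);
                return 0;
        }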
  2. 27 February 2006, 1 commit
  3. 13 January 2006, 1 commit
  4. 12 November 2005, 1 commit
  5. 09 November 2005, 2 commits
    • [PATCH] sched: resched and cpu_idle rework · 64c7c8f8
      Nick Piggin committed
      Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
      confusion, and make their semantics rigid.  Improves efficiency of
      resched_task and some cpu_idle routines.
      
      * In resched_task:
      - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
        and as we hold it during resched_task, there is no need for an
        atomic test and set there. The only other time this should be set is
        when the task's quantum expires, in the timer interrupt - this is
        protected against because the rq lock is irq-safe.
      
      - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
        won't get unset until the task gets schedule()d off.
      
      - If we are running on the same CPU as the task we resched, then set
        TIF_NEED_RESCHED and no further action is required.
      
      - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
        after TIF_NEED_RESCHED has been set, then we need to send an IPI.
      
      Using these rules, we are able to remove the test and set operation in
      resched_task, and make clear the previously vague semantics of
      POLLING_NRFLAG.
      
      * In idle routines:
      - Enter cpu_idle with preempt disabled. When the need_resched() condition
        becomes true, explicitly call schedule(). This makes things a bit clearer
        (IMO), but I haven't updated all architectures yet.
      
      - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
        to the resched_task rules, this isn't needed (and actually breaks the
        assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
        held). So remove that. Generally one less locked memory op when switching
        to the idle thread.
      
      - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner
        most polling idle loops. The above resched_task semantics allow it to be
        set until before the last time need_resched() is checked before going into
        a halt requiring interrupt wakeup.
      
        Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
        can always be left set, completely eliminating resched IPIs when rescheduling
        the idle task.
      
        POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
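      The decision rules above condense to a short function.  This is a
      plain-C restatement with stand-in flag fields and a stubbed IPI, not
      the scheduler's actual data structures:

        #include <stdbool.h>

        struct task { int cpu; bool need_resched; bool polling; };

        static int  this_cpu(void)              { return 0; }
        static void send_resched_ipi(int cpu)   { (void)cpu; /* IPI elided */ }

        /* Caller holds the task's runqueue lock, so no atomic test-and-set is needed. */
        static void resched_task(struct task *p)
        {
                if (p->need_resched)
                        return;                 /* already pending: nothing more to do */

                p->need_resched = true;

                if (p->cpu == this_cpu())
                        return;                 /* local cpu notices on its own */

                /* Remote cpu: only interrupt it if it is not polling the flag. */
                if (!p->polling)
                        send_resched_ipi(p->cpu);
        }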
    • [PATCH] sched: disable preempt in idle tasks · 5bfb5d69
      Nick Piggin committed
      Run idle threads with preempt disabled.
      
      Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
      How did it ever work before?
      
      Might fix the CPU hotplugging hang which Nigel Cunningham noted.
      
      We think the bug hits if the idle thread is preempted after checking
      need_resched() and before going to sleep, then the CPU is offlined.
      
      After calling stop_machine_run, the CPU eventually returns from preemption and
      into the idle thread and goes to sleep.  The CPU will continue executing
      the previous idle loop and have no chance to call play_dead.
      
      By disabling preemption until we are ready to explicitly schedule, this bug is
      fixed and the idle threads generally become more robust.
      
      From: alexs <ashepard@u.washington.edu>
      
        PPC build fix
      
      From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      
        MIPS build fix
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
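      The resulting idle-loop shape, with all kernel primitives replaced by
      no-op stand-ins: preemption stays disabled except around the explicit
      schedule() call, and the need_resched() poll happens with it disabled.

        #include <stdbool.h>

        static bool need_resched(void)              { return true; }  /* stub */
        static void preempt_enable_no_resched(void) { }               /* stand-in */
        static void preempt_disable(void)           { }               /* stand-in */
        static void schedule(void)                  { }               /* stand-in */
        static void halt_until_interrupt(void)      { }               /* stand-in */

        static void cpu_idle(void)      /* entered with preemption already disabled */
        {
                for (;;) {
                        /* Poll (or sleep) until someone sets the resched flag;
                         * POLLING_NRFLAG can stay set the whole time here. */
                        while (!need_resched())
                                halt_until_interrupt();

                        preempt_enable_no_resched();
                        schedule();
                        preempt_disable();
                }
        }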
  6. 08 November 2005, 2 commits
    • [SPARC64] mm: Do not flush TLB mm in tlb_finish_mmu() · 62dbec78
      David S. Miller committed
      It isn't needed any longer, as noted by Hugh Dickins.
      
      We still need the flush routines, due to the one remaining
      call site in hugetlb_prefault_arch_hook().  That can be
      eliminated at some later point, however.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • [SPARC64] mm: context switch ptlock · dedeb002
      Hugh Dickins committed
      sparc64 is unique among architectures in taking the page_table_lock in
      its context switch (well, cris does too, but erroneously, and it's not
      yet SMP anyway).
      
      This seems to be a private affair between switch_mm and activate_mm,
      using page_table_lock as a per-mm lock, without any relation to its uses
      elsewhere.  That's fine, but comment it as such; and unlock sooner in
      switch_mm, more like in activate_mm (preemption is disabled here).
      
      There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
      have liked to rely on the page_table_lock, in switch_mm and elsewhere;
      but its comment explains how dup_mmap's flush_tlb_mm defeated it.  And
      though that could have been changed at any time over the past few years,
      now the chance vanishes as we push the page_table_lock downwards, and
      perhaps split it per page table page.  Just delete that block of code.
      
      Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
      in kernel/fork.c copy_mm.  Textual analysis (supported by Nick Piggin)
      suggests that the comment was written by DaveM, and that it relates to
      the defeated approach in the sparc64 smp_flush_tlb_pending.  Just delete
      this block too.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. 15 October 2005, 1 commit
    • [SPARC64]: Fix powering off on SMP. · b4d1b825
      David S. Miller committed
      Doing a "SUNW,stop-self" firmware call on the other cpus is not the
      correct thing to do when dropping into the firmware for a halt,
      reboot, or power-off.
      
      For now, just do nothing to quiet the other cpus, as the system should
      be quiescent enough.  Later we may decide to implement smp_send_stop()
      like the other SMP platforms do.
      
      Based upon a report from Christopher Zimmermann.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 26 September 2005, 1 commit
    • [SPARC64]: Probe D/I/E-cache config and use. · 80dc0d6b
      David S. Miller committed
      At boot time, determine the D-cache, I-cache and E-cache size and
      line-size.  Use them in cache flushes when appropriate.
      
      This change was motivated by discovering that the D-cache on
      UltraSPARC-IIIi and later is 64K, not 32K, and the flushes done by the
      Cheetah error handlers were assuming a 32K size.
      
      There are still some pieces of code that are hard coding things and
      will need to be fixed up at some point.
      
      While we're here, fix the D-cache and I-cache parity error handlers
      to run with interrupts disabled, and when the trap occurs at trap
      level > 1 log the event via a counter displayed in /proc/cpuinfo.
      Signed-off-by: David S. Miller <davem@davemloft.net>
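      The practical payoff is that flush loops stop hard-coding "32K with
      32-byte lines" and instead walk whatever geometry was probed at boot.
      A schematic flush loop, with flush_one_line() standing in for the
      per-line flush instruction:

        static void flush_one_line(unsigned long addr) { (void)addr; /* asm elided */ }

        /* Walk every line of the D-cache using the probed geometry: 64K on
         * UltraSPARC-IIIi and later, 32K on earlier chips, rather than assuming 32K. */
        static void local_flush_dcache(unsigned long dcache_bytes, unsigned long line_bytes)
        {
                unsigned long addr;

                for (addr = 0; addr < dcache_bytes; addr += line_bytes)
                        flush_one_line(addr);
        }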
  9. 30 August 2005, 1 commit
  10. 25 July 2005, 1 commit
  11. 13 July 2005, 1 commit
  12. 11 July 2005, 1 commit
  13. 24 May 2005, 1 commit
    • [SPARC64]: Add boot option to force UltraSPARC-III P-Cache on. · 816242da
      David S. Miller committed
      Older UltraSPARC-III chips have a P-Cache bug that makes us disable it
      by default at boot time.
      
      However, this does hurt performance substantially, particularly with
      memcpy(), and the bug is _incredibly_ obscure.  I have never seen it
      triggered in practice, ever.
      
      So provide a "-P" boot option that forces the P-Cache on.  It taints
      the kernel, so if it does trigger and cause some data corruption or
      OOPS, we will find out in the logs that this option was on when it
      happened.
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 17 April 2005, 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds committed
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!