1. 05 Oct, 2007 1 commit
  2. 09 Aug, 2007 1 commit
  3. 16 Jul, 2007 7 commits
    • D
      [SPARC64]: dr-cpu unconfigure support. · e0204409
      David S. Miller authored
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0204409
    • D
      [SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps() · 39dd992a
      David S. Miller authored
      When we hot-plug in new cpus, the core_id and proc_id of existing
      cpus can change.  So in order to set the cpu groups correctly we
      need to clear the maps out completely first.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      39dd992a
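The effect of clearing the maps before rebuilding them can be sketched in user-space C. This is a minimal illustration, not the kernel code: `NCPU`, `struct cpu_topo`, the 32-bit mask representation and `fill_in_sib_core_maps()` are all assumptions made for the example, and the real function's handling of special ID values is left out.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 8  /* hypothetical small system */

/* Hypothetical per-cpu topology record; field names follow the commit text. */
struct cpu_topo {
    int core_id;
    int proc_id;
};

struct cpu_topo cpu_data[NCPU];
uint32_t cpu_core_map[NCPU];     /* bit j set: cpu j has the same core_id */
uint32_t cpu_sibling_map[NCPU];  /* bit j set: cpu j has the same proc_id */

/* Rebuild both maps from scratch. Zeroing each entry first matters:
 * after a hot-plug event the IDs of existing cpus may have changed,
 * so bits left over from an earlier pass could be stale. */
void fill_in_sib_core_maps(void)
{
    assert(NCPU <= 32);  /* the uint32_t masks hold at most 32 cpus */

    for (int i = 0; i < NCPU; i++) {
        cpu_core_map[i] = 0;
        cpu_sibling_map[i] = 0;
        for (int j = 0; j < NCPU; j++) {
            if (cpu_data[i].core_id == cpu_data[j].core_id)
                cpu_core_map[i] |= 1u << j;
            if (cpu_data[i].proc_id == cpu_data[j].proc_id)
                cpu_sibling_map[i] |= 1u << j;
        }
    }
}
```

If the two zeroing assignments were dropped, a cpu whose `core_id` changed between two calls would keep its old bit set in its former neighbours' maps.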
    • D
      b37d40d1
    • D
      [SPARC64]: More sensible udelay implementation. · 8b99cfb8
      David S. Miller authored
      Take a page from the powerpc folks and just calculate the
      delay factor directly.
      
      Since frequency scaling chips use a system-tick register,
      the value is going to be the same system-wide.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b99cfb8
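One way to read "calculate the delay factor directly" is to convert the requested microseconds into tick counts from the tick frequency and spin until the counter has advanced that far. The sketch below assumes that reading. `usecs_to_ticks()`, `udelay_sketch()` and `fake_tick()` are hypothetical names, and `fake_tick()` merely stands in for reading the hardware tick register.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks to wait for a given number of microseconds:
 * ticks = usecs * tick_hz / 1,000,000. */
uint64_t usecs_to_ticks(uint64_t usecs, uint64_t tick_hz)
{
    /* Guard against a zero frequency and against overflow of the product. */
    assert(tick_hz > 0 && usecs <= UINT64_MAX / tick_hz);
    return usecs * tick_hz / 1000000ULL;
}

/* Stand-in for the tick register: advances 100 ticks per read. */
uint64_t fake_tick(void)
{
    static uint64_t t;
    t += 100;
    return t;
}

/* Busy-wait until the counter has advanced by the computed amount.
 * Unsigned subtraction keeps the comparison correct across wraparound. */
void udelay_sketch(uint64_t usecs, uint64_t tick_hz,
                   uint64_t (*read_tick)(void))
{
    uint64_t want = usecs_to_ticks(usecs, tick_hz);
    uint64_t start = read_tick();

    while (read_tick() - start < want)
        ;
}
```

Because the factor depends only on a frequency that is the same on every cpu, nothing per-cpu needs to be looked up in this version.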
    • D
      [SPARC64]: SMP build fixes. · 27a2ef38
      David S. Miller authored
      With the move of ldom_startcpu_cpuid() into smp.c some other
      things need to follow along:
      
      1) smp.c is not a driver so we can't use "PFX" macro in the
         printk calls.
      
      2) smp.c now needs asm/io.h and asm/hvtramp.h, ds.c no longer
         does
      
      3) kimage_addr_to_ra() also needs to move into smp.c
      
      While we're here, update copyright info and my email address
      in smp.c
      Signed-off-by: David S. Miller <davem@davemloft.net>
      27a2ef38
    • D
      [SPARC64]: Fix build regressions added by dr-cpu changes. · b14f5c10
      David S. Miller authored
      Do not select HOTPLUG_CPU from SUN_LDOMS, that causes
      HOTPLUG_CPU to be selected even on non-SMP which is
      illegal.
      
      Only build hvtramp.o when SMP, just like trampoline.o
      
      Protect dr-cpu code in ds.c with HOTPLUG_CPU.
      
      Likewise move ldom_startcpu_cpuid() to smp.c and protect
      it and the call site with SUN_LDOMS && HOTPLUG_CPU.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b14f5c10
    • D
      [SPARC64]: Initial LDOM cpu hotplug support. · 4f0234f4
      David S. Miller authored
      Only adding cpus is supported at the moment, removal
      will come next.
      
      When new cpus are configured, the machine description is
      updated.  When we get the configure request we pass in a
      cpu mask of to-be-added cpus to the mdesc CPU node parser
      so it only fetches information for those cpus.  That code
      also proceeds to update the SMT/multi-core scheduling bitmaps.
      
      cpu_up() does all the work and we return the status back
      over the DS channel.
      
      CPUs via dr-cpu need to be booted straight out of the
      hypervisor, and this requires:
      
      1) A new trampoline mechanism.  CPUs are booted straight
         out of the hypervisor with MMU disabled and running in
         physical addresses with no mappings installed in the TLB.
      
         The new hvtramp.S code sets up the critical cpu state,
         installs the locked TLB mappings for the kernel, and
         turns the MMU on.  It then proceeds to follow the logic
         of the existing trampoline.S SMP cpu bringup code.
      
      2) All calls into OBP have to be disallowed when domaining
         is enabled.  Since cpus boot straight into the kernel from
         the hypervisor, OBP has no state about that cpu and therefore
         cannot handle being invoked on that cpu.
      
         Luckily it's only a handful of interfaces which can be called
         after the OBP device tree is obtained.  For example, rebooting,
         halting, powering-off, and setting options node variables.
      
      CPU removal support will require some infrastructure changes
      here.  Namely we'll have to process the requests via a true
      kernel thread instead of in a workqueue.  workqueues run on
      a per-cpu thread, but when unconfiguring we might need to
      force the thread to execute on another cpu if the current cpu
      is the one being removed.  Removal of a cpu also causes the kernel
      to destroy that cpu's workqueue running thread.
      
      Another issue on removal is that we may have interrupts still
      pointing to the cpu-to-be-removed.  So new code will be needed
      to walk the active INO list and retarget those interrupts as needed.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f0234f4
  4. 10 Jul, 2007 1 commit
    • I
      sched: zap the migration init / cache-hot balancing code · 0437e109
      Ingo Molnar authored
      the SMP load-balancer uses the boot-time migration-cost estimation
      code to attempt to improve the quality of balancing. The reason for
      this code is that the discrete priority queues do not preserve
      the order of scheduling accurately, so the load-balancer skips
      tasks that were running on a CPU 'recently'.
      
      this code is fundamentally fragile: the boot-time migration cost detector
      doesn't really work on systems that have large L3 caches, it caused boot
      delays on large systems and the whole cache-hot concept made the
      balancing code pretty nondeterministic as well.
      
      (and hey, i wrote most of it, so i can say it out loud that it sucks ;-)
      
      under CFS the same purpose of cache affinity can be achieved without
      any special cache-hot special-case: tasks are sorted in the 'timeline'
      tree and the SMP balancer picks tasks from the left side of the
      tree, thus the most cache-cold task is balanced automatically.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0437e109
  5. 05 Jun, 2007 2 commits
  6. 29 May, 2007 2 commits
    • D
      [SPARC64]: Eliminate NR_CPUS limitations. · 22adb358
      David S. Miller authored
      Cheetah systems can have cpuids as large as 1023, although physical
      systems don't have that many cpus.
      
      Only three limitations existed in the kernel preventing arbitrary
      NR_CPUS values:
      
      1) dcache dirty cpu state stored in page->flags on
         D-cache aliasing platforms.  With some build time
         calculations and some build-time BUG checks on
         page->flags layout, this one was easily solved.
      
      2) The cheetah XCALL delivery code could only handle
         a cpumask with up to 32 cpus set.  Some simple looping
         logic clears that up too.
      
      3) thread_info->cpu was a u8, easily changed to a u16.
      
      There are a few spots in the kernel that still put NR_CPUS
      sized arrays on the kernel stack, but that's not a sparc64
      specific problem.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22adb358
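The "simple looping logic" for limitation 2 can be pictured as walking the cpumask and handing the delivery routine at most 32 cpus at a time. The following is a user-space sketch under that assumption. `cpumask_sketch_t`, `xcall_deliver_batched()` and `count_batch()` are hypothetical, and the real dispatch code talks to hardware rather than to a callback.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 1024
#define BATCH   32   /* the per-dispatch limit mentioned in the commit text */

/* Hypothetical cpumask: one bit per cpu, stored in 64-bit words. */
typedef struct {
    uint64_t bits[NR_CPUS / 64];
} cpumask_sketch_t;

void cpu_set_bit(int cpu, cpumask_sketch_t *m)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    m->bits[cpu / 64] |= 1ULL << (cpu % 64);
}

int cpu_isset(int cpu, const cpumask_sketch_t *m)
{
    return (int)((m->bits[cpu / 64] >> (cpu % 64)) & 1);
}

/* Stand-in for the low-level delivery routine: just counts cpus. */
int total_sent;
void count_batch(const int *cpus, int n)
{
    (void)cpus;
    total_sent += n;
}

/* Deliver to every cpu in the mask, at most BATCH cpus per dispatch.
 * Returns the number of dispatches that were needed. */
int xcall_deliver_batched(const cpumask_sketch_t *mask,
                          void (*send_batch)(const int *cpus, int n))
{
    int batch[BATCH];
    int n = 0, batches = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!cpu_isset(cpu, mask))
            continue;
        batch[n++] = cpu;
        if (n == BATCH) {          /* batch full: flush it */
            send_batch(batch, n);
            batches++;
            n = 0;
        }
    }
    if (n > 0) {                   /* flush the final partial batch */
        send_batch(batch, n);
        batches++;
    }
    return batches;
}
```

With 70 cpus set in the mask, this loop issues three dispatches of 32, 32 and 6 cpus.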
    • D
  7. 14 May, 2007 1 commit
  8. 09 May, 2007 1 commit
  9. 03 May, 2007 1 commit
  10. 26 Apr, 2007 2 commits
    • D
      [SPARC64]: Add clocksource/clockevents support. · 112f4871
      David S. Miller authored
      I'd like to thank John Stultz and others for helping
      me along the way.
      
      A lot of cleanups fell out of this.  For example, the get_compare()
      tick_op was totally unused, so was deleted.  And the most often used
      tick_op members were grouped together for cache-friendliness.
      
      The sparc64 TSC is given to the kernel as a one-shot timer.
      
      tick_ops->init_timer() simply turns off the privileged bit in
      the tick register (when possible), and disables the interrupt
      by setting bit 63 in the compare register.  The ->disable_irq()
      op also sets this bit.
      
      tick_ops->add_compare() is changed to:
      
      1) Add the given delta to "tick" not to "compare"
      2) Return a boolean which, if true, means that the tick
         value read after writing the compare value was found
         to have incremented past the initial tick value.  This
         mirrors logic used in the HPET driver's ->next_event()
         method.
      
      Each tick_ops implementation also now provides a name string.
      And we feed this into the clocksource and clockevents layers.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      112f4871
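The changed `add_compare()` contract can be illustrated with simulated registers. This is a sketch only: `sim_tick`, `sim_compare`, `sim_write_latency` and the helpers are hypothetical stand-ins for the hardware tick and compare registers. It also assumes that "incremented past" means the counter has already moved beyond the newly programmed compare point, which is one plausible reading of the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated hardware state (hypothetical). */
uint64_t sim_tick;
uint64_t sim_compare;
uint64_t sim_write_latency = 5;  /* ticks that elapse while writing compare */

uint64_t read_tick(void)
{
    return sim_tick;
}

void write_compare(uint64_t value)
{
    sim_compare = value;
    sim_tick += sim_write_latency;  /* time keeps moving during the write */
}

/* Program a one-shot event `delta` ticks from now, relative to the
 * current tick rather than to the old compare value.  Returns true if
 * the counter had already passed the target by the time the write
 * completed, meaning the event may have been missed. */
bool add_compare(uint64_t delta)
{
    /* The signed comparison below needs delta to fit in int64_t. */
    assert(delta <= (uint64_t)INT64_MAX);

    uint64_t orig_tick = read_tick();
    uint64_t target = orig_tick + delta;

    write_compare(target);

    uint64_t new_tick = read_tick();

    /* Signed difference so the test still works across wraparound. */
    return (int64_t)(new_tick - target) > 0;
}
```

A caller that gets `true` back would typically retry with a larger delta, since the event it just programmed may already lie in the past.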
    • D
      [SPARC64]: Unify timer interrupt handler. · 777a4475
      David S. Miller authored
      Things were scattered all over the place, split between
      SMP and non-SMP.
      
      Unify it all so that dyntick support is easier to add.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      777a4475
  11. 12 Jan, 2007 1 commit
    • G
      [PATCH] Change cpu_up and co from __devinit to __cpuinit · b282b6f8
      Gautham R Shenoy authored
      Compiling the kernel with CONFIG_HOTPLUG = y and CONFIG_HOTPLUG_CPU = n
      with CONFIG_RELOCATABLE = y generates the following modpost warnings
      
      WARNING: vmlinux - Section mismatch: reference to .init.data: from
      .text between '_cpu_up' (at offset 0xc0141b7d) and 'cpu_up'
      WARNING: vmlinux - Section mismatch: reference to .init.data: from
      .text between '_cpu_up' (at offset 0xc0141b9c) and 'cpu_up'
      WARNING: vmlinux - Section mismatch: reference to .init.text:__cpu_up
      from .text between '_cpu_up' (at offset 0xc0141bd8) and 'cpu_up'
      WARNING: vmlinux - Section mismatch: reference to .init.data: from
      .text between '_cpu_up' (at offset 0xc0141c05) and 'cpu_up'
      WARNING: vmlinux - Section mismatch: reference to .init.data: from
      .text between '_cpu_up' (at offset 0xc0141c26) and 'cpu_up'
      WARNING: vmlinux - Section mismatch: reference to .init.data: from
      .text between '_cpu_up' (at offset 0xc0141c37) and 'cpu_up'
      
      This is because cpu_up, _cpu_up and __cpu_up (in some architectures) are
      defined as __devinit
      AND
      __cpu_up calls some __cpuinit functions.
      
      Since __cpuinit would map to __init with this kind of a configuration,
      we get a .text referring to .init.data warning.
      
      This patch solves the problem by converting all of __cpu_up, _cpu_up
      and cpu_up from __devinit to __cpuinit. The approach is justified since
      the callers of cpu_up are either dependent on CONFIG_HOTPLUG_CPU or
      are of __init type.
      
      Thus when CONFIG_HOTPLUG_CPU=y, all these cpu up functions would land up
      in .text section, and when CONFIG_HOTPLUG_CPU=n, all these functions would
      land up in .init section.
      
      Tested on an i386 SMP machine running linux-2.6.20-rc3-mm1.
      Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      b282b6f8
  12. 18 Dec, 2006 1 commit
  13. 09 Oct, 2006 1 commit
  14. 24 Jun, 2006 1 commit
  15. 11 Jun, 2006 1 commit
  16. 31 May, 2006 1 commit
  17. 11 Apr, 2006 1 commit
  18. 10 Apr, 2006 1 commit
  19. 01 Apr, 2006 1 commit
    • D
      [SPARC64]: Make tsb_sync() mm comparison more precise. · 6f25f398
      David S. Miller authored
      switch_mm() changes the mm state and does a tsb_context_switch()
      first, then we do the cpu register state switch which changes
      current_thread_info() and current().
      
      So it's safer to check the PGD physical address stored in the
      trap block (which will be updated by the tsb_context_switch() in
      switch_mm()) than current->active_mm.
      
      Technically we should never run here in between those two
      updates, because interrupts are disabled during the entire
      context switch operation.  But some day we might like to leave
      interrupts enabled during the context switch and this change
      allows that to happen without any surprises.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6f25f398
  20. 26 Mar, 2006 1 commit
  21. 23 Mar, 2006 1 commit
    • A
      [PATCH] more for_each_cpu() conversions · 394e3902
      Andrew Morton authored
      When we stop allocating percpu memory for not-possible CPUs we must not touch
      the percpu data for not-possible CPUs at all.  The correct way of doing this
      is to test cpu_possible() or to use for_each_cpu().
      
      This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
      few instances of this bug, if any.  But the patch converts lots of open-coded
      tests to use the preferred helper macros.
      
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Acked-by: Kyle McMartin <kyle@parisc-linux.org>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Cc: Andi Kleen <ak@muc.de>
      Cc: Christian Zankel <chris@zankel.net>
      Cc: Philippe Elie <phil.el@wanadoo.fr>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      394e3902
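The difference between an open-coded `NR_CPUS` loop and the helper macros can be sketched as follows. The macros here are simplified user-space stand-ins that borrow the kernel names. `cpu_possible_map` as a 32-bit word, `percpu_counter[]` and `sum_counters()` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 32

/* Hypothetical configuration: only cpus 0-3 are possible. */
uint32_t cpu_possible_map = 0x0000000f;

/* Simplified stand-ins for the kernel helpers named in the commit. */
#define cpu_possible(cpu) ((cpu_possible_map >> (cpu)) & 1u)
#define for_each_cpu(cpu)                         \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++)     \
        if (cpu_possible(cpu))

/* Per-cpu data is only allocated for possible cpus; every other
 * slot stays NULL and must never be dereferenced. */
long counter_storage[4];
long *percpu_counter[NR_CPUS];

long sum_counters(void)
{
    long sum = 0;
    int cpu;

    /* An open-coded "for (cpu = 0; cpu < NR_CPUS; cpu++)" loop would
     * dereference NULL for cpus 4..31 in this setup. */
    for_each_cpu(cpu) {
        assert(percpu_counter[cpu] != NULL);
        sum += *percpu_counter[cpu];
    }
    return sum;
}
```

Once storage exists only for possible cpus, the plain `NR_CPUS` loop turns from wasteful into incorrect, which is the case the sweep guards against.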
  22. 20 Mar, 2006 10 commits
    • D
      [SPARC64]: Add SMT scheduling support for Niagara. · 8935dced
      David S. Miller authored
      The mapping is a simple "(cpuid >> 2) == core" for now.
      Later we'll add more sophisticated code that will walk
      the sun4v machine description and figure this out from
      there.
      
      We should also add core mappings for jaguar and panther
      processors.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8935dced
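The "(cpuid >> 2) == core" rule groups four consecutive cpu IDs into one core. A small sketch of how a sibling mask falls out of that rule is shown below. `cpuid_to_core()` and `sibling_mask()` are hypothetical helper names, and the 32-bit mask limits the example to 32 strands.

```c
#include <assert.h>
#include <stdint.h>

/* Core number under the simple mapping from the commit text. */
int cpuid_to_core(int cpuid)
{
    assert(cpuid >= 0);
    return cpuid >> 2;
}

/* Bitmask of all strands (out of the first `ncpus`, at most 32) that
 * share a core with `cpuid`, including `cpuid` itself. */
uint32_t sibling_mask(int cpuid, int ncpus)
{
    uint32_t mask = 0;

    for (int i = 0; i < ncpus && i < 32; i++)
        if (cpuid_to_core(i) == cpuid_to_core(cpuid))
            mask |= 1u << i;
    return mask;
}
```

For example, cpu 5 lands on core 1 together with cpus 4, 6 and 7, so its mask is `0xF0`.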
    • D
      [SPARC64]: Fix new context version SMP handling. · ee29074d
      David S. Miller authored
      Don't piggy back the SMP receive signal code to do the
      context version change handling.
      
      Instead allocate another fixed PIL number for this
      asynchronous cross-call.  We can't use smp_call_function()
      because this thing is invoked with interrupts disabled
      and a few spinlocks held.
      
      Also, fix smp_call_function_mask() to count "cpus" correctly.
      There is no guarantee that the local cpu is in the mask
      yet that is exactly what this code was assuming.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ee29074d
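The counting problem can be shown in a few lines. Subtracting one from the mask weight is only right when the local cpu is actually in the mask. Clearing the local cpu's bit first and then counting is right either way. This is a sketch of that idea with hypothetical helper names, not the kernel function itself.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in a 32-bit mask. */
int weight32(uint32_t m)
{
    int n = 0;

    while (m) {
        m &= m - 1;  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Number of cpus other than `self` that must respond to a cross call.
 * Works whether or not `self` is present in `mask`. */
int count_remote_cpus(uint32_t mask, int self)
{
    assert(self >= 0 && self < 32);
    mask &= ~(1u << self);
    return weight32(mask);
}
```

With mask `0x6` and the caller on cpu 0, the "weight minus one" shortcut yields 1, while two cpus actually need to respond.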
    • D
      [SPARC64]: More SUN4V cpu mondo bug fixing. · 3cab0c3e
      David S. Miller authored
      This cpu mondo sending interface isn't all that easy to
      use correctly...
      
      We were clearing out the wrong bits from the "mask" after getting
      something other than EOK from the hypervisor.
      
      It turns out the hypervisor can just be resent the same cpu_list[]
      array, with the 0xffff "done" entries still in there, and it will do
      the right thing.
      
      So don't update or try to rebuild the cpu_list[] array to condense it.
      
      This requires the "forward_progress" check to be done slightly
      differently, but this new scheme is less bug prone than what we were
      doing before.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3cab0c3e
    • D
      [SPARC64]: Fix bugs in SUN4V cpu mondo dispatch. · b830ab66
      David S. Miller authored
      There were several bugs in the SUN4V cpu mondo dispatch code.
      
      In fact, if we ever got an EWOULDBLOCK or other error from
      the hypervisor call, we'd potentially send a cpu mondo multiple
      times to the same cpu and even worse we could loop until the
      timeout resending the same mondo over and over to such cpus.
      
      So let's bulletproof this thing as follows:
      
      1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
         in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h
      
      2) Don't build and update the cpulist using inline functions, this
         was causing the cpu mask to not get updated in the caller.
      
      3) Disable interrupts during the entire mondo send, otherwise our
         cpu list and/or mondo block could get overwritten if we take
         an interrupt and do a cpu mondo send on the current cpu.
      
      4) Check for all possible error return types from the cpu_mondo_send()
         hypervisor call.  In particular:
      
         HV_EOK) Our work is done, all cpus have received the mondo.
         HV_CPUERROR) One or more of the cpus in the cpu list we passed
                      to the hypervisor are in error state.  Use cpu_state()
                      calls over the entries in the cpu list to see which
                      ones.  Record them in "error_mask" and report this
                      after we are done sending the mondo to cpus which are
                      not in error state.
         HV_EWOULDBLOCK) We need to keep trying.
      
         Any other error we consider fatal, we report the event and exit
         immediately.
      
      5) We only timeout if forward progress is not made.  Forward progress
         is defined as having at least one cpu get the mondo successfully
         in a given cpu_mondo_send() call.  Otherwise we bump a counter
         and delay a little.  If the counter hits a limit, we signal an
         error and report the event.
      
      Also, smp_call_function_mask() error handling reports the number
      of cpus incorrectly.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b830ab66
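The retry policy in points 4 and 5 can be sketched against a mock hypervisor. Everything below is an assumption made for illustration: the status values are placeholders, `mock_cpu_mondo_send()` replaces the real hypervisor call, delivered cpus are marked with a `DONE` value in the list, and `RETRY_LIMIT` is arbitrary. The HV_CPUERROR branch with its cpu_state() probing is left out and treated as fatal to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder status codes; names follow the commit text. */
enum { HV_EOK, HV_EWOULDBLOCK, HV_CPUERROR };

#define DONE        0xffff  /* list entry already delivered */
#define RETRY_LIMIT 4       /* arbitrary limit for the sketch */

/* Mock hypervisor: delivers to at most hv_budget pending cpus per call. */
int hv_budget = 2;
int hv_calls;

int mock_cpu_mondo_send(uint16_t *cpu_list, int cnt)
{
    int delivered = 0, pending = 0;

    hv_calls++;
    for (int i = 0; i < cnt; i++) {
        if (cpu_list[i] == DONE)
            continue;
        if (delivered < hv_budget) {
            cpu_list[i] = DONE;
            delivered++;
        } else {
            pending++;
        }
    }
    return pending ? HV_EWOULDBLOCK : HV_EOK;
}

int count_done(const uint16_t *cpu_list, int cnt)
{
    int n = 0;

    for (int i = 0; i < cnt; i++)
        n += (cpu_list[i] == DONE);
    return n;
}

/* Returns 0 once every cpu has the mondo, -1 on a fatal status or when
 * RETRY_LIMIT consecutive calls make no forward progress. */
int send_mondo(uint16_t *cpu_list, int cnt)
{
    int retries = 0, prev_done = 0;

    assert(cnt > 0);
    for (;;) {
        int status = mock_cpu_mondo_send(cpu_list, cnt);

        if (status == HV_EOK)
            return 0;
        if (status != HV_EWOULDBLOCK)
            return -1;            /* anything else is fatal in this sketch */

        int done = count_done(cpu_list, cnt);
        if (done > prev_done) {   /* forward progress: reset the counter */
            prev_done = done;
            retries = 0;
        } else if (++retries > RETRY_LIMIT) {
            return -1;            /* stalled for too long */
        }
        /* Real code would delay briefly here before retrying. */
    }
}
```

The counter is reset whenever at least one more cpu received the mondo, so a slow but steadily progressing send never times out. Only a genuinely stalled one does.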
    • D
      [SPARC64]: Fix bugs in SMP TLB context version expiration handling. · aac0aadf
      David S. Miller authored
      1) We must flush the TLB, duh.
      
      2) Even if the sw context was seen to be valid, the local cpu's
         hw context can be out of date, so reload it unconditionally.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aac0aadf
    • D
      [SPARC64]: Report mondo error correctly in hypervisor_xcall_deliver(). · 6cc80cfa
      David S. Miller authored
      It's in "arg0" not "func".
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6cc80cfa
    • D
      [SPARC64]: Fix TLB context allocation with SMT style shared TLBs. · a0663a79
      David S. Miller authored
      The context allocation scheme we use depends upon there being a 1<-->1
      mapping from cpu to physical TLB for correctness.  Chips like Niagara
      break this assumption.
      
      So what we do is notify all cpus with a cross call when the context
      version number changes, and if necessary this makes them allocate
      a valid context for the address space they are running at the time.
      
      Stress tested with make -j1024, make -j2048, and make -j4096 kernel
      builds on a 32-strand, 8 core, T2000 with 16GB of ram.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a0663a79
    • D
      [SPARC64]: Kill cpudata->idle_volume. · 1bd0cd74
      David S. Miller authored
      Set, but never used.
      
      We used to use this for dynamic IRQ retargeting, but that
      code died a long time ago.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1bd0cd74
    • D
    • D
      [SPARC64]: Get SUN4V SMP working. · 72aff53f
      David S. Miller authored
      The sibling cpu bringup is extremely fragile.  We can only
      perform the most basic calls until we take over the trap
      table from the firmware/hypervisor on the new cpu.
      
      This means no accesses to %g4, %g5, %g6 since those can't be
      TLB translated without our trap handlers.
      
      In order to achieve this:
      
      1) Change sun4v_init_mondo_queues() so that it can operate in
         several modes.
      
         It can allocate the queues, or install them in the current
         processor, or both.
      
         The boot cpu does both in its call early on.
      
         Later, the boot cpu allocates the sibling cpu queue, starts
         the sibling cpu, then the sibling cpu loads them in.
      
      2) init_cur_cpu_trap() is changed to take the current_thread_info()
         as an argument instead of reading %g6 directly on the current
         cpu.
      
      3) Create a trampoline stack for the sibling cpus.  We do our basic
         kernel calls using this stack, which is locked into the kernel
         image, then go to our proper thread stack after taking over the
         trap table.
      
      4) While we are in this delicate startup state, we put 0xdeadbeef
         into %g4/%g5/%g6 in order to catch accidental accesses.
      
      5) On the final prom_set_trap_table*() call, we put &init_thread_union
         into %g6.  This is a hack to make prom_world(0) work.  All that
         wants to do is restore the %asi register using
         get_thread_current_ds().
      
      Longer term we should just do the OBP calls to set the trap table by
      hand just like we do for everything else.  This would avoid that silly
      prom_world(0) issue, then we can remove the init_thread_union hack.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      72aff53f