1. 16 7月, 2007 1 次提交
    • D
      [SPARC64]: Initial LDOM cpu hotplug support. · 4f0234f4
      David S. Miller 提交于
      Only adding cpus is supports at the moment, removal
      will come next.
      
      When new cpus are configured, the machine description is
      updated.  When we get the configure request we pass in a
      cpu mask of to-be-added cpus to the mdesc CPU node parser
      so it only fetches information for those cpus.  That code
      also proceeds to update the SMT/multi-core scheduling bitmaps.
      
      cpu_up() does all the work and we return the status back
      over the DS channel.
      
      CPUs via dr-cpu need to be booted straight out of the
      hypervisor, and this requires:
      
      1) A new trampoline mechanism.  CPUs are booted straight
         out of the hypervisor with MMU disabled and running in
         physical addresses with no mappings installed in the TLB.
      
         The new hvtramp.S code sets up the critical cpu state,
         installs the locked TLB mappings for the kernel, and
         turns the MMU on.  It then proceeds to follow the logic
         of the existing trampoline.S SMP cpu bringup code.
      
      2) All calls into OBP have to be disallowed when domaining
         is enabled.  Since cpus boot straight into the kernel from
         the hypervisor, OBP has no state about that cpu and therefore
         cannot handle being invoked on that cpu.
      
         Luckily it's only a handful of interfaces which can be called
         after the OBP device tree is obtained.  For example, rebooting,
         halting, powering-off, and setting options node variables.
      
      CPU removal support will require some infrastructure changes
      here.  Namely we'll have to process the requests via a true
      kernel thread instead of in a workqueue.  workqueues run on
      a per-cpu thread, but when unconfiguring we might need to
      force the thread to execute on another cpu if the current cpu
      is the one being removed.  Removal of a cpu also causes the kernel
      to destroy that cpu's workqueue running thread.
      
      Another issue on removal is that we may have interrupts still
      pointing to the cpu-to-be-removed.  So new code will be needed
      to walk the active INO list and retarget those cpus as-needed.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f0234f4
  2. 05 6月, 2007 1 次提交
  3. 29 5月, 2007 2 次提交
    • D
      [SPARC64]: Eliminate NR_CPUS limitations. · 22adb358
      David S. Miller 提交于
      Cheetah systems can have cpuids as large as 1023, although physical
      systems don't have that many cpus.
      
      Only three limitations existed in the kernel preventing arbitrary
      NR_CPUS values:
      
      1) dcache dirty cpu state stored in page->flags on
         D-cache aliasing platforms.  With some build time
         calculations and some build-time BUG checks on
         page->flags layout, this one was easily solved.
      
      2) The cheetah XCALL delivery code could only handle
         a cpumask with up to 32 cpus set.  Some simple looping
         logic clears that up too.
      
      3) thread_info->cpu was a u8, easily changed to a u16.
      
      There are a few spots in the kernel that still put NR_CPUS
      sized arrays on the kernel stack, but that's not a sparc64
      specific problem.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22adb358
    • D
  4. 26 4月, 2007 1 次提交
  5. 20 6月, 2006 1 次提交
    • D
      [SPARC64]: Send all device interrupts via one PIL. · fd0504c3
      David S. Miller 提交于
      This is the first in a series of cleanups that will hopefully
      allow a seamless attempt at using the generic IRQ handling
      infrastructure in the Linux kernel.
      
      Define PIL_DEVICE_IRQ and vector all device interrupts through
      there.
      
      Get rid of the ugly pil0_dummy_{bucket,desc}, instead vector
      the timer interrupt directly to a specific handler since the
      timer interrupt is the only event that will be signaled on
      PIL 14.
      
      The irq_worklist is now in the per-cpu trap_block[].
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fd0504c3
  6. 22 3月, 2006 1 次提交
  7. 20 3月, 2006 20 次提交
    • D
      [SPARC64]: Kill cpudata->idle_volume. · 1bd0cd74
      David S. Miller 提交于
      Set, but never used.
      
      We used to use this for dynamic IRQ retargetting, but that
      code died a long time ago.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1bd0cd74
    • D
      [SPARC64]: Fix uniprocessor IRQ targetting on SUN4V. · ebd8c56c
      David S. Miller 提交于
      We need to use the real hardware processor ID when
      targetting interrupts, not the "define to 0" thing
      the uniprocessor build gives us.
      
      Also, fill in the Node-ID and Agent-ID fields properly
      on sun4u/Safari.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ebd8c56c
    • D
      [SPARC64]: Get SUN4V SMP working. · 72aff53f
      David S. Miller 提交于
      The sibling cpu bringup is extremely fragile.  We can only
      perform the most basic calls until we take over the trap
      table from the firmware/hypervisor on the new cpu.
      
      This means no accesses to %g4, %g5, %g6 since those can't be
      TLB translated without our trap handlers.
      
      In order to achieve this:
      
      1) Change sun4v_init_mondo_queues() so that it can operate in
         several modes.
      
         It can allocate the queues, or install them in the current
         processor, or both.
      
         The boot cpu does both in it's call early on.
      
         Later, the boot cpu allocates the sibling cpu queue, starts
         the sibling cpu, then the sibling cpu loads them in.
      
      2) init_cur_cpu_trap() is changed to take the current_thread_info()
         as an argument instead of reading %g6 directly on the current
         cpu.
      
      3) Create a trampoline stack for the sibling cpus.  We do our basic
         kernel calls using this stack, which is locked into the kernel
         image, then go to our proper thread stack after taking over the
         trap table.
      
      4) While we are in this delicate startup state, we put 0xdeadbeef
         into %g4/%g5/%g6 in order to catch accidental accesses.
      
      5) On the final prom_set_trap_table*() call, we put &init_thread_union
         into %g6.  This is a hack to make prom_world(0) work.  All that
         wants to do is restore the %asi register using
         get_thread_current_ds().
      
      Longer term we should just do the OBP calls to set the trap table by
      hand just like we do for everything else.  This would avoid that silly
      prom_world(0) issue, then we can remove the init_thread_union hack.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      72aff53f
    • D
      [SPARC64]: Use ASI_SCRATCHPAD address 0x0 properly. · 12eaa328
      David S. Miller 提交于
      This is where the virtual address of the fault status
      area belongs.
      
      To set it up we don't make a hypervisor call, instead
      we call OBP's SUNW,set-trap-table with the real address
      of the fault status area as the second argument.  And
      right before that call we write the virtual address into
      ASI_SCRATCHPAD vaddr 0x0.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12eaa328
    • D
      [SPARC64]: Sun4v cross-call sending support. · 1d2f1f90
      David S. Miller 提交于
      Technically the hypervisor call supports sending in a list
      of all cpus to get the cross-call, but I only pass in one
      cpu at a time for now.
      
      The multi-cpu support is there, just ifdef'd out so it's easy to
      enable or delete it later.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1d2f1f90
    • D
      [SPARC64]: Sun4v interrupt handling. · 5b0c0572
      David S. Miller 提交于
      Sun4v has 4 interrupt queues: cpu, device, resumable errors,
      and non-resumable errors.  A set of head/tail offset pointers
      help maintain a work queue in physical memory.  The entries
      are 64-bytes in size.
      
      Each queue is allocated then registered with the hypervisor
      as we bring cpus up.
      
      The two error queues each get a kernel side buffer that we
      use to quickly empty the main interrupt queue before we
      call up to C code to log the event and possibly take evasive
      action.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5b0c0572
    • D
      [SPARC64]: Add sun4v mondo queue bases to struct trap_per_cpu. · 7202c55c
      David S. Miller 提交于
      Also, correct TRAP_PER_CPU_FAULT_INFO define, it should be
      0x40 not 0x20.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7202c55c
    • D
      [SPARC64]: asm/cpudata.h needs asm/asi.h · 89a5264f
      David S. Miller 提交于
      For the expansion of __GET_CPUID() on SMP.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89a5264f
    • D
    • D
      [SPARC64]: Initial sun4v TLB miss handling infrastructure. · d257d5da
      David S. Miller 提交于
      Things are a little tricky because, unlike sun4u, we have
      to:
      
      1) do a hypervisor trap to do the TLB load.
      2) do the TSB lookup calculations by hand
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d257d5da
    • D
      [SPARC64]: Sanitize %pstate writes for sun4v. · 45fec05f
      David S. Miller 提交于
      If we're just switching between different alternate global
      sets, nop it out on sun4v.  Also, get rid of all of the
      alternate global save/restore in the OBP CIF trampoline code.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      45fec05f
    • D
      [SPARC64]: Add initial code to twiddle %gl on trap entry/exit. · 936f482a
      David S. Miller 提交于
      Instead of setting/clearing PSTATE_AG we have to change
      the %gl register value on sun4v.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      936f482a
    • D
      d96b8153
    • D
      [SPARC64]: Add explicit register args to trap state loading macros. · ffe483d5
      David S. Miller 提交于
      This, as well as making the code cleaner, allows a simplification in
      the TSB miss handling path.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ffe483d5
    • D
      [SPARC64]: Refine code sequences to get the cpu id. · 92704a1c
      David S. Miller 提交于
      On uniprocessor, it's always zero for optimize that.
      
      On SMP, the jmpl to the stub kills the return address stack in the cpu
      branch prediction logic, so expand the code sequence inline and use a
      code patching section to fix things up.  This also always better and
      explicit register selection, which will be taken advantage of in a
      future changeset.
      
      The hard_smp_processor_id() function is big, so do not inline it.
      
      Fix up tests for Jalapeno to also test for Serrano chips too.  These
      tests want "jbus Ultra-IIIi" cases to match, so that is what we should
      test for.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      92704a1c
    • D
      [SPARC64]: Fix race in LOAD_PER_CPU_BASE() · 86b81868
      David S. Miller 提交于
      Since we use %g5 itself as a temporary, it can get clobbered
      if we take an interrupt mid-stream and thus cause end up with
      the final %g5 value too early as a result of rtrap processing.
      
      Set %g5 at the very end, atomically, to avoid this problem.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      86b81868
    • D
      [SPARC64]: Kill sole argument passed to setup_tba(). · a8b900d8
      David S. Miller 提交于
      No longer used, and move extern declaration to a header file.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8b900d8
    • D
      [SPARC64]: Elminate all usage of hard-coded trap globals. · 56fb4df6
      David S. Miller 提交于
      UltraSPARC has special sets of global registers which are switched to
      for certain trap types.  There is one set for MMU related traps, one
      set of Interrupt Vector processing, and another set (called the
      Alternate globals) for all other trap types.
      
      For what seems like forever we've hard coded the values in some of
      these trap registers.  Some examples include:
      
      1) Interrupt Vector global %g6 holds current processors interrupt
         work struct where received interrupts are managed for IRQ handler
         dispatch.
      
      2) MMU global %g7 holds the base of the page tables of the currently
         active address space.
      
      3) Alternate global %g6 held the current_thread_info() value.
      
      Such hardcoding has resulted in some serious issues in many areas.
      There are some code sequences where having another register available
      would help clean up the implementation.  Taking traps such as
      cross-calls from the OBP firmware requires some trick code sequences
      wherein we have to save away and restore all of the special sets of
      global registers when we enter/exit OBP.
      
      We were also using the IMMU TSB register on SMP to hold the per-cpu
      area base address, which doesn't work any longer now that we actually
      use the TSB facility of the cpu.
      
      The implementation is pretty straight forward.  One tricky bit is
      getting the current processor ID as that is different on different cpu
      variants.  We use a stub with a fancy calling convention which we
      patch at boot time.  The calling convention is that the stub is
      branched to and the (PC - 4) to return to is in register %g1.  The cpu
      number is left in %g6.  This stub can be invoked by using the
      __GET_CPUID macro.
      
      We use an array of per-cpu trap state to store the current thread and
      physical address of the current address space's page tables.  The
      TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this
      table, it uses __GET_CPUID and also clobbers %g1.
      
      TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load
      the current processor's IRQ software state into %g6.  It also uses
      __GET_CPUID and clobbers %g1.
      
      Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the
      current address space's page tables into %g7, it clobbers %g1 and uses
      __GET_CPUID.
      
      Many refinements are possible, as well as some tuning, with this stuff
      in place.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      56fb4df6
    • D
      [SPARC64]: Kill pgtable quicklists and use SLAB. · 3c936465
      David S. Miller 提交于
      Taking a nod from the powerpc port.
      
      With the per-cpu caching of both the page allocator and SLAB, the
      pgtable quicklist scheme becomes relatively silly and primitive.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3c936465
    • D
      [SPARC64]: No need to D-cache color page tables any longer. · 05e28f9d
      David S. Miller 提交于
      Unlike the virtual page tables, the new TSB scheme does not
      require this ugly hack.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05e28f9d
  8. 26 9月, 2005 1 次提交
    • D
      [SPARC64]: Probe D/I/E-cache config and use. · 80dc0d6b
      David S. Miller 提交于
      At boot time, determine the D-cache, I-cache and E-cache size and
      line-size.  Use them in cache flushes when appropriate.
      
      This change was motivated by discovering that the D-cache on
      UltraSparc-IIIi and later are 64K not 32K, and the flushes done by the
      Cheetah error handlers were assuming a 32K size.
      
      There are still some pieces of code that are hard coding things and
      will need to be fixed up at some point.
      
      While we're here, fix the D-cache and I-cache parity error handlers
      to run with interrupts disabled, and when the trap occurs at trap
      level > 1 log the event via a counter displayed in /proc/cpuinfo.
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      80dc0d6b
  9. 30 8月, 2005 1 次提交
  10. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4