1. 17 Sep, 2011 1 commit
  2. 03 Aug, 2011 1 commit
  3. 28 Jul, 2011 1 commit
    • D
      sparc: Detect and handle UltraSPARC-T3 cpu types. · 4ba991d3
      David S. Miller committed
      The cpu compatible string we look for is "SPARC-T3".
      
      As far as memset/memcpy optimizations go, we treat this chip the same
      as Niagara-T2/T2+.  Use cache initializing stores for memset, and use
      prefetch, FPU block loads, cache initializing stores, and block stores
      for copies.
      
      We use the Niagara-T2 perf support, since T3 is a close relative in
      this regard.  Later we'll add support for the new events T3 can
      report, plus enable T3's new "sample" mode.
      
      For now I haven't added any new ELF hwcap flags.  We probably need
      to add a couple, for example:
      
      T2 and T3 both support the population count instruction in hardware.
      
      T3 supports VIS3 instructions, including support (finally) for
      partitioned shift.  One can also now move directly between float
      and integer registers.
      
      T3 supports instructions meant to help with Galois Field and other HPC
      calculations, such as XOR multiply.  Also there are "OP and negate"
      instructions, for example "fnmul" which is multiply-and-negate.
      
      T3 recognizes the transactional memory opcodes.  However, since
      transactional memory isn't supported: 1) 'commit' behaves as a NOP,
      2) 'chkpt' always branches, 3) 'rdcps' returns all zeros, and
      4) 'wrcps' behaves as a NOP.
      
      So we'll need about 3 new elf capability flags in the end to represent
      all of these things.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4ba991d3
  4. 31 Mar, 2011 1 commit
  5. 16 Jun, 2009 1 commit
  6. 28 Apr, 2009 1 commit
  7. 30 Mar, 2009 1 commit
  8. 09 Feb, 2009 1 commit
    • D
      sparc64: Kill .fixup section bloat. · 40bdac7d
      David S. Miller committed
      This is an implementation of a suggestion made by Chris Torek:
      --------------------
      Something else I noticed in passing: the EX and EX_LD/EX_ST macros
      scattered throughout the various .S files make a fair bit of .fixup
      code, all of which does the same thing.  At the cost of one symbol
      in copy_in_user.S, you could just have one common two-instruction
      retl-and-mov-1 fixup that they all share.
      --------------------
      
      The following is with a defconfig build:
      
         text	   data	    bss	    dec	    hex	filename
      3972767	 344024	 584449	4901240	 4ac978	vmlinux.orig
      3968887	 344024	 584449	4897360	 4aba50	vmlinux
      Signed-off-by: David S. Miller <davem@davemloft.net>
      40bdac7d
  9. 05 Dec, 2008 2 commits
  10. 01 Sep, 2008 1 commit
  11. 28 Apr, 2008 1 commit
  12. 22 Mar, 2008 1 commit
    • D
      [SPARC64]: Remove most limitations to kernel image size. · 64658743
      David S. Miller committed
      Currently kernel images are limited to 8MB in size, and this causes
      problems especially when enabling features that take up a lot of
      kernel image space such as lockdep.
      
      The code now will align the kernel image size up to 4MB and map that
      many locked TLB entries.  So, the only practical limitation is the
      number of available locked TLB entries which is 16 on Cheetah and 64
      on pre-Cheetah sparc64 cpus.  Niagara cpus don't actually have hw
      locked TLB entry support.  Rather, the hypervisor transparently
      provides support for "locked" TLB entries since it runs with physical
      addressing and does the initial TLB miss processing.
      
      Fully utilizing this change requires some help from SILO, a patch for
      which will be submitted to the maintainer.  Essentially, SILO will
      only currently map up to 8MB for the kernel image and that needs to be
      increased.
      
      Note that neither this patch nor the SILO bits will help with network
      booting.  The openfirmware code will only map up to a certain amount
      of kernel image during a network boot and there isn't much we can do
      about that other than to implement a layered network booting
      facility.  Solaris has this, and calls it "wanboot" and we may
      implement something similar at some point.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      64658743
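The sizing rule described in this commit (round the image up to 4MB, then use one locked TLB entry per 4MB) can be sketched as follows. This is a minimal illustration, assuming each locked entry maps exactly 4MB; the function and macro names are hypothetical and not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one locked TLB entry maps a single 4MB page. */
#define LOCKED_MAP_BYTES (UINT64_C(4) * 1024 * 1024)

/* Round the kernel image size up to the next 4MB boundary. */
static inline uint64_t align_image_size(uint64_t image_bytes)
{
    return (image_bytes + LOCKED_MAP_BYTES - 1) & ~(LOCKED_MAP_BYTES - 1);
}

/* Number of 4MB locked TLB entries needed to cover the whole image. */
static inline unsigned int locked_entries_needed(uint64_t image_bytes)
{
    assert(image_bytes > 0);
    return (unsigned int)(align_image_size(image_bytes) / LOCKED_MAP_BYTES);
}
```

Under these assumptions an image of exactly 8MB needs two entries, and one byte more needs three, which is why the practical ceiling becomes the number of locked entries the cpu (or hypervisor) provides.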
  13. 07 Feb, 2008 1 commit
  14. 17 Sep, 2007 1 commit
    • D
      [SPARC64]: Fix lockdep, particularly on SMP. · 301feb65
      David S. Miller committed
      As noted by Al Viro, when we try to call prom_set_trap_table()
      in the SMP trampoline code we try to take the PROM call spinlock
      which doesn't work because the current thread pointer isn't
      valid yet and lockdep depends upon that being correct.
      
      Furthermore, we cannot set the current thread pointer register
      because it can't be properly dereferenced until we return from
      prom_set_trap_table().  Kernel TLB misses only work after that
      call.
      
      So do the PROM call to set the trap table directly instead of
      going through the OBP library C code, and thus avoid the lock
      altogether.
      
      These calls are guaranteed to be serialized fully.
      
      Since there are now no calls to the prom_set_trap_table{_sun4v}()
      library functions, they can be deleted.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      301feb65
  15. 16 Aug, 2007 2 commits
  16. 09 Aug, 2007 1 commit
  17. 25 Jul, 2007 1 commit
    • D
      [SPARC64]: Mark most of initial bootup asm as .text.init.ref_ok · 1966287d
      David S. Miller committed
      We can't mark the whole thing init because there are dependencies
      in bootloaders that assume that _start, or whatever the image
      entry value, is 2 instructions before the "HdrS" signature.
      
      In fact, TILO assumes this entry is always at 0x4000, yikes!
      
      Also, right after the bootloader info area there are OBP strings and
      values that get used later in the boot process, and those are not all
      provably .init yet.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1966287d
  18. 20 Jul, 2007 1 commit
  19. 29 May, 2007 2 commits
    • D
      [SPARC64]: Fix two bugs wrt. kernel 4MB TSB. · 2d9e2763
      David S. Miller committed
      1) The TSB lookup was not using the correct hash mask.
      
      2) It was not aligned on a boundary equal to its size,
         which is required by the sun4v Hypervisor.
      
      wasn't having its return value checked, and that bug will be fixed up
      as well in a subsequent changeset.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d9e2763
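Both fixes in this commit concern simple bit arithmetic, which the following sketch illustrates. It assumes a direct-mapped TSB with a power-of-two number of entries; the function names are hypothetical and the code is not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index into a TSB: shift off the page offset, then apply the hash mask.
 * The mask must match the TSB's entry count, or lookups hit wrong slots. */
static inline uint64_t tsb_index(uint64_t vaddr, unsigned int page_shift,
                                 uint64_t nentries)
{
    assert(nentries != 0 && (nentries & (nentries - 1)) == 0);
    return (vaddr >> page_shift) & (nentries - 1);
}

/* A TSB base must sit on a boundary equal to the TSB's own size. */
static inline bool tsb_base_is_aligned(uint64_t base, uint64_t tsb_bytes)
{
    assert(tsb_bytes != 0 && (tsb_bytes & (tsb_bytes - 1)) == 0);
    return (base & (tsb_bytes - 1)) == 0;
}
```

For a 4MB page size the shift would be 22, so with four entries the address `5 << 22` maps to slot 1. A base of 0x4000 fails the alignment check for a 0x8000-byte TSB, while 0x8000 passes.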
    • D
      [SPARC64]: Eliminate NR_CPUS limitations. · 22adb358
      David S. Miller committed
      Cheetah systems can have cpuids as large as 1023, although physical
      systems don't have that many cpus.
      
      Only three limitations existed in the kernel preventing arbitrary
      NR_CPUS values:
      
      1) dcache dirty cpu state stored in page->flags on
         D-cache aliasing platforms.  With some build time
         calculations and some build-time BUG checks on
         page->flags layout, this one was easily solved.
      
      2) The cheetah XCALL delivery code could only handle
         a cpumask with up to 32 cpus set.  Some simple looping
         logic clears that up too.
      
      3) thread_info->cpu was a u8, easily changed to a u16.
      
      There are a few spots in the kernel that still put NR_CPUS
      sized arrays on the kernel stack, but that's not a sparc64
      specific problem.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22adb358
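The "simple looping logic" for point 2 can be pictured as splitting the target cpu list into batches of at most 32. The sketch below is only an illustration of that idea under assumed names (`deliver_in_batches`, `batch_fn`, `XCALL_BATCH`); it is not the kernel's cheetah cross-call code. It also uses a 16-bit cpu id, in the spirit of point 3, since ids up to 1023 do not fit in 8 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XCALL_BATCH 32u  /* assumed per-dispatch limit, from the commit text */

/* Hypothetical callback that delivers a cross-call to up to 32 cpus. */
typedef void (*batch_fn)(const uint16_t *cpus, unsigned int n, void *ctx);

/* Walk the cpu list in chunks of at most XCALL_BATCH entries.
 * Returns the number of batches dispatched. */
static inline unsigned int deliver_in_batches(const uint16_t *cpus,
                                              unsigned int ncpus,
                                              batch_fn fn, void *ctx)
{
    unsigned int batches = 0;

    assert(cpus != NULL || ncpus == 0);
    for (unsigned int off = 0; off < ncpus; off += XCALL_BATCH) {
        unsigned int n = ncpus - off;

        if (n > XCALL_BATCH)
            n = XCALL_BATCH;
        if (fn != NULL)
            fn(cpus + off, n, ctx);
        batches++;
    }
    return batches;
}
```

With 70 target cpus this dispatches three batches (32, 32, and 6), so the caller is no longer bounded by the 32-cpu limit of a single dispatch.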
  20. 18 Dec, 2006 1 commit
  21. 10 Dec, 2006 1 commit
  22. 15 Jul, 2006 1 commit
  23. 01 Jul, 2006 1 commit
  24. 31 May, 2006 1 commit
  25. 20 Mar, 2006 13 commits
    • D
      [SPARC64]: Put syscall tables after trap table. · 074d82cf
      David S. Miller committed
      Otherwise with too much stuff enabled in the kernel config
      we can end up with an unaligned trap table.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      074d82cf
    • D
      8ca2557c
    • D
      [SPARC64]: Fix sun4v early bootup. · 6cebb520
      David S. Miller committed
      prom_sun4v_name should be "sun4v" not "SUNW,sun4v"
      
      Also, this is too early to make use of the
      .sun4v_Xinsn_patch code patching, so just check
      things manually.
      
      This gets us at least to prom_init() on Niagara.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6cebb520
    • D
      [SPARC64]: Use ASI_SCRATCHPAD address 0x0 properly. · 12eaa328
      David S. Miller committed
      This is where the virtual address of the fault status
      area belongs.
      
      To set it up we don't make a hypervisor call, instead
      we call OBP's SUNW,set-trap-table with the real address
      of the fault status area as the second argument.  And
      right before that call we write the virtual address into
      ASI_SCRATCHPAD vaddr 0x0.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      12eaa328
    • D
      [SPARC64]: Detect sun4v early in boot process. · d82ace7d
      David S. Miller committed
      We look for "SUNW,sun4v" in the 'compatible' property
      of the root OBP device tree node.
      
      Protect every %ver register access, to make sure it is
      not touched on sun4v, as %ver is hyperprivileged there.
      
      Lock kernel TLB entries using hypervisor calls instead of
      calls into OBP.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d82ace7d
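The 'compatible' check described in this commit amounts to scanning a property value for one string. A minimal sketch follows, assuming the property is a packed list of NUL-terminated strings, as is conventional for Open Firmware 'compatible' properties. The helper name `compatible_has` is hypothetical, and the real early-boot check is done in assembly before any C environment exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if 'want' appears as a complete entry in a property value
 * made of back-to-back NUL-terminated strings ('len' bytes in total). */
static inline bool compatible_has(const char *prop, size_t len,
                                  const char *want)
{
    const char *p = prop;
    const char *end = prop + len;
    size_t wlen;

    assert(prop != NULL && want != NULL);
    wlen = strlen(want);

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        size_t n = nul ? (size_t)(nul - p) : (size_t)(end - p);

        if (n == wlen && memcmp(p, want, n) == 0)
            return true;
        if (nul == NULL)
            break;          /* unterminated trailing entry: stop */
        p = nul + 1;        /* step past the terminator */
    }
    return false;
}
```

Matching whole entries rather than substrings matters here: a property holding "SUNW,sun4v" should match that exact string, but not the bare "sun4v".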
    • D
      [SPARC64]: Sun4v interrupt handling. · 5b0c0572
      David S. Miller committed
      Sun4v has 4 interrupt queues: cpu, device, resumable errors,
      and non-resumable errors.  A set of head/tail offset pointers
      help maintain a work queue in physical memory.  The entries
      are 64-bytes in size.
      
      Each queue is allocated then registered with the hypervisor
      as we bring cpus up.
      
      The two error queues each get a kernel side buffer that we
      use to quickly empty the main interrupt queue before we
      call up to C code to log the event and possibly take evasive
      action.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b0c0572
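The head/tail bookkeeping described in this commit behaves like a ring buffer of 64-byte entries. The sketch below shows the consumer side only. It assumes the offsets are byte offsets into a queue whose size is a power of two, and the struct and function names are hypothetical rather than the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_ENTRY_BYTES 64u   /* entry size stated in the commit text */

/* Hypothetical view of one queue: byte offsets into a ring in memory. */
struct irq_queue {
    uint64_t head;        /* next entry to consume */
    uint64_t tail;        /* next entry the producer will fill */
    uint64_t size_bytes;  /* assumed power of two, multiple of 64 */
};

/* Consume every pending entry, advancing head until it meets tail.
 * Returns the number of entries drained. */
static inline unsigned int drain_queue(struct irq_queue *q)
{
    unsigned int n = 0;

    assert(q->size_bytes >= QUEUE_ENTRY_BYTES);
    assert((q->size_bytes & (q->size_bytes - 1)) == 0);

    while (q->head != q->tail) {
        /* A real handler would copy out the 64-byte entry at q->head. */
        q->head = (q->head + QUEUE_ENTRY_BYTES) & (q->size_bytes - 1);
        n++;
    }
    return n;
}
```

Copying entries out quickly and deferring the logging matches the commit's description of the error queues, where a kernel-side buffer lets the main queue be emptied before C code runs.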
    • D
      [SPARC64]: Patch up mmu context register writes for sun4v. · 8b11bd12
      David S. Miller committed
      sun4v uses ASI_MMU instead of ASI_DMMU
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8b11bd12
    • D
      [SPARC64]: Niagara copy/clear page. · 8591e302
      David S. Miller committed
      Happily we have no D-cache aliasing issues on these
      chips, so the implementation is very straightforward.
      
      Add a stub in bootup which will be where the patching
      calls will be made for niagara/sun4v/hypervisor.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8591e302
    • D
      [SPARC64]: Initial sun4v TLB miss handling infrastructure. · d257d5da
      David S. Miller committed
      Things are a little tricky because, unlike sun4u, we have
      to:
      
      1) do a hypervisor trap to do the TLB load.
      2) do the TSB lookup calculations by hand
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d257d5da
    • D
      2f7ee7c6
    • D
      [SPARC64]: Kill sole argument passed to setup_tba(). · a8b900d8
      David S. Miller committed
      No longer used, and move extern declaration to a header file.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a8b900d8
    • D
      [SPARC64]: Eliminate all usage of hard-coded trap globals. · 56fb4df6
      David S. Miller committed
      UltraSPARC has special sets of global registers which are switched to
      for certain trap types.  There is one set for MMU related traps, one
      set for Interrupt Vector processing, and another set (called the
      Alternate globals) for all other trap types.
      
      For what seems like forever we've hard coded the values in some of
      these trap registers.  Some examples include:
      
      1) Interrupt Vector global %g6 holds current processor's interrupt
         work struct where received interrupts are managed for IRQ handler
         dispatch.
      
      2) MMU global %g7 holds the base of the page tables of the currently
         active address space.
      
      3) Alternate global %g6 held the current_thread_info() value.
      
      Such hardcoding has resulted in some serious issues in many areas.
      There are some code sequences where having another register available
      would help clean up the implementation.  Taking traps such as
      cross-calls from the OBP firmware requires some trick code sequences
      wherein we have to save away and restore all of the special sets of
      global registers when we enter/exit OBP.
      
      We were also using the IMMU TSB register on SMP to hold the per-cpu
      area base address, which doesn't work any longer now that we actually
      use the TSB facility of the cpu.
      
      The implementation is pretty straightforward.  One tricky bit is
      getting the current processor ID as that is different on different cpu
      variants.  We use a stub with a fancy calling convention which we
      patch at boot time.  The calling convention is that the stub is
      branched to and the (PC - 4) to return to is in register %g1.  The cpu
      number is left in %g6.  This stub can be invoked by using the
      __GET_CPUID macro.
      
      We use an array of per-cpu trap state to store the current thread and
      physical address of the current address space's page tables.  The
      TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this
      table, it uses __GET_CPUID and also clobbers %g1.
      
      TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load
      the current processor's IRQ software state into %g6.  It also uses
      __GET_CPUID and clobbers %g1.
      
      Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the
      current address space's page tables into %g7, it clobbers %g1 and uses
      __GET_CPUID.
      
      Many refinements are possible, as well as some tuning, with this stuff
      in place.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      56fb4df6
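In C terms, the per-cpu trap state described in this commit is an array indexed by cpu number, with each macro loading one field. The sketch below is a rough model of that idea only. The actual macros are assembly that leave results in %g6 or %g7 and clobber %g1; the struct layout, the array name, and the `MAX_CPUS` bound here are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 4u  /* illustrative bound, not the kernel's NR_CPUS */

/* Hypothetical per-cpu trap state: current thread and page table base. */
struct trap_per_cpu {
    void    *thread;     /* what TRAP_LOAD_THREAD_REG would fetch */
    uint64_t pgd_paddr;  /* what TRAP_LOAD_PGD_PHYS would fetch */
};

static struct trap_per_cpu trap_block[MAX_CPUS];

/* Model of TRAP_LOAD_THREAD_REG: cpu id in, current thread out. */
static inline void *trap_load_thread(unsigned int cpuid)
{
    assert(cpuid < MAX_CPUS);
    return trap_block[cpuid].thread;
}

/* Model of TRAP_LOAD_PGD_PHYS: cpu id in, page table physical base out. */
static inline uint64_t trap_load_pgd_phys(unsigned int cpuid)
{
    assert(cpuid < MAX_CPUS);
    return trap_block[cpuid].pgd_paddr;
}
```

The cpu id argument stands in for the result of __GET_CPUID. Keeping this state in memory rather than in dedicated global registers is what frees those registers for other uses.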
    • D
      [SPARC64]: Move away from virtual page tables, part 1. · 74bf4312
      David S. Miller committed
      We now use the TSB hardware assist features of the UltraSPARC
      MMUs.
      
      SMP is currently knowingly broken; we need to find another place
      to store the per-cpu base pointers.  We hid them away in the TSB
      base register, and that obviously will not work any more :-)
      
      Another known broken case is non-8KB base page size.
      
      Also noticed that flush_tlb_all() is not referenced anywhere, only
      the internal __flush_tlb_all() (local cpu only) is used by the
      sparc64 port, so we can get rid of flush_tlb_all().
      
      The kernel gets its own 8KB TSB (swapper_tsb) and each address space
      gets its own private 8KB TSB.  Later we can add code to dynamically
      increase the size of per-process TSB as the RSS grows.  An 8KB TSB is
      good enough for up to about a 4MB RSS, after which the TSB starts to
      incur many capacity and conflict misses.
      
      We even accumulate OBP translations into the kernel TSB.
      
      Another area for refinement is large page size support.  We could use
      a secondary address space TSB to handle those.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      74bf4312
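The "8KB TSB is good enough for about a 4MB RSS" estimate in this commit follows from simple arithmetic, sketched below. It assumes 16-byte TSB entries (an 8-byte tag plus an 8-byte TTE) and a direct-mapped TSB in which each entry covers one page; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TSB_ENTRY_BYTES 16u  /* assumed: 8-byte tag + 8-byte TTE */

/* Upper bound on the memory a TSB can cover with no capacity misses:
 * one page per entry. */
static inline uint64_t tsb_reach_bytes(uint64_t tsb_bytes, uint64_t page_bytes)
{
    assert(tsb_bytes % TSB_ENTRY_BYTES == 0);
    return (tsb_bytes / TSB_ENTRY_BYTES) * page_bytes;
}
```

An 8KB TSB holds 512 entries under this assumption, and 512 pages of 8KB each come to 4MB. Beyond that working set, entries start evicting each other, which is the motivation for growing the per-process TSB with the RSS.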