1. 24 Nov, 2009 2 commits
    • S
      sh: Minor optimisations to FPU handling · d3ea9fa0
      Stuart Menefy committed
      A number of small optimisations to FPU handling, in particular:
      
       - move the task USEDFPU flag from the thread_info flags field (which
         is accessed asynchronously to the thread) to a new status field,
         which is only accessed by the thread itself. This allows locking to
         be removed in most cases, or can be reduced to a preempt_lock().
         This mimics the i386 behaviour.
      
       - move the modification of regs->sr and thread_info->status flags out
         of save_fpu() to __unlazy_fpu(). This gives the compiler a better
         chance to optimise things, as well as making save_fpu() symmetrical
         with restore_fpu() and init_fpu().
      
       - implement prepare_to_copy(), so that when creating a thread, we can
         unlazy the FPU prior to copying the thread data structures.
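
As a rough illustration of the first two points, the sketch below keeps the USEDFPU bit in a thread-private status word and lets an unlazy helper, rather than the save routine, update the status and SR bits. Every name with a `_sketch` suffix is hypothetical, and the SR.FD bit value is an assumption; this is not the kernel code.

```c
#include <stdbool.h>

#define TS_USEDFPU_SKETCH 0x0001u      /* hypothetical status bit */
#define SR_FD_SKETCH      0x00008000u  /* assumed FPU-disable bit in SR */

struct thread_info_sketch {
    unsigned long flags;   /* may be changed by other contexts, so it needs atomic ops */
    unsigned long status;  /* only touched by the owning thread: plain ops suffice */
    bool fpu_saved;        /* stand-in for the saved FPU register block */
};

struct pt_regs_sketch {
    unsigned long sr;      /* saved status register of the thread */
};

/* Only saves hardware state; no flag handling, mirroring restore and init. */
void save_fpu_sketch(struct thread_info_sketch *ti)
{
    ti->fpu_saved = true;
}

/* The status and SR updates live here, next to the decision to save. */
void unlazy_fpu_sketch(struct thread_info_sketch *ti,
                       struct pt_regs_sketch *regs)
{
    if (ti->status & TS_USEDFPU_SKETCH) {
        save_fpu_sketch(ti);
        ti->status &= ~(unsigned long)TS_USEDFPU_SKETCH;
        regs->sr |= SR_FD_SKETCH;   /* the next FPU use traps again */
    }
}
```

Because `status` is never written from another context in this model, the helper needs no atomic operations on it.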
      
      Also make sure that the FPU is disabled while in the kernel, in
      particular while booting, and for newly created kernel threads,
      
      In a very artificial benchmark, the execution time for 2500000
      context switches was reduced from 50 to 45 seconds.
      Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
    • G
      sh: add sleazy FPU optimization · a0458b07
      Giuseppe CAVALLARO committed
      sh port of the sLeAZY-fpu feature currently implemented for some architectures
      such as i386.
      
      Right now the SH kernel has a 100% lazy fpu behaviour.
      This is of course great for applications that have very sporadic or no FPU use.
      However, very frequent FPU users take an extra trap on every context
      switch.
      The patch below adds a simple heuristic to this code: after 5 consecutive
      context switches of FPU use, the lazy behavior is disabled and the context
      gets restored every context switch.
      After 256 switches, this is reset and the 100% lazy behavior is returned.
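
The heuristic can be modelled in a few lines of C. This is a minimal sketch, not the patch itself: the names are hypothetical, the `> 5` comparison is one way to read "after 5 consecutive context switches", and the 8-bit counter that wraps to zero is an assumption about how the reset after 256 switches comes about.

```c
#include <stdbool.h>
#include <stdint.h>

#define SLEAZY_THRESHOLD_SKETCH 5

/* Hypothetical per-task state; not the kernel's data structure. */
struct task_sketch {
    uint8_t fpu_counter;  /* FPU restores in a row; wraps to 0 after 256 */
    bool used_fpu;        /* FPU state was loaded during the last time slice */
};

/* Shared by the eager path and the "FPU disabled" trap handler. */
void fpu_state_restore_sketch(struct task_sketch *t)
{
    t->used_fpu = true;
    t->fpu_counter++;     /* uint8_t arithmetic wraps modulo 256 */
}

/* Switch-in: returns true if the FPU context was restored eagerly. */
bool switch_in_sketch(struct task_sketch *t)
{
    if (t->fpu_counter > SLEAZY_THRESHOLD_SKETCH) {
        fpu_state_restore_sketch(t);
        return true;
    }
    return false;         /* stay lazy: the first FPU instruction traps */
}

/* Switch-out: a slice without FPU use breaks the streak. */
void switch_out_sketch(struct task_sketch *t)
{
    if (!t->used_fpu)
        t->fpu_counter = 0;
    t->used_fpu = false;
}

/* Count traps for a task that uses the FPU in every one of n slices. */
unsigned count_traps_sketch(unsigned n)
{
    struct task_sketch t = { 0, false };
    unsigned traps = 0;

    for (unsigned i = 0; i < n; i++) {
        if (!switch_in_sketch(&t)) {
            fpu_state_restore_sketch(&t);  /* trap handler path */
            traps++;
        }
        switch_out_sketch(&t);
    }
    return traps;
}
```

In this model, a task that touches the FPU in every time slice takes 6 traps per 256 switches, where the fully lazy scheme would take one trap per switch.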
      
      Tests with LMbench showed no regression.
      I saw a little improvement due to the prefetching (~2%).
      
      The tests below also show that the sLeAZY patch does reduce the number
      of FPU exceptions.
      To test this, I hacked the LMbench lat_ctx benchmark to use the FPU a little more.
      
         sLeAZY implementation
         ===========================================
         switch_to calls             |  79326
         sLeAZY calls                |  42577
         do_fpu_state_restore calls  |  59232
         restore_fpu calls           |  59032

         Exceptions: 0x800 (FPU disabled): 16604

         100% lazy (default implementation)
         ===========================================
         switch_to calls             |  79690
         do_fpu_state_restore calls  |  53299
         restore_fpu calls           |  53101

         Exceptions: 0x800 (FPU disabled): 53273
      Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  2. 12 Nov, 2009 1 commit
  3. 05 Nov, 2009 1 commit
  4. 27 Aug, 2009 1 commit
  5. 21 Aug, 2009 1 commit
  6. 15 Aug, 2009 2 commits
    • P
      sh: Kill off the unhandled pvr case in SH-4 CPU probing. · eccee745
      Paul Mundt committed
      This is superfluous, as the default CPU type and family are already
      established by the initial cpuinfo definition. Given that we are still
      able to probe for the CPU family even if we are not able to detect the
      subtype, it's preferable to let the probing code fill out what it can and
      leave the rest.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
    • P
      sh: Track the CPU family in sh_cpuinfo. · e82da214
      Paul Mundt committed
      This adds a family member to struct sh_cpuinfo, which allows us to fall
      back more on the probe routines to work out what sort of subtype we are
      running on. This will be used by the CPU cache initialization code in
      order to first do family-level initialization, followed by subtype-level
      optimizations.
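
The two-stage idea can be sketched as follows. This is only an illustration under stated assumptions: the enum values, flag bits and function name are hypothetical and do not reflect the real `struct sh_cpuinfo` layout or the actual cache initialization code.

```c
/* Hypothetical family and subtype identifiers. */
enum cpu_family_sketch {
    FAMILY_UNKNOWN_SKETCH,
    FAMILY_SH4_SKETCH,
    FAMILY_SH4A_SKETCH
};

enum cpu_type_sketch {
    TYPE_UNKNOWN_SKETCH,
    TYPE_KNOWN_PART_SKETCH
};

#define CACHE_FAMILY_INIT_SKETCH   0x1u  /* family-level setup was applied */
#define CACHE_SUBTYPE_TUNED_SKETCH 0x2u  /* subtype-level tuning was applied */

struct cpuinfo_sketch {
    enum cpu_family_sketch family;  /* filled in by the probe code */
    enum cpu_type_sketch type;      /* may stay unknown */
    unsigned int cache_flags;
};

void cache_init_sketch(struct cpuinfo_sketch *c)
{
    /* Step 1: family-level initialization, possible without a subtype. */
    switch (c->family) {
    case FAMILY_SH4_SKETCH:
    case FAMILY_SH4A_SKETCH:
        c->cache_flags |= CACHE_FAMILY_INIT_SKETCH;
        break;
    default:
        break;
    }

    /* Step 2: subtype-level optimizations, only when the part is known. */
    if (c->type != TYPE_UNKNOWN_SKETCH)
        c->cache_flags |= CACHE_SUBTYPE_TUNED_SKETCH;
}
```

A CPU whose subtype could not be detected still receives the family-level setup in this model, which is the fallback behaviour the commit message describes.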
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  7. 13 Aug, 2009 1 commit
  8. 23 Jul, 2009 1 commit
  9. 01 Jun, 2009 3 commits
  10. 13 May, 2009 2 commits
  11. 12 May, 2009 6 commits
  12. 11 May, 2009 3 commits
  13. 16 Apr, 2009 1 commit
  14. 02 Apr, 2009 1 commit
    • P
      sh: Kill off broken direct-mapped cache mode. · e8208828
      Paul Mundt committed
      Forcing direct-mapped worked on certain older 2-way set associative
      parts, but was always error prone on 4-way parts. As these are the
      norm these days, there is not much point in continuing to support this
      mode. Most of the folks that used direct-mapped mode generally just
      wanted writethrough caching in the first place.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  15. 17 Mar, 2009 1 commit
    • P
      sh: Support for extended ASIDs on PTEAEX-capable SH-X3 cores. · 8263a67e
      Paul Mundt committed
      This adds support for extended ASIDs (up to 16 bits) on newer SH-X3 cores
      that implement the PTEAEX register and respective functionality. Presently
      this covers only the 65nm SH7786 (the 90nm part only supports legacy 8-bit ASIDs).
      
      The main change is in how the PTE is written out when loading the entry
      into the TLB, as well as in how the TLB entry is selectively flushed.
      
      While SH-X2 extended mode splits out the memory-mapped U and I-TLB data
      arrays for extra bits, extended ASID mode splits out the address arrays.
      While we don't use the memory-mapped data array access, the address
      array accesses are necessary for selective TLB flushes, so these are
      implemented newly and replace the generic SH-4 implementation.
      
      With this, TLB flushes in switch_mm() are almost non-existent on newer
      parts.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  16. 10 Mar, 2009 1 commit
    • M
      sh: hibernation support · 2ef7f0da
      Magnus Damm committed
      Add Suspend-to-disk / swsusp / CONFIG_HIBERNATION support
      to the SuperH architecture.
      
      To suspend, use "swapon /dev/sda2; echo disk > /sys/power/state"
      To resume, pass "resume=/dev/sda2" on the kernel command line.
      
      The patch "pm: rework includes, remove arch ifdefs V2" is
      needed to allow the generic swsusp code to build properly.
      
      Hibernation is not enabled by this patch, though; a patch
      setting ARCH_HIBERNATION_POSSIBLE will be submitted later.
      Signed-off-by: Magnus Damm <damm@igel.co.jp>
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
  17. 03 Mar, 2009 1 commit
  18. 27 Feb, 2009 1 commit
  19. 29 Jan, 2009 1 commit
  20. 22 Dec, 2008 1 commit
  21. 08 Sep, 2008 2 commits
  22. 04 Aug, 2008 1 commit
  23. 29 Jul, 2008 1 commit
  24. 28 Jul, 2008 2 commits
  25. 23 May, 2008 1 commit
  26. 19 Apr, 2008 1 commit
    • P
      sh: Fix up L2 cache probe. · 440fc172
      Paul Mundt committed
      SH7723 is the first hard silicon to implement the L2, and unsurprisingly,
      does the precise inverse of what the specification alleges. XOR the
      URAM/L2 size bits to get back in line with the existing parsing logic.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>