1. 15 April 2015 (4 commits)
  2. 18 July 2014 (1 commit)
    • ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+ · 6ebbf2ce
      Authored by Russell King
      ARMv6 and later introduced a new instruction ("bx") which can be used
      to return from function calls.  Recent CPUs perform better when the
      "bx lr" instruction is used rather than the "mov pc, lr" instruction,
      and the ARM architecture manual strongly recommends using this
      sequence (section A.4.1.1).
      
      We provide a new macro, "ret", with variants for each condition
      code, which resolves to the appropriate instruction.
      
      Rather than doing this piecemeal and missing some instances, change
      all the "mov pc" instances to use the new macro, with the exception
      of the "movs" instruction and the kprobes code.  This allows us to
      detect the "mov pc, lr" case and fix it up, and also gives us the
      possibility of deploying this for other registers depending on the
      CPU selection.
      Reported-by: Will Deacon <will.deacon@arm.com>
      Tested-by: Stephen Warren <swarren@nvidia.com> # Tegra Jetson TK1
      Tested-by: Robert Jarzmik <robert.jarzmik@free.fr> # mioa701_bootresume.S
      Tested-by: Andrew Lunn <andrew@lunn.ch> # Kirkwood
      Tested-by: Shawn Guo <shawn.guo@freescale.com>
      Tested-by: Tony Lindgren <tony@atomide.com> # OMAPs
      Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com> # Armada XP, 375, 385
      Acked-by: Sekhar Nori <nsekhar@ti.com> # DaVinci
      Acked-by: Christoffer Dall <christoffer.dall@linaro.org> # kvm/hyp
      Acked-by: Haojian Zhuang <haojian.zhuang@gmail.com> # PXA3xx
      Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> # Xen
      Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # ARMv7M
      Tested-by: Simon Horman <horms+renesas@verge.net.au> # Shmobile
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  3. 26 May 2014 (1 commit)
  4. 29 December 2013 (1 commit)
    • ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequence · 70f665fe
      Authored by Lorenzo Pieralisi
      Set-associative caches on all v7 implementations map the index bits
      to the LSBs of the physical address and the tag bits to the MSBs.
      As the last level of cache on current and upcoming ARM systems grows
      in size, this means that, under normal DRAM controller
      configurations, the current v7 cache flush routine using set/way
      operations triggers a DRAM precharge/activate cycle for every cache
      line writeback, since the routine cleans lines by first fixing the
      index and then looping through the ways.  Because the index bits map
      to the lower physical address bits on all v7 cache implementations,
      with last-level cache sizes on the order of megabytes, lines
      belonging to the same set but different ways map to different DRAM
      pages.
      
      Given the random content of cache tags, swapping the order of the
      index and way loops does not prevent DRAM page precharge and
      activate cycles, but on average it improves the chances that either
      multiple lines hit the same page or multiple lines belong to
      different DRAM banks, improving throughput significantly.
      
      This patch swaps the inner loops in the v7 cache flushing routine
      to carry out the clean operations first on all sets belonging to
      a given way (looping through sets) and then decrementing the way.
      
      Benchmarks showed that swapping the order in which sets and ways are
      decremented in the v7 set/way cache flushing routine significantly
      reduces the time required to flush the caches, owing to improved
      writeback throughput to the DRAM controller.
      
      Benchmark results vary and depend heavily on the content of the
      last-level cache tag RAM when the cache is cleaned and invalidated,
      ranging from 2x throughput when all tag RAM entries contain dirty
      lines mapping to sequential pages of RAM, down to 1x (i.e. no
      improvement) when every tag RAM access triggers a DRAM
      precharge/activate cycle, as the current code implies on most DRAM
      controller configurations.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  5. 12 August 2013 (1 commit)
  6. 17 June 2013 (1 commit)
  7. 12 February 2013 (1 commit)
  8. 20 December 2012 (1 commit)
  9. 29 September 2012 (1 commit)
  10. 25 September 2012 (2 commits)
    • ARM: mm: rename jump labels in v7_flush_dcache_all function · 3287be8c
      Authored by Lorenzo Pieralisi
      This patch renames the jump labels in v7_flush_dcache_all in order
      to define a specific entry point for flushing the cache levels.
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Shawn Guo <shawn.guo@linaro.org>
    • ARM: mm: implement LoUIS API for cache maintenance ops · 031bd879
      Authored by Lorenzo Pieralisi
      The ARMv7 architecture introduced the concept of cache levels and
      the related control registers.  New processors like the Cortex-A7
      and Cortex-A15 embed an L2 unified cache controller that becomes
      part of the cache level hierarchy.  Some operations in the kernel,
      like cpu_suspend and __cpu_disable, do not require a flush of the
      entire cache hierarchy to DRAM, but only of the cache levels up to
      the Level of Unification Inner Shareable (LoUIS), which on most
      ARMv7 systems corresponds to L1.
      
      The current cache flushing API used in cpu_suspend and
      __cpu_disable, flush_cache_all(), ends up flushing the whole cache
      hierarchy, since on v7 it cleans and invalidates all cache levels up
      to the Level of Coherency (LoC); this cripples system performance
      when used in hot paths like hotplug and cpuidle.

      Therefore a new kernel cache maintenance API must be added to cope
      with the requirements of the latest ARM systems.
      
      This patch adds flush_cache_louis() to the ARM kernel cache maintenance API.
      
      This function cleans and invalidates all data cache levels up to the
      Level of Unification Inner Shareable (LoUIS) and invalidates the instruction
      cache for processors that support it (> v7).
      
      This patch also aliases the cache LoUIS function to flush_kern_all
      for all processor versions prior to v7, so that the current cache
      flushing behaviour is unchanged for those processors.
      
      v7 cache maintenance code implements a cache LoUIS function that cleans and
      invalidates the D-cache up to LoUIS and invalidates the I-cache, according
      to the new API.
      Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Tested-by: Shawn Guo <shawn.guo@linaro.org>
  11. 02 May 2012 (1 commit)
  12. 16 February 2012 (1 commit)
  13. 10 February 2012 (1 commit)
    • ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR · b46c0f74
      Authored by Stephen Boyd
      armv7's flush_cache_all() flushes the caches via set/way operations.
      To determine the cache attributes (line size, number of sets, etc.)
      the assembly first writes the CSSELR register to select a cache
      level and then reads the CCSIDR register.  The CSSELR register is
      banked per-CPU and determines which cache level CCSIDR reads.  If
      the task is migrated between the CSSELR write and the CCSIDR read,
      the CCSIDR value may be for an unexpected cache level (for example
      L1 instead of L2) and incorrect cache flushing could occur.
      
      Disable interrupts across the write and the read so that the correct
      cache attributes are read and used by the cache flushing routine.
      We disable interrupts rather than preemption because the critical
      section is only 3 instructions long and we want to be able to call
      v7_dcache_flush_all from __v7_setup, which doesn't have a full
      kernel stack with a struct thread_info.
      
      This fixes a problem we see in scm_call() when flush_cache_all() is
      called from a preemptible context, where the L2 cache was sometimes
      not properly flushed out.
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  14. 17 September 2011 (1 commit)
  15. 07 July 2011 (1 commit)
  16. 26 May 2011 (1 commit)
  17. 31 March 2011 (1 commit)
  18. 13 December 2010 (1 commit)
  19. 05 October 2010 (2 commits)
  20. 21 May 2010 (1 commit)
  21. 08 May 2010 (1 commit)
  22. 15 February 2010 (3 commits)
  23. 14 December 2009 (1 commit)
  24. 07 October 2009 (1 commit)
  25. 24 July 2009 (1 commit)
  26. 06 November 2008 (1 commit)
  27. 01 September 2008 (1 commit)
  28. 09 May 2007 (1 commit)