1. 06 Sep, 2016 1 commit
    • ARM: 8611/1: l2x0: add PMU support · b828f960
      Mark Rutland committed
      The L2C-220 (AKA L220) and L2C-310 (AKA PL310) cache controllers feature
      a Performance Monitoring Unit (PMU), which can be useful for tuning
      and/or debugging. This hardware is always present and the relevant
      registers are accessible to non-secure accesses. Thus, no special
      firmware interface is necessary.
      
      This patch adds support for the PMU, plugging into the usual perf
      infrastructure. The overflow interrupt is not always available (e.g. on
      RealView PBX A9 it is not wired up at all), and the hardware counters
      saturate, so the driver does not make use of this. Instead, the driver
      periodically polls and resets counters as required to avoid losing
      events due to saturation.
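
The polling scheme described above can be sketched as follows. This is a minimal model, not the driver's code: `hw_counter`, `sw_event`, `hw_count()` and `poll_and_reset()` are hypothetical names, and the hardware counter is simulated as a 32-bit value that sticks at its maximum instead of wrapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one saturating 32-bit hardware event counter. */
struct hw_counter {
    uint32_t value;              /* sticks at UINT32_MAX once reached */
};

/* Hypothetical software state kept per event. */
struct sw_event {
    uint64_t total;              /* 64-bit running total */
};

/* Model the hardware counting n events, saturating instead of wrapping. */
static void hw_count(struct hw_counter *hw, uint32_t n)
{
    uint32_t room = UINT32_MAX - hw->value;

    hw->value = (n > room) ? UINT32_MAX : hw->value + n;
}

/*
 * Poll step: fold the hardware value into the software total, then reset
 * the hardware counter so that it is far from saturation again.
 */
static void poll_and_reset(struct hw_counter *hw, struct sw_event *ev)
{
    assert(hw != NULL && ev != NULL);
    ev->total += hw->value;
    hw->value = 0;
}
```

As long as `poll_and_reset()` runs more often than the counter can fill up, the 64-bit total stays exact without relying on an overflow interrupt. Once the counter has saturated, the events beyond the limit are lost, which is why the poll period matters.
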
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Pawel Moll <pawel.moll@arm.com>
      Tested-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  2. 12 Aug, 2016 2 commits
  3. 15 Jul, 2016 1 commit
  4. 06 May, 2016 1 commit
  5. 22 Dec, 2015 1 commit
    • ARM: 8482/1: l2x0: make it possible to disable outer sync from DT · 36f46d6d
      Linus Walleij committed
      According to commit 2503a5ec
      ("ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore
      boards with L220"), some PB11MPCore RealView core tiles have broken
      outer_sync.
      
      We got rid of the custom barriers from the machine by disabling
      outer sync, but that was just for the boardfile case. We have
      to be able to do the same in the device tree case.
      
      Since __l2c_init() is cloning and copying the L2C vtable,
      we pass an argument to this function to optionally numb
      the outer sync operation if desired, before initializing
      the cache.
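
The clone-then-modify approach can be illustrated with a small sketch. The names here (`outer_ops`, `template_ops`, `live_ops`, `init_ops()`, `do_sync()`) are hypothetical and reduced to two operations; this is not the actual `__l2c_init()` code, only the pattern it is described as using.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the outer cache operation table. */
struct outer_ops {
    void (*flush_all)(void);
    void (*sync)(void);
};

static void demo_flush_all(void) { }
static void demo_sync(void) { }

/* Read-only template, comparable to a per-controller vtable. */
static const struct outer_ops template_ops = {
    .flush_all = demo_flush_all,
    .sync      = demo_sync,
};

/* The live copy that callers dispatch through. */
static struct outer_ops live_ops;

/* Copy the template, then optionally drop the sync hook before use. */
static void init_ops(const struct outer_ops *tmpl, bool nosync)
{
    assert(tmpl != NULL);
    live_ops = *tmpl;
    if (nosync)
        live_ops.sync = NULL;
}

/* Callers treat a NULL hook as "nothing to do". */
static void do_sync(void)
{
    if (live_ops.sync)
        live_ops.sync();
}
```

Because only the copy is changed, the template can stay `const` and be shared by boards that do need the sync operation.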
      
      After this we can set up the cache correctly on the RealView
      PB11MPCore. This was tested on a PB11MPCore known to have the
      issue. Before this, spurious crashes would occur if we tried to
      set up the cache properly; after this it boots rock solid.
      
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: devicetree@vger.kernel.org
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  6. 17 Nov, 2015 1 commit
  7. 10 Jul, 2015 1 commit
    • ARM: 8395/1: l2c: Add support for the "arm,shared-override" property · eeedcea6
      Geert Uytterhoeven committed
      "CoreLink Level 2 Cache Controller L2C-310", p. 2-15, section 2.3.2
      "Shareable attribute" states:
      
          "The default behavior of the cache controller with respect to the
           shareable attribute is to transform Normal Memory Non-cacheable
           transactions into:
              - cacheable no allocate for reads
              - write through no write allocate for writes."
      
      Depending on the system architecture, this may cause memory corruption
      in the presence of bus mastering devices (e.g. OHCI). To avoid such
      corruption, the default behavior can be disabled by setting the Shared
      Override bit in the Auxiliary Control register.
      
      Currently the Shared Override bit can be set only using C code:
        - by calling l2x0_init() directly, which is deprecated,
        - by setting/clearing the bit in the machine_desc.l2c_aux_val/mask
          fields, but using values differing from 0/~0 is also deprecated.
      
      Hence add support for an "arm,shared-override" device tree property for
      the l2c device node. By specifying this property, affected systems can
      indicate that non-cacheable transactions must not be transformed.
      Then, it's up to the OS to decide. The current behavior is to set the
      "shared attribute override enable" bit, as there may exist kernel linear
      mappings and cacheable aliases for the DMA buffers, even if CMA is
      enabled.
      
      See also commit 1a8e41cd ("ARM: 6395/1: VExpress: Set bit 22 in
      the PL310 (cache controller) AuxCtlr register"):
      
          "Clearing bit 22 in the PL310 Auxiliary Control register (shared
           attribute override enable) has the side effect of transforming
           Normal Shared Non-cacheable reads into Cacheable no-allocate reads.
      
           Coherent DMA buffers in Linux always have a Cacheable alias via the
           kernel linear mapping and the processor can speculatively load
           cache lines into the PL310 controller. With bit 22 cleared,
           Non-cacheable reads would unexpectedly hit such cache lines leading
           to buffer corruption."
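
As a rough illustration of what the property does to the register value, the sketch below sets bit 22 of an Auxiliary Control value when the property is present. The bit position comes from the commit text quoted above; the macro and the helper `apply_shared_override()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 22 of the Auxiliary Control register, per the commit text above. */
#define AUX_CTRL_SHARED_OVERRIDE (UINT32_C(1) << 22)

static_assert(AUX_CTRL_SHARED_OVERRIDE == UINT32_C(0x00400000),
              "shared attribute override enable is bit 22");

/*
 * Hypothetical helper: return the aux control value to program, with the
 * shared attribute override enable bit set when the DT property is present.
 */
static uint32_t apply_shared_override(uint32_t aux, bool has_property)
{
    if (has_property)
        aux |= AUX_CTRL_SHARED_OVERRIDE;
    return aux;
}
```

With the bit set, Normal Memory Non-cacheable transactions are left untransformed, so they cannot hit stale lines that were loaded speculatively through a cacheable alias.
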
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  8. 11 Jun, 2015 1 commit
  9. 16 May, 2015 2 commits
  10. 15 May, 2015 3 commits
  11. 18 Mar, 2015 1 commit
  12. 10 Mar, 2015 1 commit
  13. 07 Feb, 2015 2 commits
    • ARM: 8297/1: cache-l2x0: optimize aurora range operations · 1d889679
      Arnd Bergmann committed
      The aurora_inv_range(), aurora_clean_range() and aurora_flush_range()
      functions are highly redundant, both in source and in object code, and
      they are harder to understand than necessary.
      
      By moving the range loop into the aurora_pa_range() function, they
      become trivial wrappers, and the object code starts looking like what
      one would expect for an optimal implementation.
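
The refactoring idea can be sketched like this. It is a simplified model rather than the aurora code: the line size, the chunk limit, the `range_op` selector and the simulated `issue_range()` are assumptions made for illustration, and wraparound at the top of the address space is not handled.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SIZE      32u    /* assumed cache line size in bytes */
#define MAX_RANGE_SIZE 1024u  /* assumed largest span issued per operation */

/* Hypothetical operation selectors standing in for register offsets. */
enum range_op { OP_INVALIDATE, OP_CLEAN, OP_FLUSH };

/* Simulated hardware: record what was issued instead of writing registers. */
static unsigned int issued_ops;
static enum range_op last_op;

static void issue_range(enum range_op op, uint32_t start, uint32_t last_line)
{
    assert(start <= last_line);
    last_op = op;
    issued_ops++;
}

/* Shared loop: align to cache lines, then walk the span in bounded chunks. */
static void pa_range(uint32_t start, uint32_t end, enum range_op op)
{
    start &= ~(LINE_SIZE - 1);
    end = (end + LINE_SIZE - 1) & ~(LINE_SIZE - 1);

    while (start < end) {
        uint32_t chunk_end = start + MAX_RANGE_SIZE;

        if (chunk_end > end)
            chunk_end = end;
        issue_range(op, start, chunk_end - LINE_SIZE);
        start = chunk_end;
    }
}

/* With the loop shared, each public operation is a one-line wrapper. */
static void inv_range(uint32_t s, uint32_t e)   { pa_range(s, e, OP_INVALIDATE); }
static void clean_range(uint32_t s, uint32_t e) { pa_range(s, e, OP_CLEAN); }
static void flush_range(uint32_t s, uint32_t e) { pa_range(s, e, OP_FLUSH); }
```

Keeping alignment and chunking in one place means the three wrappers differ only in the operation they pass down.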
      
      Further optimization may be possible by using the per-CPU "virtual"
      registers to avoid the spinlocks in most cases.
      
       (on Armada 370 RD and Armada XP GP, boot tested, plus a little bit of
       DMA traffic by reading data from a SD card)
      Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8296/1: cache-l2x0: clean up aurora cache handling · 20e783e3
      Arnd Bergmann committed
      The aurora cache controller is the only remaining user of a couple
      of functions in this file, which are completely unused when aurora
      support is disabled, leading to build warnings:
      
      arch/arm/mm/cache-l2x0.c:167:13: warning: 'l2x0_cache_sync' defined but not used [-Wunused-function]
      arch/arm/mm/cache-l2x0.c:184:13: warning: 'l2x0_flush_all' defined but not used [-Wunused-function]
      arch/arm/mm/cache-l2x0.c:194:13: warning: 'l2x0_disable' defined but not used [-Wunused-function]
      
      With the knowledge that the code is now aurora-specific, we can
      simplify it noticeably:
      
      - The pl310 errata workarounds are not needed on aurora and can be removed
      - As confirmed by Thomas Petazzoni from the data sheet, the cache_wait()
        macro is never needed.
      - No need to hold the lock across atomic cache sync
      - We can load the l2x0_base into a local variable across operations
      
      There should be no functional change in this patch, but readability
      and the generated object code improve, along with avoiding the
      warnings.
      
       (on Armada 370 RD and Armada XP GP, boot tested, plus a little bit of
       DMA traffic by reading data from a SD card)
      Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  14. 20 Jan, 2015 2 commits
  15. 16 Jan, 2015 4 commits
  16. 30 Oct, 2014 1 commit
  17. 29 Oct, 2014 1 commit
  18. 03 Oct, 2014 1 commit
    • ARM: 8169/1: l2c: parse cache properties from ePAPR definitions · f3354ab6
      Linus Walleij committed
      When both 'cache-size' and 'cache-sets' are specified for an L2 cache
      controller node, parse those properties and set up the
      set size based on which type of L2 cache controller we are using.
      
      Update the L2 cache controller Device Tree binding with the optional
      'cache-size', 'cache-sets', 'cache-block-size' and 'cache-line-size'
      properties. These come from the ePAPR specification.
      
      Using the cache size, number of sets and cache line size we can
      calculate desired associativity of the L2 cache. This is done
      by the calculation:
      
          set size = cache size / sets
          ways = set size / line size
          way size = cache size / ways = sets * line size
          associativity = cache size / way size
      
      For example, a PB1176 DT node that looks like this:
      
      L2: l2-cache {
          compatible = "arm,l220-cache";
          (...)
          arm,override-auxreg;
          cache-size = <131072>; // 128kB
          cache-sets = <512>;
          cache-line-size = <32>;
      };
      
      Ends up like this:
      
      L2C OF: override cache size: 131072 bytes (128KB)
      L2C OF: override line size: 32 bytes
      L2C OF: override way size: 16384 bytes (16KB)
      L2C OF: override associativity: 8
      L2C: DT/platform modifies aux control register: 0x02020fff -> 0x02030fff
      L2C-220 cache controller enabled, 8 ways, 128 kB
      L2C-220: CACHE_ID 0x41000486, AUX_CTRL 0x06030fff
      
      Which is consistent with the value earlier hardcoded for the
      PB1176 platform.
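
The calculation can be checked against the PB1176 example with a short sketch. The struct and function names are hypothetical; only the four formulas are taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the derived geometry. */
struct cache_geometry {
    uint32_t set_size;       /* bytes per set */
    uint32_t ways;
    uint32_t way_size;       /* bytes per way */
    uint32_t associativity;
};

/* Apply the four formulas from the commit message, in order. */
static struct cache_geometry derive_geometry(uint32_t cache_size,
                                             uint32_t sets,
                                             uint32_t line_size)
{
    struct cache_geometry g;

    assert(cache_size != 0 && sets != 0 && line_size != 0);
    g.set_size = cache_size / sets;
    g.ways = g.set_size / line_size;
    assert(g.ways != 0);
    g.way_size = cache_size / g.ways;
    g.associativity = cache_size / g.way_size;
    return g;
}
```

For the node above, `derive_geometry(131072, 512, 32)` gives a set size of 256 bytes, 8 ways, a way size of 16384 bytes and an associativity of 8, matching the "override way size" and "override associativity" lines in the log.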
      
      This patch is an extended version based on the initial patch
      by Florian Fainelli.
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  19. 18 Jul, 2014 1 commit
    • ARM: make it easier to check the CPU part number correctly · af040ffc
      Russell King committed
      Ensure that platform maintainers check the CPU part number in the right
      manner: the CPU part number is meaningless without also checking the
      CPU implement(e|o)r (choose your preferred spelling!)  Provide an
      interface which returns both the implementer and part number together,
      and update the definitions to include the implementer.
      
      Mark the old function as being deprecated... indeed, using the old
      function with the definitions will now always evaluate as false, so
      people must update their un-merged code to the new function.  While
      this could be avoided by adding new definitions, we'd also have to
      create new names for them which would be awkward.
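
The sketch below shows why an old-style, part-number-only check can never match a definition that includes the implementer. The mask values follow the Main ID Register layout (implementer in bits [31:24], part number in bits [15:4]); the macro and function names are hypothetical, and 0x413fc090 is used as a sample ID value for a Cortex-A9 r3p0.

```c
#include <assert.h>
#include <stdint.h>

/* Main ID Register: implementer in bits [31:24], part number in [15:4]. */
#define PART_ONLY_MASK  UINT32_C(0x0000fff0)
#define IMPL_PART_MASK  UINT32_C(0xff00fff0)

/* Hypothetical definition that includes the implementer (0x41 = ARM Ltd). */
#define DEMO_PART_CORTEX_A9 UINT32_C(0x4100c090)

static_assert((DEMO_PART_CORTEX_A9 & ~IMPL_PART_MASK) == 0,
              "definition must only use implementer and part bits");

/* Old-style accessor: part number only, implementer ignored. */
static uint32_t part_number_only(uint32_t midr)
{
    return midr & PART_ONLY_MASK;
}

/* New-style accessor: implementer and part number together. */
static uint32_t implementer_and_part(uint32_t midr)
{
    return midr & IMPL_PART_MASK;
}
```

Since `part_number_only()` always returns a value with the top byte clear, comparing it with `DEMO_PART_CORTEX_A9` is always false, which is the behavior the commit message warns about.
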
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  20. 08 Jul, 2014 1 commit
  21. 29 Jun, 2014 1 commit
    • ARM: 8076/1: mm: add support for HW coherent systems in PL310 cache · 98ea2dba
      Thomas Petazzoni committed
      When a PL310 cache is used on a system that provides hardware
      coherency, the outer cache sync operation is useless, and can be
      skipped. Moreover, on some systems, it is harmful as it causes
      deadlocks between the Marvell coherency mechanism, the Marvell PCIe
      controller and the Cortex-A9.
      
      To avoid this, this commit introduces a new Device Tree property
      'arm,io-coherent' for the L2 cache controller node, valid only for the
      PL310 cache. It identifies the usage of the PL310 cache in an I/O
      coherent configuration. Internally, it makes the driver disable the
      outer cache sync operation.
      
      Note that technically speaking, a fully coherent system wouldn't
      require any of the other .outer_cache operations. However, in
      practice, when booting secondary CPUs, these are not yet coherent, and
      therefore a set of cache maintenance operations are necessary at this
      point. This explains why we keep the other .outer_cache operations and
      only ->sync is disabled.
      
      While in theory any write to a PL310 register could cause the
      deadlock, in practice, disabling ->sync is sufficient to work around
      the deadlock, since the other cache maintenance operations are only
      used in very specific situations.
      
      Contrary to previous versions of this patch, this new version does not
      simply NULL-ify the ->sync member, because the l2c_init_data
      structures are now 'const' and therefore cannot be modified, which is
      a good thing. Therefore, this patch introduces a separate
      l2c_init_data instance, called of_l2c310_coherent_data.
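
The "separate const instance" pattern can be sketched as below. The struct is a reduced, hypothetical stand-in for `l2c_init_data`, and `select_data()` and `run_sync()` are illustrative helpers, not functions from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the per-controller init data. */
struct init_data {
    const char *type;
    void (*flush_all)(void);
    void (*sync)(void);          /* NULL means "skip outer sync" */
};

static void demo_flush_all(void) { }
static void demo_sync(void) { }

/* Regular variant: all operations present. */
static const struct init_data demo_data = {
    .type      = "demo",
    .flush_all = demo_flush_all,
    .sync      = demo_sync,
};

/* I/O-coherent variant: same table, sync hook left NULL. */
static const struct init_data demo_coherent_data = {
    .type      = "demo (coherent)",
    .flush_all = demo_flush_all,
};

/* Pick the table at probe time; neither table is ever written to. */
static const struct init_data *select_data(bool io_coherent)
{
    return io_coherent ? &demo_coherent_data : &demo_data;
}

/* Dispatch helper that tolerates a missing sync hook. */
static void run_sync(const struct init_data *d)
{
    assert(d != NULL);
    if (d->sync)
        d->sync();
}
```

Selecting between two read-only tables costs a few extra bytes of data but keeps every table `const`, instead of patching a shared one at run time.
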
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  22. 30 May, 2014 10 commits