1. 10 Jul, 2013 1 commit
  2. 24 Jun, 2013 1 commit
  3. 20 Jun, 2013 1 commit
    • ARM: kernel: build MPIDR hash function data structure · 8cf72172
      Lorenzo Pieralisi committed
      On ARM SMP systems, cores are identified by their MPIDR register.
      The MPIDR guidelines in the ARM ARM do not provide strict enforcement of
      MPIDR layout, only recommendations that, if followed, split the MPIDR
      on ARM 32 bit platforms in three affinity levels. In multi-cluster
      systems like big.LITTLE, if the affinity guidelines are followed, the
      MPIDR can not be considered an index anymore. This means that the
      association between logical CPU in the kernel and the HW CPU identifier
      becomes somewhat more complicated requiring methods like hashing to
      associate a given MPIDR to a CPU logical index, in order for the look-up
      to be carried out in an efficient and scalable way.
      
      This patch provides a function in the kernel that, starting from the
      cpu_logical_map, implements collision-free hashing of MPIDR values by checking
      all significant bits of the MPIDR affinity level bitfields. The hashing
      can then be carried out through bit shifting and ORing; the resulting
      hash algorithm is a collision-free though not minimal hash that can be
      executed with few assembly instructions. The MPIDR is filtered through an
      MPIDR mask that is built by checking all bits that toggle in the set of
      MPIDRs corresponding to possible CPUs. Bits that do not toggle do not carry
      information so they do not contribute to the resulting hash.
      
      Pseudo code:
      
      /* check all bits that toggle, so they are required */
      for (i = 1, mpidr_mask = 0; i < num_possible_cpus(); i++)
      	mpidr_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0));
      
      /*
       * Build shifts to be applied to aff0, aff1, aff2 values to hash the mpidr
       * fls() returns the last bit set in a word, 0 if none
       * ffs() returns the first bit set in a word, 0 if none
       */
      fs0 = mpidr_mask[7:0] ? ffs(mpidr_mask[7:0]) - 1 : 0;
      fs1 = mpidr_mask[15:8] ? ffs(mpidr_mask[15:8]) - 1 : 0;
      fs2 = mpidr_mask[23:16] ? ffs(mpidr_mask[23:16]) - 1 : 0;
      ls0 = fls(mpidr_mask[7:0]);
      ls1 = fls(mpidr_mask[15:8]);
      ls2 = fls(mpidr_mask[23:16]);
      bits0 = ls0 - fs0;
      bits1 = ls1 - fs1;
      bits2 = ls2 - fs2;
      aff0_shift = fs0;
      aff1_shift = 8 + fs1 - bits0;
      aff2_shift = 16 + fs2 - (bits0 + bits1);
      u32 hash(u32 mpidr) {
      	u32 l0, l1, l2;
      	u32 mpidr_masked = mpidr & mpidr_mask;
      	l0 = mpidr_masked & 0xff;
      	l1 = mpidr_masked & 0xff00;
      	l2 = mpidr_masked & 0xff0000;
      	return (l0 >> aff0_shift | l1 >> aff1_shift | l2 >> aff2_shift);
      }
      
      The hashing algorithm relies on the inherent properties set in the ARM ARM
      recommendations for the MPIDR. Exotic configurations, where for instance the
      MPIDR values at a given affinity level have large holes, can end up requiring
      big hash tables since the compression of values that can be achieved through
      shifting is somewhat crippled when holes are present. Kernel warns if
      the number of buckets of the resulting hash table exceeds the number of
      possible CPUs by a factor of 4, which is a symptom of a very sparse HW
      MPIDR configuration.
      
      The hash algorithm is quite simple and can easily be implemented in assembly
      code, to be used in code paths where the kernel virtual address space is
      not set up (i.e. cpu_resume) and instruction and data fetches are strongly
      ordered, so code must be compact and must carry out few data accesses.
      
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Colin Cross <ccross@android.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Amit Kucheria <amit.kucheria@linaro.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Dave Martin <Dave.Martin@arm.com>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Tested-by: Shawn Guo <shawn.guo@linaro.org>
      Tested-by: Kevin Hilman <khilman@linaro.org>
      Tested-by: Stephen Warren <swarren@wwwdotorg.org>
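The pseudo code in the commit message above can be turned into a small userspace sketch. This is not the kernel implementation: the names `mpidr_hash`, `mpidr_hash_build`, `mpidr_hash_index`, `mpidr_hash_size`, `bit_ffs` and `bit_fls` are hypothetical, and the logical map is passed in as a plain array instead of being read through cpu_logical_map().

```c
#include <assert.h>
#include <stdint.h>

/* 1-based position of the least significant set bit, 0 if none. */
static int bit_ffs(uint32_t x)
{
	int pos = 1;

	if (!x)
		return 0;
	while (!(x & 1)) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/* 1-based position of the most significant set bit, 0 if none. */
static int bit_fls(uint32_t x)
{
	int pos = 0;

	while (x) {
		x >>= 1;
		pos++;
	}
	return pos;
}

/* Hypothetical container for the precomputed hash parameters. */
struct mpidr_hash {
	uint32_t mask;         /* bits that toggle across all MPIDRs */
	unsigned shift_aff[3]; /* right shift applied to aff0/aff1/aff2 */
	unsigned bits;         /* total number of hash bits */
};

static void mpidr_hash_build(struct mpidr_hash *h,
			     const uint32_t *map, int ncpus)
{
	uint32_t mask = 0;
	int fs[3], nbits[3];

	assert(ncpus > 0);
	/* A bit is required only if it differs from CPU0 on some CPU. */
	for (int i = 1; i < ncpus; i++)
		mask |= map[i] ^ map[0];

	for (int lvl = 0; lvl < 3; lvl++) {
		uint32_t aff = (mask >> (8 * lvl)) & 0xff;

		fs[lvl] = aff ? bit_ffs(aff) - 1 : 0;
		nbits[lvl] = bit_fls(aff) - fs[lvl];
	}

	h->mask = mask;
	h->shift_aff[0] = fs[0];
	h->shift_aff[1] = 8 + fs[1] - nbits[0];
	h->shift_aff[2] = 16 + fs[2] - (nbits[0] + nbits[1]);
	h->bits = nbits[0] + nbits[1] + nbits[2];
}

/* Number of buckets a table indexed by the hash would need. */
static uint32_t mpidr_hash_size(const struct mpidr_hash *h)
{
	return 1u << h->bits;
}

static uint32_t mpidr_hash_index(const struct mpidr_hash *h, uint32_t mpidr)
{
	uint32_t m = mpidr & h->mask;

	return ((m & 0xff) >> h->shift_aff[0]) |
	       ((m & 0xff00) >> h->shift_aff[1]) |
	       ((m & 0xff0000) >> h->shift_aff[2]);
}
```

For a two-cluster layout such as {0x000, 0x001, 0x100, 0x101}, the toggling bits are bit 0 and bit 8, so the four MPIDRs hash to indices 0 to 3 in a four-bucket table. Comparing `mpidr_hash_size()` against four times the CPU count would correspond to the sparsity warning the message describes.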
  4. 30 May, 2013 1 commit
  5. 21 May, 2013 2 commits
  6. 16 May, 2013 1 commit
    • ARM: 7669/1: keep __my_cpu_offset consistent with generic one · 9394c1c6
      Ming Lei committed
      Commit 14318efb (ARM: 7587/1: implement optimized percpu variable access)
      introduced ARM's __my_cpu_offset to optimize percpu variable access,
      which works well on hackbench, but causes __my_cpu_offset
      to return a garbage value before it is initialized in cpu_init(), called
      by setup_arch(), so accessing a percpu variable before setup_arch() may cause
      a kernel hang. The generic __my_cpu_offset, by contrast, always returns zero
      before the percpu area is brought up, and won't hang the kernel.
      
      So the patch tries to clear __my_cpu_offset on boot CPU early
      to avoid boot hang.
      
      At least now percpu variable is accessed by lockdep before
      setup_arch(), and enabling CONFIG_LOCK_STAT or CONFIG_DEBUG_LOCKDEP
      can trigger kernel hang.
      Signed-off-by: Ming Lei <tom.leiming@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  7. 30 Apr, 2013 1 commit
    • ARM: default machine descriptor for multiplatform · 883a106b
      Arnd Bergmann committed
      Since we now have default implementations for init_time and init_irq,
      the init_machine callback is the only one that is not yet optional,
      but since simple DT based platforms all have the same
      of_platform_populate function call in there, we can consolidate them
      as well, and then actually boot with a completely empty machine_desc.
      Unfortunately we cannot just default to an empty init_machine: we
      cannot call of_platform_populate before init_machine because that
      does not work in case of auxdata, and we cannot call it after
      init_machine either because the machine might need to run code
      after adding the devices.
      
      To take the final step, this adds support for booting without defining
      any machine_desc whatsoever.
      
      For the case that CONFIG_MULTIPLATFORM is enabled, it adds a
      global machine descriptor that never matches any machine but is
      used as a fallback if nothing else matches. We assume that without
      CONFIG_MULTIPLATFORM, we only want to boot on the systems that the kernel
      is built for, so we still retain the build-time warning for missing
      machine descriptors and the run-time warning when the platform does not
      match in that case.
      
      In the case that we run on a multiplatform kernel and the machine
      provides a fully populated device tree, we attempt to keep booting,
      hoping that no machine specific callbacks are necessary.
      
      Finally, this also removes the misguided "select ARCH_VEXPRESS" that
      was only added to avoid a build error for allnoconfig kernels.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Acked-by: Olof Johansson <olof@lixom.net>
      Cc: "Russell King - ARM Linux" <linux@arm.linux.org.uk>
      Cc: Rob Herring <robherring2@gmail.com>
  8. 26 Apr, 2013 1 commit
  9. 18 Apr, 2013 1 commit
  10. 17 Apr, 2013 1 commit
  11. 03 Apr, 2013 1 commit
  12. 23 Mar, 2013 2 commits
  13. 01 Feb, 2013 1 commit
  14. 16 Dec, 2012 1 commit
  15. 03 Dec, 2012 1 commit
  16. 19 Nov, 2012 3 commits
    • ARM: kernel: add cpu logical map DT init in setup_arch · 5587164e
      Lorenzo Pieralisi committed
      As soon as the device tree is unflattened the cpu logical to physical
      mapping is carried out in setup_arch to build a proper array of MPIDR and
      corresponding logical indexes.
      
      The mapping could have been carried out using the flattened DT blob and
      related primitives, but since the mapping is not needed by early boot
      code it can safely be executed when the device tree has been uncompressed to
      its tree data structure.
      
      This patch adds the arm_dt_init_cpu_maps() function call in setup_arch().
      
      If the kernel is not compiled with DT support the function is empty and
      no logical mapping takes place through it; the mapping carried out in
      smp_setup_processor_id() is left unchanged.
      If DT is supported the mapping created in smp_setup_processor_id() is overridden.
      The DT mapping also sets the possible cpus mask, hence platform
      code need not set it again in the respective smp_init_cpus() functions.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Nicolas Pitre <nico@linaro.org>
    • ARM: kernel: smp_setup_processor_id() updates · cb8cf4f8
      Lorenzo Pieralisi committed
      This patch applies some basic changes to the smp_setup_processor_id()
      ARM implementation to make the code that builds cpu_logical_map more
      uniform across the kernel.
      
      The function now prints the full extent of the boot CPU MPIDR[23:0] and
      initializes the cpu_logical_map for CPUs up to nr_cpu_ids.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Nicolas Pitre <nico@linaro.org>
      Acked-by: Will Deacon <will.deacon@arm.com>
    • ARM: kernel: update cpuinfo to print all online CPUs features · b4b8f770
      Lorenzo Pieralisi committed
      Currently, reading /proc/cpuinfo provides userspace with CPU ID of
      the CPU carrying out the read from the file. This is fine as long as all
      CPUs in the system are the same. With the advent of big.LITTLE and
      heterogeneous ARM systems this approach provides user space with incorrect
      bits of information since CPU ids in the system might differ from the one
      provided by the CPU reading the file.
      
      This patch updates the cpuinfo show function so that a read from
      /proc/cpuinfo prints HW information for all online CPUs at once, mirroring
      x86 behaviour.
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Nicolas Pitre <nico@linaro.org>
  17. 19 Sep, 2012 1 commit
  18. 13 Sep, 2012 1 commit
    • ARM: SoC: add per-platform SMP operations · abcee5fb
      Marc Zyngier committed
      This adds a 'struct smp_operations' to abstract the CPU initialization
      and hot plugging functions on SMP systems, which otherwise conflict
      in a multiplatform kernel. This also helps shmobile and potentially
      others that have more than one method to do these.
      
      To allow the kernel to continue building, the platform hooks are
      defined as weak symbols which are overridden by the platform code.
      Once all platforms are converted, the "weak" attribute will be
      removed and the function made static.
      
      Unlike the original version from Marc, this new version from Arnd
      does not use a generalized abstraction for per-soc data structures
      but only tries to solve the problem for the SMP operations. This
      way, we can collapse the previous four data structures into a
      single struct, which is less systematic but also easier for a
      casual reader to follow.
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
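The idea of collapsing the platform SMP hooks into one struct of function pointers can be sketched in userspace C as follows. This is a simplified illustration, not the kernel's definition: only two hooks are shown, the fallback return values are assumptions, and `demo_boot_secondary`, `demo_last_cpu` and `demo_smp_ops` are hypothetical platform code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified per-platform operations table; a real one has more hooks. */
struct smp_operations {
	int (*smp_boot_secondary)(unsigned int cpu);
	int (*cpu_kill)(unsigned int cpu);
};

/* Zero-initialized: no platform has registered any hook yet. */
static struct smp_operations smp_ops;

/* Called once by the selected platform to install its hooks. */
static void smp_set_ops(const struct smp_operations *ops)
{
	assert(ops != NULL);
	smp_ops = *ops;
}

/* Generic code dispatches through the table and tolerates missing hooks. */
static int boot_secondary(unsigned int cpu)
{
	if (smp_ops.smp_boot_secondary)
		return smp_ops.smp_boot_secondary(cpu);
	return -ENOSYS; /* assumed fallback: operation not implemented */
}

static int platform_cpu_kill(unsigned int cpu)
{
	if (smp_ops.cpu_kill)
		return smp_ops.cpu_kill(cpu);
	return 1; /* assumed fallback: treat the CPU as successfully killed */
}

/* Hypothetical platform that only knows how to boot secondary CPUs. */
static unsigned int demo_last_cpu;

static int demo_boot_secondary(unsigned int cpu)
{
	demo_last_cpu = cpu;
	return 0;
}

static const struct smp_operations demo_smp_ops = {
	.smp_boot_secondary = demo_boot_secondary,
};
```

Because every platform fills in its own table, several platforms can be linked into one multiplatform image without their hook symbols clashing, which is the conflict the commit message refers to.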
  19. 04 Sep, 2012 1 commit
  20. 30 Jul, 2012 1 commit
  21. 21 May, 2012 1 commit
  22. 03 May, 2012 1 commit
  23. 16 Apr, 2012 1 commit
  24. 29 Mar, 2012 2 commits
  25. 24 Mar, 2012 1 commit
  26. 23 Jan, 2012 2 commits
  27. 20 Jan, 2012 2 commits
  28. 12 Dec, 2011 1 commit
  29. 09 Dec, 2011 1 commit
    • memblock: Fix include breakages caused by 24aa0788 · 1c16d242
      Tejun Heo committed
      24aa0788 (memblock, x86: Replace memblock_x86_reserve/free_range()
      with generic ones) removed arch/x86/include/asm/memblock.h and dropped
      its inclusion from include/linux/memblock.h which breaks other
      architectures which depended on the generic memblock.h pulling in the
      arch specific one.
      
      However, the proper fix isn't adding back the asm inclusion.  memblock
      doesn't have any arch dependent part and doesn't need arch specific
      header file and asm/memblock.h files are either practically empty or
      contain mostly unrelated arch specific stuff.
      
      * In microblaze, sh, powerpc, sparc and openrisc, asm/memblock.h is
        either empty or just contains unused MEMBLOCK_DBG() macro.  Remove
        them.
      
      * In arm and unicore32, asm/memblock.h contains arch specific stuff.
        Include it directly from its users.  It might be a good idea to
        rename the header file to avoid confusion.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
  30. 06 Dec, 2011 1 commit
    • ARM: 7187/1: fix unwinding for XIP kernels · de66a979
      Uwe Kleine-König committed
      The linker places the unwind tables in readonly sections. So when using
      an XIP kernel these are located in ROM and cannot be modified.
      For that reason the current approach to convert the relative offsets in
      the unwind index to absolute addresses early in the boot process doesn't
      work with XIP.
      
      The offsets in the unwind index section are signed 31 bit numbers and
      the structs are sorted by this offset. So it first has offsets between
      0x40000000 and 0x7fffffff (i.e. the negative offsets) and then offsets
      between 0x00000000 and 0x3fffffff. When separating these two blocks the
      numbers are sorted even when interpreting the offsets as unsigned longs.
      
      So determine the first non-negative entry once and track that using the
      new origin pointer. The actual bisection can then use a plain unsigned
      long comparison. The only thing that makes the new bisection more
      complicated is that the offsets are relative to their position in the
      index section, so the key to search needs to be adapted accordingly in
      each step.
      
      Moreover, several consts are added to catch future writes, and the
      member "addr" of struct unwind_idx is renamed to "addr_offset" to better
      match the new semantics. (This has the additional benefit of breaking any
      remaining users at compile time to make them aware of the change.)
      
      In my tests the new algorithm was a tad faster than the original and has
      the additional upside of not needing the initial conversion and so saves
      some boot time and it's possible to unwind even earlier.
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
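The two-block layout described in the commit message above can be illustrated with a short sketch: decoding a signed 31-bit offset, and finding the first non-negative entry by bisection without writing to the table. It does not reproduce the full address lookup with its per-step key adjustment. The function names `prel31_decode` and `find_origin` are hypothetical, and the struct is reduced to the fields needed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced index entry: a 31-bit place-relative offset plus unwind data. */
struct unwind_idx {
	uint32_t addr_offset;
	uint32_t insn;
};

/* Sign-extend a 31-bit offset (bit 30 is the sign bit) to 32 bits. */
static int32_t prel31_decode(uint32_t word)
{
	uint32_t v = word & 0x7fffffffu;

	/* Flip the sign bit, then subtract its weight: portable sign extension. */
	return (int32_t)(v ^ 0x40000000u) - 0x40000000;
}

/*
 * Return the index of the first entry with a non-negative offset, or n if
 * all offsets are negative. Raw values at or above 0x40000000 are negative,
 * and they come first because the table is sorted by signed offset.
 */
static size_t find_origin(const struct unwind_idx *table, size_t n)
{
	size_t lo = 0, hi = n;

	assert(table != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].addr_offset >= 0x40000000u)
			lo = mid + 1; /* negative offset: origin is to the right */
		else
			hi = mid;
	}
	return hi;
}
```

Once the origin is known, each half of the table is ordered under plain unsigned comparison, so a later search only has to pick the right half before bisecting.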
  31. 19 Nov, 2011 1 commit
    • ARM: sort the meminfo array earlier · 27a3f0e9
      Nicolas Pitre committed
      The meminfo array has to be sorted before sanity_check_meminfo() in
      arch/arm/mm/mmu.c is called for it to work properly.  This also allows
      for a simpler find_limits() in arch/arm/mm/init.c.
      
      The sort is moved to arch/arm/kernel/setup.c because that's where the
      meminfo array is populated.  Eventually this should be improved upon
      to make the memory bank parser a bit more robust against problems
      such as overlapping memory ranges.
      Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
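A minimal userspace sketch of the ordering step described above, assuming a reduced `struct membank` with only a start address and a size. It uses the C library's qsort() rather than the kernel's sort helper, and `sort_banks` and `banks_overlap` are hypothetical names. The overlap check is an illustration of the robustness improvement the message leaves for later, not something the commit adds.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced memory bank descriptor: physical start address and length. */
struct membank {
	uint64_t start;
	uint64_t size;
};

/* Order banks by ascending start address without risking overflow. */
static int membank_cmp(const void *pa, const void *pb)
{
	const struct membank *a = pa, *b = pb;

	return (a->start > b->start) - (a->start < b->start);
}

static void sort_banks(struct membank *banks, size_t n)
{
	assert(banks != NULL || n == 0);
	qsort(banks, n, sizeof(banks[0]), membank_cmp);
}

/* Return 1 if two neighbouring banks overlap; the input must be sorted. */
static int banks_overlap(const struct membank *banks, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (banks[i - 1].start + banks[i - 1].size > banks[i].start)
			return 1;
	return 0;
}
```

With the array sorted first, later passes only ever need to compare each bank with its immediate neighbour, which is what makes a simpler limit search possible.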
  32. 12 Nov, 2011 1 commit
  33. 11 Nov, 2011 1 commit