1. 25 Mar 2011, 1 commit
    • percpu: Always align percpu output section to PAGE_SIZE · 0415b00d
      Authored by Tejun Heo
      The percpu allocator honors alignment requests up to PAGE_SIZE, and both
      the percpu addresses in the percpu address space and the translated
      kernel addresses should be aligned accordingly.  The calculation of the
      former depends on the alignment of the percpu output section in the
      kernel image.
      
      The linker script macros PERCPU_VADDR() and PERCPU() are used to
      define this output section, and the latter takes an @align parameter.
      Several architectures use an @align smaller than PAGE_SIZE, which
      breaks percpu memory alignment.
      
      This patch removes the @align parameter from PERCPU(), renames it to
      PERCPU_SECTION() and makes it always align to PAGE_SIZE.  While at it,
      add PCPU_SETUP_BUG_ON() checks so that alignment problems are reliably
      detected, and remove the percpu alignment comment recently added in
      workqueue.c, as the condition would trigger a BUG well before reaching
      that point.
      
      For um, this patch raises the alignment of percpu area.  As the area
      is in .init, there shouldn't be any noticeable difference.
      
      This problem was discovered by David Howells while debugging boot
      failure on mn10300.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Mike Frysinger <vapier@gentoo.org>
      Cc: uclinux-dist-devel@blackfin.uclinux.org
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      0415b00d
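      
      The alignment invariant this commit enforces can be pictured with a small
      user-space sketch; the 4 KiB PAGE_SIZE, the function name and its
      parameters are illustrative assumptions, not the kernel's actual
      PCPU_SETUP_BUG_ON() interface:
      
        #include <assert.h>
        #include <stdint.h>
        
        #define PAGE_SIZE 4096UL  /* assumed 4 KiB pages for illustration */
        
        /* The percpu base address and every per-unit offset must be page
         * aligned, otherwise a PAGE_SIZE-aligned allocation in the percpu
         * address space translates to a misaligned kernel address. */
        static void check_percpu_alignment(uintptr_t base,
                                           const uintptr_t *unit_off,
                                           unsigned int nr_units)
        {
                unsigned int i;
        
                assert((base & (PAGE_SIZE - 1)) == 0);
                for (i = 0; i < nr_units; i++)
                        assert((unit_off[i] & (PAGE_SIZE - 1)) == 0);
        }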
  2. 22 Feb 2011, 2 commits
  3. 18 Feb 2011, 1 commit
    • ARM: P2V: introduce phys_to_virt/virt_to_phys runtime patching · dc21af99
      Authored by Russell King
      This idea came from Nicolas; Eric Miao produced an initial version,
      which was then rewritten into this patch.
      
      Patch the physical-to-virtual translations at runtime.  Because we
      modify the code, this is incompatible with XIP kernels, but it allows
      us to achieve the translation with minimal loss of performance.
      
      As many translations are of the form:
      
      	physical = virtual + (PHYS_OFFSET - PAGE_OFFSET)
      	virtual = physical - (PHYS_OFFSET - PAGE_OFFSET)
      
      we generate an 'add' instruction for __virt_to_phys(), and a 'sub'
      instruction for __phys_to_virt().  We calculate at run time (PHYS_OFFSET
      - PAGE_OFFSET) by comparing the address prior to MMU initialization with
      where it should be once the MMU has been initialized, and place this
      constant into the above add/sub instructions.
      
      Once we have (PHYS_OFFSET - PAGE_OFFSET), we can calculate the real
      PHYS_OFFSET as PAGE_OFFSET is a build-time constant, and save this for
      the C-mode PHYS_OFFSET variable definition to use.
      
      At present, we are unable to support Realview with Sparsemem enabled
      as this uses a complex mapping function, and MSM as this requires a
      constant which will not fit in our math instruction.
      
      Add a module version magic string for this feature to prevent
      incompatible modules from being loaded.
      Tested-by: Tony Lindgren <tony@atomide.com>
      Reviewed-by: Nicolas Pitre <nicolas.pitre@linaro.org>
      Tested-by: Nicolas Pitre <nicolas.pitre@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      dc21af99
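      
      The arithmetic described above can be sketched in plain C; the
      PAGE_OFFSET value, the helper names and the way the delta is obtained
      are simplified assumptions, not the actual ARM implementation (which
      patches the constant directly into the add/sub instructions):
      
        #define PAGE_OFFSET 0xC0000000UL   /* assumed build-time virtual base */
        
        /* Compare the address the kernel actually runs at before the MMU is
         * on with the address it should occupy afterwards; the difference is
         * the constant that gets patched into the add/sub instructions. */
        static unsigned long p2v_delta(unsigned long running_addr,
                                       unsigned long expected_virt)
        {
                return running_addr - expected_virt;  /* PHYS_OFFSET - PAGE_OFFSET */
        }
        
        static unsigned long virt_to_phys_sketch(unsigned long virt,
                                                 unsigned long delta)
        {
                return virt + delta;                  /* the patched 'add' */
        }
        
        static unsigned long phys_to_virt_sketch(unsigned long phys,
                                                 unsigned long delta)
        {
                return phys - delta;                  /* the patched 'sub' */
        }
        
        static unsigned long phys_offset_value(unsigned long delta)
        {
                return PAGE_OFFSET + delta;           /* the runtime PHYS_OFFSET */
        }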
  4. 25 Jan 2011, 1 commit
    • percpu: align percpu readmostly subsection to cacheline · 19df0c2f
      Authored by Tejun Heo
      Currently the percpu readmostly subsection may share cachelines with
      other percpu subsections, which can result in unnecessary cacheline
      bouncing and performance degradation.
      
      This patch adds a @cacheline parameter to the PERCPU() and PERCPU_VADDR()
      linker macros, and makes each arch linker script specify its cacheline
      size and use it to align percpu subsections.
      
      This is based on Shaohua's x86 only patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Shaohua Li <shaohua.li@intel.com>
      19df0c2f
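      
      The effect of the new cacheline alignment can be pictured with a plain C
      analogy; the struct names and the 64-byte cacheline size are assumptions
      made up for this sketch:
      
        #define CACHELINE_BYTES 64   /* assumed L1 cacheline size */
        
        /* Read-mostly data grouped on its own cacheline... */
        struct readmostly_group {
                unsigned long config_flags;   /* read in hot paths, rarely written */
                const void *lookup_table;
        } __attribute__((aligned(CACHELINE_BYTES)));
        
        /* ...so frequently written data starts on a different line and remote
         * readers of the group above do not suffer cacheline bouncing. */
        struct hot_counters {
                unsigned long events;         /* updated constantly */
        } __attribute__((aligned(CACHELINE_BYTES)));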
  5. 05 Dec 2010, 1 commit
    • ARM: implement support for read-mostly sections · daf87416
      Authored by Russell King
      As our SMP implementation uses the MESI protocol, grouping together
      data which is mostly only read means that we avoid unnecessary cache
      line bouncing when this data shares a cache line with other data.
      
      In other words, cache lines associated with read-mostly data are
      expected to spend most of their time in shared state.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      daf87416
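      
      A hypothetical usage sketch of the __read_mostly annotation this section
      supports; the variables are invented, only the annotation and header are
      the kernel's:
      
        #include <linux/cache.h>
        
        /* Hypothetical driver state: set once at probe time, then only read
         * in hot paths, so it is grouped with other read-mostly data. */
        static unsigned int poll_interval_ms __read_mostly = 100;
        static unsigned long regs_phys_base __read_mostly;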
  6. 20 Nov 2010, 1 commit
  7. 28 Oct 2010, 1 commit
  8. 08 Oct 2010, 2 commits
  9. 05 Oct 2010, 1 commit
    • ARM: Allow SMP kernels to boot on UP systems · f00ec48f
      Authored by Russell King
      UP systems do not implement all the instructions that SMP systems have,
      so in order to boot an SMP kernel on a UP system we need to rewrite
      parts of the kernel.
      
      Do this using an 'alternatives' scheme, where the kernel code and data
      are modified prior to initialization to replace the SMP instructions,
      thereby rendering the problematic code ineffectual.  We use the linker
      to generate a list of 32-bit word locations and their replacement values,
      and run through these replacements when we detect a UP system.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      f00ec48f
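      
      A minimal sketch of the fixup idea in C; the struct layout, names and
      calling convention are assumptions for illustration, not ARM's actual
      format (the real table is emitted by the linker and the real code also
      performs cache maintenance):
      
        #include <stddef.h>
        #include <stdint.h>
        
        /* One record per SMP instruction that needs a UP replacement. */
        struct smp_up_fixup {
                uint32_t *insn_addr;   /* location of the 32-bit instruction word */
                uint32_t  up_insn;     /* replacement value for a UP system */
        };
        
        static void apply_up_fixups(const struct smp_up_fixup *tbl, size_t n,
                                    int running_on_smp)
        {
                size_t i;
        
                if (running_on_smp)
                        return;        /* SMP hardware: keep the SMP instructions */
        
                for (i = 0; i < n; i++)
                        *tbl[i].insn_addr = tbl[i].up_insn;
        }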
  10. 16 Feb 2010, 1 commit
  11. 15 Dec 2009, 1 commit
    • ARM: use unified discard definition in linker script · e3f28c13
      Authored by Alan Jenkins
      Commit 023bf6f1 "linker script: unify usage of discard definition"
      changed the linker scripts for all architectures except for ARM.
      I can find no discussion about this omission, so here are the changes
      for ARM.
      
      These changes are exactly parallel to the ia64 case.
      
      "ia64 is notable because it first throws away some ia64 specific
       subsections and then include the rest of the sections into the final
       image, so those sections must be discarded before the inclusion."
      
      Not boot-tested.  In build testing, the modified linker script generated
      an identical vmlinux file.
      
      [I would like to be able to rely on this unified discard definition.
       I want to sort the kernel symbol tables to allow faster symbol
       resolution during module loading. The simplest way appears to be
       to generate sorted versions from vmlinux.o, link them in to vmlinux,
       _and discard the original unsorted tables_.
      
       This work is driven by my x86 netbook, but it is implemented at a
       generic level. It is possible it will benefit some ARM systems also.]
      Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Acked-by-without-testing: Tejun Heo <tj@kernel.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      e3f28c13
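      
      The motivation in the bracketed note (sorted symbol tables allow binary
      search during module loading) can be sketched as follows; the struct and
      function names are invented and this is not the kernel's struct
      kernel_symbol:
      
        #include <stdlib.h>
        #include <string.h>
        
        struct sym {
                const char *name;
                unsigned long addr;
        };
        
        static int sym_cmp(const void *key, const void *elem)
        {
                return strcmp(key, ((const struct sym *)elem)->name);
        }
        
        /* With the export table sorted by name, resolving each undefined
         * symbol of a module costs O(log n) instead of a linear scan. */
        static const struct sym *sym_lookup(const struct sym *tbl, size_t n,
                                            const char *name)
        {
                return bsearch(name, tbl, n, sizeof(*tbl), sym_cmp);
        }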
  12. 24 Nov 2009, 2 commits
  13. 16 Sep 2009, 1 commit
  14. 25 Jun 2009, 1 commit
  15. 24 Jun 2009, 1 commit
    • linker script: throw away .discard section · 405d967d
      Authored by Tejun Heo
      x86 throws away the .discard section but no other archs do.  Also,
      .discard is not thrown away while linking modules.  Make every arch
      and every module link throw it away.  This will be used to define dummy
      variables for percpu declarations and definitions.
      
      This patch is based on Ivan Kokshaysky's alpha percpu patch.
      
      [ Impact: always throw away everything in .discard ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Bryan Wu <cooloney@kernel.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
      405d967d
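      
      A sketch of the dummy-variable pattern that a universally discarded
      .discard section enables; the macro name and usage are invented for
      illustration:
      
        /* Emit a real definition so that duplicate definitions of the same
         * name fail at link time, while the dummy itself is thrown away
         * together with the rest of the .discard section. */
        #define DEFINE_DISCARDED_DUMMY(name)                            \
                __attribute__((section(".discard"), unused))            \
                char __discard_dummy_##name
        
        DEFINE_DISCARDED_DUMMY(my_percpu_var);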
  16. 19 Jun 2009, 1 commit
  17. 30 May 2009, 1 commit
  18. 10 Mar 2009, 1 commit
    • linker script: define __per_cpu_load on all SMP capable archs · 19390c4d
      Authored by Tejun Heo
      Impact: __per_cpu_load available on all SMP capable archs
      
      Percpu now requires three symbols to be defined - __per_cpu_load,
      __per_cpu_start and __per_cpu_end.  There were three archs which
      didn't define all of them.  Update them as follows.
      
      * powerpc: can use generic PERCPU() macro.  Compile tested for
        powerpc32, compile/boot tested for powerpc64.
      
      * ia64: can use generic PERCPU_VADDR() macro.  __phys_per_cpu_start is
        identical to __per_cpu_load.  Compile tested and symbol table looks
        identical after the change except for the additional __per_cpu_load.
      
      * arm: added an explicit __per_cpu_load definition.  It currently uses
        a unified .init output section so it can't use the generic macro.  I
        don't know whether the unified .init output section is required by an
        arch peculiarity, so I left it alone.  Please break it up and use
        PERCPU() if possible.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Pat Gefre <pfg@sgi.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      19390c4d
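      
      A rough sketch of how the three symbols are typically consumed from C;
      the helper function is invented, only the symbol names come from the
      commit message:
      
        /* Provided by the linker script. */
        extern char __per_cpu_load[];    /* load address of the initial percpu image */
        extern char __per_cpu_start[];   /* start of the percpu range */
        extern char __per_cpu_end[];     /* end of the percpu range */
        
        /* Size of the statically defined percpu area, as generic percpu setup
         * code needs it (illustrative helper only). */
        static unsigned long static_percpu_size(void)
        {
                return (unsigned long)(__per_cpu_end - __per_cpu_start);
        }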
  19. 19 Feb 2009, 1 commit
  20. 17 Jan 2009, 1 commit
  21. 01 Dec 2008, 1 commit
  22. 29 Jan 2008, 1 commit
  23. 26 Jan 2008, 1 commit
  24. 20 Jul 2007, 1 commit
    • define new percpu interface for shared data · 5fb7dc37
      Authored by Fenghua Yu
      The per cpu data section contains two types of data: one set that is
      accessed exclusively by the local cpu, and another set that is per cpu
      but also shared by remote cpus.  In the current kernel these two sets
      are not clearly separated.  This can cause the same cacheline to be
      shared between the two sets of data, which results in unnecessary
      bouncing of the cacheline between cpus.
      
      One way to fix the problem is to cacheline-align the remotely accessed
      per cpu data, both at the beginning and at the end.  Because of the
      padding at both ends, this will likely cause some memory wastage, and
      the interface to achieve it is not clean.
      
      This patch:
      
      Moves the remotely accessed per cpu data (which is currently marked
      as ____cacheline_aligned_in_smp) into a different section, where all the
      data elements are cacheline aligned.  As such, it cleanly separates the
      local-only data from the remotely accessed data.
      Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Christoph Lameter <clameter@sgi.com>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5fb7dc37
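      
      A usage sketch of the cacheline-aligned "shared" percpu definition this
      interface provides (DEFINE_PER_CPU_SHARED_ALIGNED); the struct and
      variable names are hypothetical:
      
        #include <linux/percpu.h>
        
        /* Hypothetical example: per-cpu state that remote cpus also touch
         * (e.g. when balancing work) goes into the cacheline-aligned shared
         * subsection; purely local counters stay in the default section. */
        struct demo_rq {
                unsigned long nr_running;
                unsigned long cpu_load;
        };
        
        static DEFINE_PER_CPU_SHARED_ALIGNED(struct demo_rq, demo_runqueues);
        static DEFINE_PER_CPU(unsigned long, demo_local_hits);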
  25. 19 May 2007, 2 commits
  26. 08 May 2007, 2 commits
  27. 03 May 2007, 1 commit
  28. 22 Apr 2007, 1 commit
  29. 26 Feb 2007, 1 commit
    • [ARM] 4224/2: allow XIP kernel to boot again · e98ff7f6
      Authored by Nicolas Pitre
      Since commit 2552fc27 XIP kernels failed
      to boot because (_end - PAGE_OFFSET - 1) is much smaller than the size
      of the kernel text and data in the XIP case, causing the kernel not to
      be entirely mapped.
      
      Even in the non-XIP case, the use of (_end - PAGE_OFFSET - 1) is wrong
      because it produces too large a value if TEXT_OFFSET is larger than 1MB.
      
      Finally, the original code was performing one loop iteration too many.
      
      Let's break the loop when the section pointer has passed the last byte
      of the kernel instead.
      Signed-off-by: Nicolas Pitre <nico@cam.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      e98ff7f6
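      
      The corrected loop-termination idea can be sketched as follows; the
      1 MiB section size, function names and map_section callback are
      assumptions for illustration, not the actual code in the commit:
      
        #include <stdint.h>
        
        #define SECTION_SIZE (1UL << 20)   /* assumed 1 MiB section mappings */
        
        /* Map one section at a time and stop once the next section would start
         * past the last byte of the kernel, instead of deriving an iteration
         * count from (_end - PAGE_OFFSET - 1). */
        static void map_kernel_sections(uintptr_t kernel_start, uintptr_t kernel_end,
                                        void (*map_section)(uintptr_t va))
        {
                uintptr_t va = kernel_start & ~(SECTION_SIZE - 1);
        
                do {
                        map_section(va);
                        va += SECTION_SIZE;
                } while (va < kernel_end);  /* kernel_end is one past the last byte */
        }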
  30. 12 Feb 2007, 1 commit
  31. 28 Oct 2006, 1 commit
  32. 01 Jul 2006, 1 commit
  33. 29 Jun 2006, 1 commit
    • [ARM] nommu: uaccess tweaks · 9641c7cc
      Authored by Russell King
      MMUless systems have only one address space for all threads, so
      neither the usual access_ok() checks nor the exception handling make
      much sense.
      
      Hence, discard the fixup and exception tables at link time, use
      memcpy/memset for the user copy/clearing functions, and define
      the permission check macros to be constants.
      
      Some of this patch was derived from the equivalent patch by
      Hyok S. Choi.
      Signed-off-by: Hyok S. Choi <hyok.choi@samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      9641c7cc
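      
      A minimal sketch of what the commit describes (constant permission
      checks and plain memory operations for the user accessors); the names
      are simplified, not the kernel's exact definitions:
      
        #include <string.h>
        
        /* With a single address space there is nothing for access_ok() to
         * reject and no faults to fix up. */
        #define nommu_access_ok(addr, size)  1
        
        static inline unsigned long
        nommu_copy_to_user(void *to, const void *from, unsigned long n)
        {
                memcpy(to, from, n);
                return 0;   /* bytes not copied: always 0 without faults */
        }
        
        static inline unsigned long
        nommu_clear_user(void *to, unsigned long n)
        {
                memset(to, 0, n);
                return 0;
        }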
  34. 04 Jan 2006, 1 commit
  35. 18 Nov 2005, 1 commit