1. 01 Aug 2016: 1 commit
  2. 28 Jul 2016: 1 commit
  3. 27 Jul 2016: 1 commit
  4. 26 Jul 2016: 2 commits
  5. 23 Jul 2016: 1 commit
    • powerpc/numa: Convert to hotplug state machine · bdab88e0
      Committed by Sebastian Andrzej Siewior
      Install the callbacks via the state machine. On the boot cpu the callback is
      invoked manually because cpuhp is not up yet and everything must be
      preinitialized before additional CPUs are up.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
      Cc: Anton Blanchard <anton@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: rt@linutronix.de
      Link: http://lkml.kernel.org/r/20160718140727.GA13132@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  6. 21 Jul 2016: 6 commits
  7. 19 Jul 2016: 1 commit
  8. 17 Jul 2016: 8 commits
  9. 09 Jul 2016: 4 commits
    • powerpc/8xx: add CONFIG_PIN_TLB_IMMR · 62f64b49
      Committed by Christophe Leroy
      CONFIG_PIN_TLB maps the IMMR area and the first 24 Mbytes of memory.
      In some circumstances it can be preferable not to map the IMMR
      and to map 32 Mbytes of memory instead.
      
      Therefore, add the config option CONFIG_PIN_TLB_IMMR to select whether
      the IMMR shall be pinned, and hence whether 24 or 32 Mbytes of RAM are pinned.
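      A Kconfig option of this shape might look roughly as follows; the prompt and help text are illustrative guesses, not the actual arch/powerpc/Kconfig entry:

```
config PIN_TLB_IMMR
	bool "Pinned TLB for IMMR"
	depends on PIN_TLB
	help
	  Pin a TLB entry on the IMMR area, instead of extending the
	  pinned RAM mapping from 24 to 32 Mbytes.
```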
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
    • powerpc/8xx: Rework CONFIG_PIN_TLB handling · 4ad27450
      Committed by Christophe Leroy
      On recent kernels, with some debug options like for instance
      CONFIG_LOCKDEP, the BSS requires more than 8M of memory, although
      the kernel code fits in the first 8M.
      Today, it is necessary to activate CONFIG_PIN_TLB to get more than 8M
      at startup, although pinning the TLB is not necessary for that.
      
      We could have unconditionally mapped 16 or 24 Mbytes at startup,
      but some old hardware only has 8M, and mapping non-existent RAM
      would be an issue due to speculative accesses.
      
      With the preceding patch, however, the TLB entries are populated on
      demand. By setting up the TLB miss handler to handle up to 24M until
      the handler is patched for the entire memory space, it is possible
      to allow access to more memory without mapping non-existent RAM.
      
      It is therefore no longer necessary to map memory at startup at all;
      it will be handled by the TLB miss handler.
      
      One might still want to pin the IMMR and the first 24M of RAM.
      It is now possible to do this in the C memory initialisation
      functions. In addition, we now know how much memory we have
      when we do it, so we are able to adapt the pinning to the
      real amount of memory available. Boards with less than 24M
      can therefore also benefit from PIN_TLB.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
    • powerpc/8xx: Don't use page table for linear memory space · bb7f3808
      Committed by Christophe Leroy
      Instead of using the first level page table to define mappings for
      the linear memory space, we can use direct mapping from the TLB
      handling routines. This has several advantages:
      * No need to read the tables at each TLB miss
      * No issue in 16k pages mode where the 1st level table maps 64 Mbytes
      
      The size of the available linear space is known at system startup.
      In order to avoid a data access at each TLB miss to find out the
      memory size, the TLB routine is patched at startup with the proper size.
      
      This patch provides a 10%-15% improvement in TLB miss handling for
      kernel addresses.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
    • powerpc/8xx: Map IMMR area with 512k page at a fixed address · 4badd43a
      Committed by Christophe Leroy
      Once the linear memory space has been mapped with 8Mb pages, as
      seen in the related commit, we get 11 million DTLB misses during
      the reference 600s period. 77% of the misses are on user addresses
      and 23% are on kernel addresses (one quarter in the linear address
      space and three quarters in the virtual address space).
      
      Traditionally, each driver manages one computer board, which has its
      own components with its own memory maps.
      But on embedded chips like the MPC8xx, the SoC has all registers
      located in the same IO area.
      
      When looking at the ioremaps done during startup, we see that
      many drivers re-map small parts of the IMMR for their own use,
      and all those small pieces get their own 4k page, amplifying the
      number of TLB misses: on our system we get 0xff000000 mapped 31 times
      and 0xff003000 mapped 9 times.
      
      Even if each part of the IMMR were mapped only once with 4k pages, it
      would still be several small mappings alongside the linear area.
      
      This patch maps the IMMR with a single 512k page.
      
      With this patch applied, the number of DTLB misses during the 10 min
      period is reduced to 11.8 million, for a duration of 5.8s, which
      represents 2% of the non-idle time, hence yet another 10% reduction.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Scott Wood <oss@buserror.net>
  10. 07 Jul 2016: 2 commits
  11. 05 Jul 2016: 1 commit
  12. 30 Jun 2016: 1 commit
  13. 25 Jun 2016: 2 commits
    • powerpc: get rid of superfluous __GFP_REPEAT · 2379a23e
      Committed by Michal Hocko
      __GFP_REPEAT has a rather weak semantic but since it has been introduced
      around 2.6.12 it has been ignored for low order allocations.
      
      {pud,pmd}_alloc_one allocate from the {PGT,PUD}_CACHE initialized in
      pgtable_cache_init, which has no object larger than sizeof(void *) << 12
      in size, and that fits within the !costly allocation request size.
      
      PGALLOC_GFP is used only in radix__pgd_alloc which uses either order-0
      or order-4 requests.  The first one doesn't need the flag while the
      second does.  Drop __GFP_REPEAT from PGALLOC_GFP and add it for the
      order-4 one.
      
      This means that this flag has never been actually useful here because it
      has always been used only for !PAGE_ALLOC_COSTLY requests.
      
      Link: http://lkml.kernel.org/r/1464599699-30131-12-git-send-email-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • tree wide: get rid of __GFP_REPEAT for order-0 allocations part I · 32d6bd90
      Committed by Michal Hocko
      This is the third version of the patchset previously sent [1].  I have
      basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
      rid of superfluous gfp flags" which went through dm tree.  I am sending
      it now because it is tree wide and chances for conflicts are reduced
      considerably when we want to target rc2.  I plan to send the next step
      and rename the flag and move to a better semantic later during this
      release cycle so we will have a new semantic ready for 4.8 merge window
      hopefully.
      
      Motivation:
      
      While working on something unrelated I've checked the current usage of
      __GFP_REPEAT in the tree.  It seems that a majority of the usage is and
      always has been bogus because __GFP_REPEAT has always been about costly
      high order allocations while we are using it for order-0 or very small
      orders very often.  It seems that a big pile of them are just
      copy&paste from when code was adapted from one arch to another.
      
      I think it makes some sense to get rid of them because they are just
      making the semantic more unclear.  Please note that __GFP_REPEAT is
      documented as
      
       * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt
       * _might_ fail.  This depends upon the particular VM implementation.
      
      while !costly requests have basically nofail semantic.  So one could
      reasonably expect that an order-0 request with __GFP_REPEAT will not loop
      for ever.  This is not implemented right now though.
      
      I would like to move on with __GFP_REPEAT and define a better semantic
      for it.
      
        $ git grep __GFP_REPEAT origin/master | wc -l
        111
        $ git grep __GFP_REPEAT | wc -l
        36
      
      So we are down to about a third after this patch series.  The remaining
      places really seem to be relying on __GFP_REPEAT due to large allocation
      requests.  This still needs some double checking, which I will do later
      after all the simple ones are sorted out.
      
      I am touching a lot of arch specific code here and I hope I got it right
      but as a matter of fact I even didn't compile test for some archs as I
      do not have cross compiler for them.  Patches should be quite trivial to
      review for stupid compile mistakes though.  The tricky parts are usually
      hidden by macro definitions, and that's where I would appreciate help from
      arch maintainers.
      
      [1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org
      
      This patch (of 19):
      
      __GFP_REPEAT has a rather weak semantic but since it has been introduced
      around 2.6.12 it has been ignored for low order allocations.  Yet we
      have the full kernel tree with its usage for apparently order-0
      allocations.  This is really confusing because __GFP_REPEAT is
      explicitly documented to allow allocation failures which is a weaker
      semantic than the current order-0 has (basically nofail).
      
      Let's simply drop __GFP_REPEAT from those places.  This will make it
      possible to identify the places which really need the allocator to retry
      harder, and to formulate a more specific semantic for what the flag is
      actually supposed to do.
      
      Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Chris Metcalf <cmetcalf@mellanox.com> [for tile]
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: John Crispin <blogic@openwrt.org>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 17 Jun 2016: 2 commits
  15. 16 Jun 2016: 1 commit
    • powerpc/mm: Ensure "special" zones are empty · 3079abe5
      Committed by Oliver O'Halloran
      The mm zone mechanism was traditionally used by arch specific code to
      partition memory into allocation zones. However, there are several zones
      that are managed by the mm subsystem rather than the architecture. Most
      architectures set the max PFN of these special zones to zero; on powerpc,
      however, we set them to ~0ul. This, in conjunction with a bug in
      free_area_init_nodes(), results in all of system memory being placed in
      ZONE_DEVICE when that zone is enabled. Device memory cannot be used for
      regular kernel memory allocations, so this will cause a kernel panic at
      boot. Given the planned addition of more mm managed zones (ZONE_CMA), we
      should aim to be consistent with every other architecture and set the max
      PFN for these zones to zero.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  16. 14 Jun 2016: 4 commits
  17. 10 Jun 2016: 2 commits