1. 13 Dec, 2012 40 commits
    • mm, memcg: avoid unnecessary function call when memcg is disabled · 68ae564b
      David Rientjes committed
      While profiling numa/core v16 with cgroup_disable=memory on the command
      line, I noticed mem_cgroup_count_vm_event() still showed up as high as
      0.60% in perf top.

      This occurs because the function is called extremely often even when memcg
      is disabled.

      To fix this, inline the check for mem_cgroup_disabled() so we avoid the
      unnecessary function call if memcg is disabled.

      Signed-off-by: David Rientjes <rientjes@google.com>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: Glauber Costa <glommer@parallels.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
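The inlining idea in the commit above can be sketched in user-space C. This is only an illustration under assumptions: `memcg_disabled`, `count_vm_event_slow()` and `count_vm_event()` are hypothetical stand-ins, not the kernel's actual symbols. The point is that the cheap "is it disabled?" test sits in a `static inline` wrapper, so callers pay for a real function call only when the feature is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; names are illustrative only. */
static bool memcg_disabled = true;     /* models cgroup_disable=memory */
static unsigned long slow_path_calls;  /* counts out-of-line calls */

/* Out-of-line slow path: in a real code base this would live in a .c file. */
void count_vm_event_slow(int event)
{
    assert(event >= 0);
    slow_path_calls++;
}

/*
 * Inline wrapper: the check is evaluated at the call site, so the
 * function call is skipped entirely when the feature is disabled.
 */
static inline void count_vm_event(int event)
{
    if (memcg_disabled)
        return;
    count_vm_event_slow(event);
}
```

With `memcg_disabled` set, calling `count_vm_event()` leaves `slow_path_calls` at zero; clearing the flag makes each call reach the slow path.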
    • mm: add a reminder comment for __GFP_BITS_SHIFT · 05b0afd7
      Andrew Morton committed
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: WARN_ON_ONCE if f_op->mmap() change vma's start address · 2897b4d2
      Joonsoo Kim committed
      While reviewing the source code, I found a comment which mentions that
      after f_op->mmap(), the vma's start address can be changed.  I didn't
      verify that this is really possible, because there are so many
      f_op->mmap() implementations.  But if some mmap() does change the vma's
      start address, that is a possible error situation, because we have already
      prepared prev vma, rb_link and rb_parent, and these are related to the
      original address.

      So add WARN_ON_ONCE to find out whether this situation really happens.

      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
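A rough user-space model of the check described above, assuming nothing about the real kernel macros: `warn_once()`, `struct vma_sketch`, `bad_mmap()` and `call_mmap_checked()` are hypothetical names. The caller remembers the start address, invokes the mmap hook, and warns a single time if the hook moved the mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings_emitted;  /* illustrative counter, not a kernel facility */

/* Reports the first time cond is true for a given 'warned' flag. */
static bool warn_once(bool cond, bool *warned, const char *what)
{
    if (cond && !*warned) {
        *warned = true;
        warnings_emitted++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

struct vma_sketch {
    unsigned long vm_start;
};

/* Models a driver mmap hook that moves the mapping. */
static int bad_mmap(struct vma_sketch *vma)
{
    vma->vm_start += 4096;
    return 0;
}

/* Calls the hook and checks that the start address is unchanged. */
static int call_mmap_checked(struct vma_sketch *vma,
                             int (*mmap_fn)(struct vma_sketch *))
{
    static bool warned;
    unsigned long addr = vma->vm_start;
    int err;

    assert(mmap_fn != NULL);
    err = mmap_fn(vma);
    warn_once(addr != vma->vm_start, &warned, "mmap() changed vm_start");
    return err;
}
```

Calling `call_mmap_checked()` repeatedly with `bad_mmap` prints the warning only on the first call, which is the "once" behaviour the commit relies on to avoid flooding the log.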
    • res_counter: delete res_counter_write() · 44e33e8f
      Greg Thelen committed
      Since commit 628f4235 ("memcg: limit change shrink usage") both
      res_counter_write() and write_strategy_fn have been unused.  This patch
      deletes them both.

      Signed-off-by: Greg Thelen <gthelen@google.com>
      Cc: Glauber Costa <glommer@parallels.com>
      Cc: Tejun Heo <tj@kernel.org>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Frederic Weisbecker <fweisbec@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hotplug: update nodemasks management · 6715ddf9
      Lai Jiangshan committed
      Update nodemasks management for N_MEMORY.

      [lliubbo@gmail.com: fix build]
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization · 4b0ef1fe
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Since we introduced N_MEMORY, we update the initialization of node_states.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Lin Feng <linfeng@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vmscan: use N_MEMORY instead N_HIGH_MEMORY · 48fb2e24
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • init: use N_MEMORY instead N_HIGH_MEMORY · 3c466d46
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kthread: use N_MEMORY instead N_HIGH_MEMORY · aee4faa4
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • vmstat: use N_MEMORY instead N_HIGH_MEMORY · a47b53c5
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb: use N_MEMORY instead N_HIGH_MEMORY · 8cebfcd0
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mempolicy: use N_MEMORY instead N_HIGH_MEMORY · 01f13bd6
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm,migrate: use N_MEMORY instead N_HIGH_MEMORY · 389162c2
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • oom: use N_MEMORY instead N_HIGH_MEMORY · bd3a66c1
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memcontrol: use N_MEMORY instead N_HIGH_MEMORY · 31aaea4a
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • procfs: use N_MEMORY instead N_HIGH_MEMORY · 4ff1b2c2
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cpuset: use N_MEMORY instead N_HIGH_MEMORY · 38d7bee9
      Lai Jiangshan committed
      N_HIGH_MEMORY stands for the nodes that have normal or high memory.
      N_MEMORY stands for the nodes that have any memory.

      The code here needs to handle the nodes which have memory, so we should
      use N_MEMORY instead.

      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: node_states: introduce N_MEMORY · 8219fc48
      Lai Jiangshan committed
      We have N_NORMAL_MEMORY to stand for the nodes that have normal memory
      with zone_type <= ZONE_NORMAL.

      And we have N_HIGH_MEMORY to stand for the nodes that have normal or
      high memory.

      But we don't have any word to stand for the nodes that have *any* memory.

      And we have N_CPU but no N_MEMORY.

      Current code reuses N_HIGH_MEMORY for this purpose because any node
      which has memory must currently have high memory or normal memory.

      A)	But this reuse is bad for *readability*, because the name
      	N_HIGH_MEMORY just stands for high or normal:

      A.example 1)
      	mem_cgroup_nr_lru_pages():
      		for_each_node_state(nid, N_HIGH_MEMORY)

      	The reader will be confused (why does this function count only high or
      	normal memory nodes? does it count ZONE_MOVABLE's lru pages?)
      	until someone tells them that N_HIGH_MEMORY is reused to stand for
      	nodes that have any memory.

      A.cont) If we introduce N_MEMORY, we can reduce this confusion
      	AND make the code clearer:

      A.example 2) mm/page_cgroup.c uses N_HIGH_MEMORY twice:

      	One is in page_cgroup_init(void):
      		for_each_node_state(nid, N_HIGH_MEMORY) {

      	It means that if the node has memory, we will allocate a page_cgroup
      	map for the node. We should use N_MEMORY here instead to gain clarity.

      	The second use is in alloc_page_cgroup():
      		if (node_state(nid, N_HIGH_MEMORY))
      			addr = vzalloc_node(size, nid);

      	It means the node has high or normal memory that can be allocated
      	by the kernel. We should keep N_HIGH_MEMORY here, and it will be better
      	if the "any memory" semantic of N_HIGH_MEMORY is removed.

      B)	This reuse is outdated if we introduce MOVABLE-dedicated nodes.
      	A MOVABLE-dedicated node should not appear in
      	node_states[N_HIGH_MEMORY] nor node_states[N_NORMAL_MEMORY],
      	because a MOVABLE-dedicated node has no high or normal memory.

      	On x86_64, N_HIGH_MEMORY=N_NORMAL_MEMORY, so if a MOVABLE-dedicated
      	node is in node_states[N_HIGH_MEMORY], it is also in
      	node_states[N_NORMAL_MEMORY], which breaks SLUB.

      	SLUB uses
      		for_each_node_state(nid, N_NORMAL_MEMORY)
      	and creates a kmem_cache_node for the MOVABLE-dedicated node, which
      	causes problems.

      In a word, we need N_MEMORY.  We just introduce it as an alias of
      N_HIGH_MEMORY and fix all improper usages of N_HIGH_MEMORY in later
      patches.
      Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Acked-by: Christoph Lameter <cl@linux.com>
      Acked-by: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
      Cc: Lin Feng <linfeng@cn.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
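The alias described in the commit above can be modelled with a small user-space sketch. It is a simplified assumption, not the kernel header: the enum leaves out any config-dependent cases, node masks are plain 64-bit integers, and the `sketch_*` helpers are hypothetical names. It shows that with `N_MEMORY` defined as an alias, both names index the same mask, so callers can switch to the clearer name without a behaviour change.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_NODES 64

/* Simplified model of the node state list; N_MEMORY is only an alias. */
enum node_states_sketch {
    N_POSSIBLE,
    N_ONLINE,
    N_NORMAL_MEMORY,
    N_HIGH_MEMORY,
    N_MEMORY = N_HIGH_MEMORY,  /* "has any memory", same slot for now */
    N_CPU,
    NR_NODE_STATES
};

/* One bitmask of node ids per state. */
static unsigned long long sketch_node_states[NR_NODE_STATES];

static void sketch_node_set_state(int nid, enum node_states_sketch state)
{
    assert(nid >= 0 && nid < SKETCH_MAX_NODES);
    sketch_node_states[state] |= 1ULL << nid;
}

static bool sketch_node_state(int nid, enum node_states_sketch state)
{
    assert(nid >= 0 && nid < SKETCH_MAX_NODES);
    return (sketch_node_states[state] >> nid) & 1ULL;
}

/* Counts the nodes that have the given state set. */
static int sketch_num_node_state(enum node_states_sketch state)
{
    int nid, count = 0;

    for (nid = 0; nid < SKETCH_MAX_NODES; nid++)
        if (sketch_node_state(nid, state))
            count++;
    return count;
}
```

Marking a node with `N_MEMORY` here also makes it visible through `N_HIGH_MEMORY`, which is exactly the aliasing that later patches would have to break up to support MOVABLE-dedicated nodes.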
    • mm: use migrate_prep() instead of migrate_prep_local() · be49a6e1
      Marek Szyprowski committed
      __alloc_contig_migrate_range() should use all possible ways to get all the
      pages migrated from the given memory range, so pruning per-cpu lru lists
      for all CPUs is required, regardless of the cost of such an operation.
      Otherwise some pages which got stuck on a per-cpu lru list might get
      missed by the migration procedure, causing the contiguous allocation to
      fail.

      Reported-by: SeongHwan Yoon <sunghwan.yun@samsung.com>
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Acked-by: Michal Nazarewicz <mina86@mina86.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: compaction: Fix compiler warning · c8bf2d8b
      Thierry Reding committed
      compact_capture_page() is only used if compaction is enabled so it should
      be moved into the corresponding #ifdef.

      Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: avoid race on multiple parallel page faults to the same page · 3ea41e62
      Kirill A. Shutemov committed
      The pmd value is stable only with mm->page_table_lock taken. After taking
      the lock we need to check that nobody modified the pmd before changing it.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Reviewed-by: Bob Liu <lliubbo@gmail.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: introduce sysfs knob to disable huge zero page · 79da5407
      Kirill A. Shutemov committed
      By default the kernel tries to use the huge zero page on a read page
      fault.  It's possible to disable the huge zero page by writing 0, or to
      enable it again by writing 1:

      echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
      echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp, vmstat: implement HZP_ALLOC and HZP_ALLOC_FAILED events · d8a8e1f0
      Kirill A. Shutemov committed
      hzp_alloc is incremented every time a huge zero page is successfully
      	allocated. It includes allocations which were dropped due to a
      	race with another allocation. Note, it doesn't count every map
      	of the huge zero page, only its allocation.

      hzp_alloc_failed is incremented if the kernel fails to allocate a huge
      	zero page and falls back to using small pages.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: implement refcounting for huge zero page · 97ae1749
      Kirill A. Shutemov committed
      H. Peter Anvin doesn't like a huge zero page which sticks in memory
      forever after the first allocation.  Here's an implementation of lockless
      refcounting for the huge zero page.

      We have two basic primitives: {get,put}_huge_zero_page(). They
      manipulate the reference counter.

      If the counter is 0, get_huge_zero_page() allocates a new huge page and
      takes two references: one for the caller and one for the shrinker.  We
      free the page only in the shrinker callback if the counter is 1 (only the
      shrinker has the reference).

      put_huge_zero_page() only decrements the counter.  The counter is never
      zero in put_huge_zero_page() since the shrinker holds a reference.

      Freeing the huge zero page in the shrinker callback helps to avoid
      frequent allocate-free cycles.

      Refcounting has a cost.  On a 4 socket machine I observe a ~1% slowdown on
      parallel (40 processes) read page faulting compared to lazy huge page
      allocation.  I think it's pretty reasonable for a synthetic benchmark.

      [lliubbo@gmail.com: fix mismerge]
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Bob Liu <lliubbo@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
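The refcounting scheme described above can be approximated in user space with C11 atomics. This is a sketch under assumptions, not the kernel code: `get_hzp()`, `put_hzp()` and `shrink_hzp()` are hypothetical names, `calloc()` stands in for the huge page allocator, and there is no real shrinker, only a function that plays its role.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define HZP_SIZE (2UL * 1024 * 1024)  /* 2M, the x86-64 huge page size */

static atomic_int hzp_refcount;       /* 0 means "no page allocated" */
static _Atomic(char *) hzp_page;      /* the shared zero-filled page */

/* Increments *v unless it is zero; returns whether it incremented. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Takes a reference, allocating the page on first use. */
char *get_hzp(void)
{
    for (;;) {
        char *page, *expected = NULL;

        if (inc_not_zero(&hzp_refcount))
            return atomic_load(&hzp_page);

        page = calloc(1, HZP_SIZE);
        if (!page)
            return NULL;

        if (!atomic_compare_exchange_strong(&hzp_page, &expected, page)) {
            free(page);  /* lost the race to another allocator; retry */
            continue;
        }
        /* Two references: one for the caller, one for the shrinker. */
        atomic_store(&hzp_refcount, 2);
        return page;
    }
}

/* Drops a caller reference; the shrinker's reference keeps it above 0. */
void put_hzp(void)
{
    int old = atomic_fetch_sub(&hzp_refcount, 1);

    assert(old > 1);
}

/* Shrinker role: frees the page only if no caller holds a reference. */
bool shrink_hzp(void)
{
    int expected = 1;

    if (!atomic_compare_exchange_strong(&hzp_refcount, &expected, 0))
        return false;
    free(atomic_exchange(&hzp_page, (char *)NULL));
    return true;
}
```

Keeping one reference for the shrinker means `put_hzp()` never has to free anything, so the fast path stays a single atomic decrement and the page survives short idle periods.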
    • thp: lazy huge zero page allocation · 78ca0e67
      Kirill A. Shutemov committed
      Instead of allocating the huge zero page in hugepage_init() we can
      postpone it until the first huge zero page map. It saves memory if THP is
      not in use.

      cmpxchg() is used to avoid a race on huge_zero_pfn initialization.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: setup huge zero page on non-write page fault · 80371957
      Kirill A. Shutemov committed
      All code paths seem to be covered. Now we can map the huge zero page on a
      read page fault.

      We set it up in do_huge_pmd_anonymous_page() if the area around the fault
      address is suitable for THP and we've got a read page fault.

      If we fail to set up the huge zero page (ENOMEM) we fall back to
      handle_pte_fault() as we normally do in THP.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: implement splitting pmd for huge zero page · c5a647d0
      Kirill A. Shutemov committed
      We can't split the huge zero page itself (and it's a bug if we try), but
      we can split the pmd which points to it.

      On splitting the pmd we create a table with all ptes set to the normal
      zero page.

      [akpm@linux-foundation.org: fix build error]
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: change split_huge_page_pmd() interface · e180377f
      Kirill A. Shutemov committed
      Pass vma instead of mm and add an address parameter.

      In most cases we already have the vma on the stack. We provide
      split_huge_page_pmd_mm() for the few cases when we have mm, but not vma.

      This change is preparation for the huge zero pmd splitting implementation.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: change_huge_pmd(): make sure we don't try to make a page writable · cad7f613
      Kirill A. Shutemov committed
      The mprotect core never tries to make a page writable using
      change_huge_pmd().  Let's add an assert that the assumption is true.  It's
      important to be sure we will not make the huge zero page writable.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: do_huge_pmd_wp_page(): handle huge zero page · 93b4796d
      Kirill A. Shutemov committed
      On a write access to the huge zero page we allocate a new huge page and
      clear it.

      If that fails with ENOMEM, we fall back gracefully: we create a new pmd
      table and set the pte around the fault address to a newly allocated normal
      (4k) page.  All other ptes in the pmd are set to the normal zero page.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
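The ENOMEM fallback in the commit above can be pictured with a toy page table. Everything here is an assumption made for illustration: `struct pte_sketch`, `fault_pte_index()` and `wp_fallback_fill()` are hypothetical, the sizes are the x86-64 values (4k pages, 2M huge pages), and `calloc()` stands in for page allocation. Only the entry covering the faulting address gets a private writable page; every other entry shares the small zero page read-only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_HPAGE_SIZE  (2UL * 1024 * 1024)
#define SKETCH_PTES        (SKETCH_HPAGE_SIZE / SKETCH_PAGE_SIZE)  /* 512 */

/* Stand-in for the normal (4k) zero page shared by all mappings. */
static const char small_zero_page[SKETCH_PAGE_SIZE];

struct pte_sketch {
    const char *page;
    bool writable;
};

/* Index of the 4k entry that covers addr inside its 2M region. */
static unsigned long fault_pte_index(unsigned long addr)
{
    return (addr & (SKETCH_HPAGE_SIZE - 1)) / SKETCH_PAGE_SIZE;
}

/*
 * Fills a table of SKETCH_PTES entries for the fallback case.
 * Returns false, leaving the table untouched, if allocation fails.
 */
static bool wp_fallback_fill(struct pte_sketch *table, unsigned long addr)
{
    unsigned long i, hit = fault_pte_index(addr);
    char *page = calloc(1, SKETCH_PAGE_SIZE);

    if (!page)
        return false;

    for (i = 0; i < SKETCH_PTES; i++) {
        table[i].page = small_zero_page;  /* shared, read-only */
        table[i].writable = false;
    }
    table[hit].page = page;               /* private zeroed copy */
    table[hit].writable = true;
    return true;
}
```

The fallback costs one 4k allocation instead of a 2M one, which is why it can still succeed when the huge page allocation has just failed.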
    • thp: copy_huge_pmd(): copy huge zero page · fc9fe822
      Kirill A. Shutemov committed
      It's easy to copy the huge zero page. Just set the destination pmd to the
      huge zero page.

      It's safe to copy the huge zero page since we have none yet :-p

      [rientjes@google.com: fix comment]
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: zap_huge_pmd(): zap huge zero pmd · 479f0abb
      Kirill A. Shutemov committed
      We don't have a mapped page to zap in the huge zero page case.  Let's just
      clear the pmd and remove it from the tlb.

      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: huge zero page: basic preparation · 4a6c1297
      Kirill A. Shutemov committed
      During testing I noticed a big (up to 2.5 times) memory consumption
      overhead on some workloads (e.g.  ft.A from NPB) if THP is enabled.

      The main reason for that big difference is the lack of a zero page in the
      THP case.  We have to allocate a real page on a read page fault.
      
      A program to demonstrate the issue:
      #include <assert.h>
      #include <stdlib.h>
      #include <unistd.h>
      
      #define MB 1024*1024
      
      int main(int argc, char **argv)
      {
              char *p;
              int i;
      
              posix_memalign((void **)&p, 2 * MB, 200 * MB);
              for (i = 0; i < 200 * MB; i+= 4096)
                      assert(p[i] == 0);
              pause();
              return 0;
      }
      
      With thp-never RSS is about 400k, but with thp-always it's 200M.  After
      the patchset thp-always RSS is 400k too.

      Design overview.

      Huge zero page (hzp) is a non-movable huge page (2M on x86-64) filled with
      zeros.  The way we allocate it changes over the course of the patchset:
      
      - [01/10] simplest way: hzp allocated on boot time in hugepage_init();
      - [09/10] lazy allocation on first use;
      - [10/10] lockless refcounting + shrinker-reclaimable hzp;
      
      We set it up in do_huge_pmd_anonymous_page() if the area around the fault
      address is suitable for THP and we've got a read page fault.  If we fail
      to set up hzp (ENOMEM) we fall back to handle_pte_fault() as we normally
      do in THP.

      On a wp fault to hzp we allocate real memory for the huge page and clear
      it.  If that fails with ENOMEM, we fall back gracefully: we create a new
      pmd table and set the pte around the fault address to a newly allocated
      normal (4k) page.  All other ptes in the pmd are set to the normal zero
      page.

      We cannot split hzp (and it's a bug if we try), but we can split the pmd
      which points to it.  On splitting the pmd we create a table with all ptes
      set to the normal zero page.
      
      ===
      
      At hpa's request I've tried an alternative approach for the hzp
      implementation (see the Virtual huge zero page patchset): a pmd table with
      all entries set to the zero page.  This way should be more cache friendly,
      but it increases TLB pressure.

      The problem with the virtual huge zero page: it requires per-arch
      enabling.  We need a way to mark that a pmd table has all ptes set to the
      zero page.
      
      Some numbers to compare two implementations (on 4s Westmere-EX):
      
      Microbenchmark1
      ==============
      
      test:
              posix_memalign((void **)&p, 2 * MB, 8 * GB);
              for (i = 0; i < 100; i++) {
                      assert(memcmp(p, p + 4*GB, 4*GB) == 0);
                      asm volatile ("": : :"memory");
              }
      
      hzp:
       Performance counter stats for './test_memcmp' (5 runs):
      
            32356.272845 task-clock                #    0.998 CPUs utilized            ( +-  0.13% )
                      40 context-switches          #    0.001 K/sec                    ( +-  0.94% )
                       0 CPU-migrations            #    0.000 K/sec
                   4,218 page-faults               #    0.130 K/sec                    ( +-  0.00% )
          76,712,481,765 cycles                    #    2.371 GHz                      ( +-  0.13% ) [83.31%]
          36,279,577,636 stalled-cycles-frontend   #   47.29% frontend cycles idle     ( +-  0.28% ) [83.35%]
           1,684,049,110 stalled-cycles-backend    #    2.20% backend  cycles idle     ( +-  2.96% ) [66.67%]
         134,355,715,816 instructions              #    1.75  insns per cycle
                                                   #    0.27  stalled cycles per insn  ( +-  0.10% ) [83.35%]
          13,526,169,702 branches                  #  418.039 M/sec                    ( +-  0.10% ) [83.31%]
               1,058,230 branch-misses             #    0.01% of all branches          ( +-  0.91% ) [83.36%]
      
            32.413866442 seconds time elapsed                                          ( +-  0.13% )
      
      vhzp:
       Performance counter stats for './test_memcmp' (5 runs):
      
            30327.183829 task-clock                #    0.998 CPUs utilized            ( +-  0.13% )
                      38 context-switches          #    0.001 K/sec                    ( +-  1.53% )
                       0 CPU-migrations            #    0.000 K/sec
                   4,218 page-faults               #    0.139 K/sec                    ( +-  0.01% )
          71,964,773,660 cycles                    #    2.373 GHz                      ( +-  0.13% ) [83.35%]
          31,191,284,231 stalled-cycles-frontend   #   43.34% frontend cycles idle     ( +-  0.40% ) [83.32%]
             773,484,474 stalled-cycles-backend    #    1.07% backend  cycles idle     ( +-  6.61% ) [66.67%]
         134,982,215,437 instructions              #    1.88  insns per cycle
                                                   #    0.23  stalled cycles per insn  ( +-  0.11% ) [83.32%]
          13,509,150,683 branches                  #  445.447 M/sec                    ( +-  0.11% ) [83.34%]
               1,017,667 branch-misses             #    0.01% of all branches          ( +-  1.07% ) [83.32%]
      
            30.381324695 seconds time elapsed                                          ( +-  0.13% )
      
      Microbenchmark2
      ==============
      
      test:
              posix_memalign((void **)&p, 2 * MB, 8 * GB);
              for (i = 0; i < 1000; i++) {
                      char *_p = p;
                      while (_p < p+4*GB) {
                              assert(*_p == *(_p+4*GB));
                              _p += 4096;
                              asm volatile ("": : :"memory");
                      }
              }
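
For readers who want to try the access pattern of the fragment above, here is a scaled-down, self-contained sketch.  It makes several assumptions that are not in the original test: the function name stride_compare() is hypothetical, the buffer comes from an anonymous mmap() instead of posix_memalign() (so the zero fill is guaranteed, but 2M alignment is not), and the caller picks a size far below 8 GB.  It is not meant to reproduce the numbers below.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Read one byte per 4k page from the lower half of a never-written
 * anonymous mapping and compare it with the byte at the same offset in
 * the upper half.  Returns the number of mismatches, or -1 if the
 * mapping could not be created.
 */
static long stride_compare(size_t half, int iterations)
{
	long mismatches = 0;
	char *p = mmap(NULL, 2 * half, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	for (int i = 0; i < iterations; i++) {
		for (size_t off = 0; off < half; off += 4096) {
			if (p[off] != p[off + half])
				mismatches++;
			/* Compiler barrier, as in the original test. */
			__asm__ volatile ("" : : : "memory");
		}
	}

	munmap(p, 2 * half);
	return mismatches;
}
```

A call such as stride_compare(4u << 20, 10) touches 8 MB in total and should find no mismatches, because the mapping is only ever read and therefore reads as zeros throughout.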
      
      hzp:
       Performance counter stats for 'taskset -c 0 ./test_memcmp2' (5 runs):
      
             3505.727639 task-clock                #    0.998 CPUs utilized            ( +-  0.26% )
                       9 context-switches          #    0.003 K/sec                    ( +-  4.97% )
                   4,384 page-faults               #    0.001 M/sec                    ( +-  0.00% )
           8,318,482,466 cycles                    #    2.373 GHz                      ( +-  0.26% ) [33.31%]
           5,134,318,786 stalled-cycles-frontend   #   61.72% frontend cycles idle     ( +-  0.42% ) [33.32%]
           2,193,266,208 stalled-cycles-backend    #   26.37% backend  cycles idle     ( +-  5.51% ) [33.33%]
           9,494,670,537 instructions              #    1.14  insns per cycle
                                                   #    0.54  stalled cycles per insn  ( +-  0.13% ) [41.68%]
           2,108,522,738 branches                  #  601.451 M/sec                    ( +-  0.09% ) [41.68%]
                 158,746 branch-misses             #    0.01% of all branches          ( +-  1.60% ) [41.71%]
           3,168,102,115 L1-dcache-loads           #  903.693 M/sec                    ( +-  0.11% ) [41.70%]
           1,048,710,998 L1-dcache-misses          #   33.10% of all L1-dcache hits    ( +-  0.11% ) [41.72%]
           1,047,699,685 LLC-load                  #  298.854 M/sec                    ( +-  0.03% ) [33.38%]
                   2,287 LLC-misses                #    0.00% of all LL-cache hits     ( +-  8.27% ) [33.37%]
           3,166,187,367 dTLB-loads                #  903.147 M/sec                    ( +-  0.02% ) [33.35%]
               4,266,538 dTLB-misses               #    0.13% of all dTLB cache hits   ( +-  0.03% ) [33.33%]
      
             3.513339813 seconds time elapsed                                          ( +-  0.26% )
      
      vhzp:
       Performance counter stats for 'taskset -c 0 ./test_memcmp2' (5 runs):
      
            27313.891128 task-clock                #    0.998 CPUs utilized            ( +-  0.24% )
                      62 context-switches          #    0.002 K/sec                    ( +-  0.61% )
                   4,384 page-faults               #    0.160 K/sec                    ( +-  0.01% )
          64,747,374,606 cycles                    #    2.370 GHz                      ( +-  0.24% ) [33.33%]
          61,341,580,278 stalled-cycles-frontend   #   94.74% frontend cycles idle     ( +-  0.26% ) [33.33%]
          56,702,237,511 stalled-cycles-backend    #   87.57% backend  cycles idle     ( +-  0.07% ) [33.33%]
          10,033,724,846 instructions              #    0.15  insns per cycle
                                                   #    6.11  stalled cycles per insn  ( +-  0.09% ) [41.65%]
           2,190,424,932 branches                  #   80.195 M/sec                    ( +-  0.12% ) [41.66%]
               1,028,630 branch-misses             #    0.05% of all branches          ( +-  1.50% ) [41.66%]
           3,302,006,540 L1-dcache-loads           #  120.891 M/sec                    ( +-  0.11% ) [41.68%]
             271,374,358 L1-dcache-misses          #    8.22% of all L1-dcache hits    ( +-  0.04% ) [41.66%]
              20,385,476 LLC-load                  #    0.746 M/sec                    ( +-  1.64% ) [33.34%]
                  76,754 LLC-misses                #    0.38% of all LL-cache hits     ( +-  2.35% ) [33.34%]
           3,309,927,290 dTLB-loads                #  121.181 M/sec                    ( +-  0.03% ) [33.34%]
           2,098,967,427 dTLB-misses               #   63.41% of all dTLB cache hits   ( +-  0.03% ) [33.34%]
      
            27.364448741 seconds time elapsed                                          ( +-  0.24% )
      
      ===
      
      I personally prefer the implementation presented in this patchset.  It
      doesn't touch arch-specific code.
      
      This patch:
      
      Huge zero page (hzp) is a non-movable huge page (2M on x86-64) filled with
      zeros.
      
      For now let's allocate the page on hugepage_init().  We'll switch to lazy
      allocation later.
      
      We are not going to map the huge zero page until we can handle it properly
      on all code paths.
      
      The is_huge_zero_{pfn,pmd}() functions will be used by the following
      patches to check whether a pfn/pmd is the huge zero page.
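
As an illustration of what such checks amount to, here is a userspace toy model.  It is a sketch under stated assumptions, not the code from the patch: toy_pmd_t, toy_huge_zero_pfn and the toy_* helpers are hypothetical names, a pmd is reduced to the pfn it points at, and pfn 0 is taken to mean "huge zero page not allocated".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: hypothetical names, not the kernel's definitions. */
typedef struct {
	unsigned long pfn;	/* page frame the entry points at */
} toy_pmd_t;

/* pfn of the huge zero page; 0 is assumed to mean "not allocated". */
static unsigned long toy_huge_zero_pfn;

/* Stand-in for the one-time allocation done at init time. */
static void toy_init_huge_zero_page(unsigned long pfn)
{
	toy_huge_zero_pfn = pfn;
}

/* True only if the huge zero page exists and pfn refers to it. */
static bool toy_is_huge_zero_pfn(unsigned long pfn)
{
	return toy_huge_zero_pfn != 0 && pfn == toy_huge_zero_pfn;
}

/* The pmd variant only extracts the pfn and reuses the pfn check. */
static bool toy_is_huge_zero_pmd(toy_pmd_t pmd)
{
	return toy_is_huge_zero_pfn(pmd.pfn);
}
```

Keeping the pmd helper as a thin wrapper around the pfn helper means there is a single place that decides what counts as the huge zero page.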
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: "H. Peter Anvin" <hpa@linux.intel.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4a6c1297
    • J
      bootmem: remove alloc_arch_preferred_bootmem() · 3f7dfe24
      Joonsoo Kim committed
      The name of this function is not suitable, and removing the function and
      open-coding it into each call site makes the code more understandable.
      
      Additionally, we shouldn't do an allocation from bootmem when
      slab_is_available(), so directly return kmalloc()'s return value.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3f7dfe24
    • J
      bootmem: remove not implemented function call, bootmem_arch_preferred_node() · 2d7a6956
      Joonsoo Kim committed
      There is no implementation of bootmem_arch_preferred_node() and a call to
      this function will cause a compilation error.  So remove it.
      Signed-off-by: Joonsoo Kim <js1304@gmail.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2d7a6956
    • L
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal · 9977d9b3
      Linus Torvalds committed
      Pull big execve/kernel_thread/fork unification series from Al Viro:
       "All architectures are converted to new model.  Quite a bit of that
        stuff is actually shared with architecture trees; in such cases it's
        literally shared branch pulled by both, not a cherry-pick.
      
        A lot of ugliness and black magic is gone (-3KLoC total in this one):
      
         - kernel_thread()/kernel_execve()/sys_execve() redesign.
      
           We don't do syscalls from kernel anymore for either kernel_thread()
           or kernel_execve():
      
           kernel_thread() is essentially clone(2) with callback run before we
           return to userland, the callbacks either never return or do
           successful do_execve() before returning.
      
           kernel_execve() is a wrapper for do_execve() - it doesn't need to
           do transition to user mode anymore.
      
           As a result kernel_thread() and kernel_execve() are
           arch-independent now - they live in kernel/fork.c and fs/exec.c
           resp.  sys_execve() is also in fs/exec.c and it's completely
           architecture-independent.
      
         - daemonize() is gone, along with its parts in fs/*.c
      
         - struct pt_regs * is no longer passed to do_fork/copy_process/
           copy_thread/do_execve/search_binary_handler/->load_binary/do_coredump.
      
         - sys_fork()/sys_vfork()/sys_clone() unified; some architectures
           still need wrappers (ones with callee-saved registers not saved in
           pt_regs on syscall entry), but the main part of those suckers is in
           kernel/fork.c now."
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (113 commits)
        do_coredump(): get rid of pt_regs argument
        print_fatal_signal(): get rid of pt_regs argument
        ptrace_signal(): get rid of unused arguments
        get rid of ptrace_signal_deliver() arguments
        new helper: signal_pt_regs()
        unify default ptrace_signal_deliver
        flagday: kill pt_regs argument of do_fork()
        death to idle_regs()
        don't pass regs to copy_process()
        flagday: don't pass regs to copy_thread()
        bfin: switch to generic vfork, get rid of pointless wrappers
        xtensa: switch to generic clone()
        openrisc: switch to use of generic fork and clone
        unicore32: switch to generic clone(2)
        score: switch to generic fork/vfork/clone
        c6x: sanitize copy_thread(), get rid of clone(2) wrapper, switch to generic clone()
        take sys_fork/sys_vfork/sys_clone prototypes to linux/syscalls.h
        mn10300: switch to generic fork/vfork/clone
        h8300: switch to generic fork/vfork/clone
        tile: switch to generic clone()
        ...
      
      Conflicts:
      	arch/microblaze/include/asm/Kbuild
      9977d9b3
    • L
      Merge tag 'boards' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · cf4af012
      Linus Torvalds committed
      Pull ARM SoC board updates from Olof Johansson:
       "This branch contains a set of various board updates for ARM platforms.
      
        A few shmobile platforms that are stale have been removed, some
        defconfig updates for various boards selecting new features such as
        pinctrl subsystem support, and various updates enabling peripherals,
        etc."
      
      Fix up conflicts mostly as per Olof.
      
      * tag 'boards' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (58 commits)
        ARM: S3C64XX: Add dummy supplies for Glenfarclas LDOs
        ARM: S3C64XX: Add registration of WM2200 Bells device on Cragganmore
        ARM: kirkwood: Add Plat'Home OpenBlocks A6 support
        ARM: Dove: update defconfig
        ARM: Kirkwood: update defconfig for new boards
        arm: orion5x: add DT related options in defconfig
        arm: orion5x: convert 'LaCie Ethernet Disk mini v2' to Device Tree
        arm: orion5x: basic Device Tree support
        arm: orion5x: mechanical defconfig update
        ARM: kirkwood: Add support for the MPL CEC4
        arm: kirkwood: add support for ZyXEL NSA310
        ARM: Kirkwood: new board USI Topkick
        ARM: kirkwood: use gpio-fan DT binding on lsxl
        ARM: Kirkwood: add Netspace boards to defconfig
        ARM: kirkwood: DT board setup for Network Space Mini v2
        ARM: kirkwood: DT board setup for Network Space Lite v2
        ARM: kirkwood: DT board setup for Network Space v2 and parents
        leds: leds-ns2: add device tree binding
        ARM: Kirkwood: Enable the second I2C bus
        ARM: mmp: select pinctrl driver
        ...
      cf4af012
    • L
      Merge tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · d027db13
      Linus Torvalds committed
      Pull ARM SoC updates from Olof Johansson:
       "This contains the bulk of new SoC development for this merge window.
      
        Two new platforms have been added, the sunxi platforms (Allwinner A1x
        SoCs) by Maxime Ripard, and a generic Broadcom platform for a new
        series of ARMv7 platforms from them, where the hope is that we can
        keep the platform code generic enough to have them all share one mach
        directory.  The new Broadcom platform is contributed by Christian
        Daudt.
      
        Highbank has grown support for Calxeda's next generation of hardware,
        ECX-2000.
      
        clps711x has seen a lot of cleanup from Alexander Shiyan, and he's
        also taken on maintainership of the platform.
      
        Beyond this there has been a bunch of work from a number of people on
        converting more platforms to IRQ domains, pinctrl conversion, cleanup
        and general feature enablement across most of the active platforms."
      
      Fix up trivial conflicts as per Olof.
      
      * tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (174 commits)
        mfd: vexpress-sysreg: Remove LEDs code
        irqchip: irq-sunxi: Add terminating entry for sunxi_irq_dt_ids
        clocksource: sunxi_timer: Add terminating entry for sunxi_timer_dt_ids
        irq: versatile: delete dangling variable
        ARM: sunxi: add missing include for mdelay()
        ARM: EXYNOS: Avoid early use of of_machine_is_compatible()
        ARM: dts: add node for PL330 MDMA1 controller for exynos4
        ARM: EXYNOS: Add support for secondary CPU bring-up on Exynos4412
        ARM: EXYNOS: add UART3 to DEBUG_LL ports
        ARM: S3C24XX: Add clkdev entry for camif-upll clock
        ARM: SAMSUNG: Add s3c24xx/s3c64xx CAMIF GPIO setup helpers
        ARM: sunxi: Add missing sun4i.dtsi file
        pinctrl: samsung: Do not initialise statics to 0
        ARM i.MX6: remove gate_mask from pllv3
        ARM i.MX6: Fix ethernet PLL clocks
        ARM i.MX6: rename PLLs according to datasheet
        ARM i.MX6: Add pwm support
        ARM i.MX51: Add pwm support
        ARM i.MX53: Add pwm support
        ARM: mx5: Replace clk_register_clkdev with clock DT lookup
        ...
      d027db13
    • L
      Merge tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · d01e4afd
      Linus Torvalds committed
      Pull ARM SoC cleanups on various subarchitectures from Olof Johansson:
       "Cleanup patches for various ARM platforms and some of their associated
        drivers.  There's also a branch in here that enables Freescale i.MX to
        be part of the multiplatform support -- the first "big" SoC that is
        moved over (more multiplatform work comes in a separate branch later
        during the merge window)."
      
      Conflicts fixed as per Olof, including a silent semantic one in
      arch/arm/mach-omap2/board-generic.c (omap_prcm_restart() was renamed to
      omap3xxx_restart(), and a new user of the old name was added).
      
      * tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (189 commits)
        ARM: omap: fix typo on timer cleanup
        ARM: EXYNOS: Remove unused regs-mem.h file
        ARM: EXYNOS: Remove unused non-dt support for dwmci controller
        ARM: Kirkwood: Use hw_pci.ops instead of hw_pci.scan
        ARM: OMAP3: cm-t3517: use GPTIMER for system clock
        ARM: OMAP2+: timer: remove CONFIG_OMAP_32K_TIMER
        ARM: SAMSUNG: use devm_ functions for ADC driver
        ARM: EXYNOS: no duplicate mask/unmask in eint0_15
        ARM: S3C24XX: SPI clock channel setup is fixed for S3C2443
        ARM: EXYNOS: Remove i2c0 resource information and setting of device names
        ARM: Kirkwood: checkpatch cleanups
        ARM: Kirkwood: Fix sparse warnings.
        ARM: Kirkwood: Remove unused includes
        ARM: kirkwood: cleanup lsxl board includes
        ARM: integrator: use BUG_ON where possible
        ARM: integrator: push down SC dependencies
        ARM: integrator: delete static UART1 mapping
        ARM: integrator: delete SC mapping on the CP
        ARM: integrator: remove static CP syscon mapping
        ARM: integrator: remove static AP syscon mapping
        ...
      d01e4afd
    • L
      Merge tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 8287361a
      Linus Torvalds committed
      Pull ARM SoC Header cleanups from Olof Johansson:
       "This is a collection of header file cleanups, mostly for OMAP and
        AT91, that keeps moving the platforms in the direction of
        multiplatform by removing the need for mach-dependent header files
        used in drivers and other places."
      
      Fix up mostly trivial conflicts as per Olof.
      
      * tag 'headers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (106 commits)
        ARM: OMAP2+: Move iommu/iovmm headers to platform_data
        ARM: OMAP2+: Make some definitions local
        ARM: OMAP2+: Move iommu2 to drivers/iommu/omap-iommu2.c
        ARM: OMAP2+: Move plat/iovmm.h to include/linux/omap-iommu.h
        ARM: OMAP2+: Move iopgtable header to drivers/iommu/
        ARM: OMAP: Merge iommu2.h into iommu.h
        atmel: move ATMEL_MAX_UART to platform_data/atmel.h
        ARM: OMAP: Remove omap_init_consistent_dma_size()
        arm: at91: move at91rm9200 rtc header in drivers/rtc
        arm: at91: move reset controller header to arm/arm/mach-at91
        arm: at91: move pit define to the driver
        arm: at91: move at91_shdwc.h to arch/arm/mach-at91
        arm: at91: move board header to arch/arm/mach-at91
        arn: at91: move at91_tc.h to arch/arm/mach-at91
        arm: at91 move at91_aic.h to arch/arm/mach-at91
        arm: at91 move board.h to arch/arm/mach-at91
        arm: at91: move platfarm_data to include/linux/platform_data/atmel.h
        arm: at91: drop machine defconfig
        ARM: OMAP: Remove NEED_MACH_GPIO_H
        ARM: OMAP: Remove unnecessary mach and plat includes
        ...
      8287361a