提交 · 0dd1c7bbce8d1d142bb25aefaa50262dfd77cb78 · openeuler / raspberrypi-kernel

22 1月, 2014 20 次提交

mm/rmap: extend rmap_walk_xxx() to cope with different cases · 0dd1c7bb

由 Joonsoo Kim 提交于 1月 21, 2014

There are a lot of common parts in traversing functions, but there are
also a little of uncommon parts in it.  By assigning proper function
pointer on each rmap_walker_control, we can handle these difference
correctly.

Following are differences we should handle.

1. difference of lock function in anon mapping case
2. nonlinear handling in file mapping case
3. prechecked condition:
	checking memcg in page_referenced(),
	checking VM_SHARE in page_mkclean()
	checking temporary vma in try_to_unmap()
4. exit condition:
	checking page_mapped() in try_to_unmap()

So, in this patch, I introduce 4 function pointers to handle above
differences.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0dd1c7bb

mm/rmap: make rmap_walk to get the rmap_walk_control argument · 051ac83a

由 Joonsoo Kim 提交于 1月 21, 2014

In each rmap traverse case, there is some difference so that we need
function pointers and arguments to them in order to handle these

For this purpose, struct rmap_walk_control is introduced in this patch,
and will be extended in following patch.  Introducing and extending are
separate, because it clarify changes.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

051ac83a

memblock, mem_hotplug: make memblock skip hotpluggable regions if needed · 55ac590c

由 Tang Chen 提交于 1月 21, 2014

Linux kernel cannot migrate pages used by the kernel.  As a result,
hotpluggable memory used by the kernel won't be able to be hot-removed.
To solve this problem, the basic idea is to prevent memblock from
allocating hotpluggable memory for the kernel at early time, and arrange
all hotpluggable memory in ACPI SRAT(System Resource Affinity Table) as
ZONE_MOVABLE when initializing zones.

In the previous patches, we have marked hotpluggable memory regions with
MEMBLOCK_HOTPLUG flag in memblock.memory.

In this patch, we make memblock skip these hotpluggable memory regions
in the default top-down allocation function if movable_node boot option
is specified.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

55ac590c

memblock: make memblock_set_node() support different memblock_type · e7e8de59

由 Tang Chen 提交于 1月 21, 2014

[sfr@canb.auug.org.au: fix powerpc build]
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e7e8de59

memblock, mem_hotplug: introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable regions · 66b16edf

由 Tang Chen 提交于 1月 21, 2014

In find_hotpluggable_memory, once we find out a memory region which is
hotpluggable, we want to mark them in memblock.memory.  So that we could
control memblock allocator not to allocte hotpluggable memory for the
kernel later.

To achieve this goal, we introduce MEMBLOCK_HOTPLUG flag to indicate the
hotpluggable memory regions in memblock and a function
memblock_mark_hotplug() to mark hotpluggable memory if we find one.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Liu Jiang <jiang.liu@huawei.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

66b16edf

memblock, numa: introduce flags field into memblock · 66a20757

由 Tang Chen 提交于 1月 21, 2014

There is no flag in memblock to describe what type the memory is.
Sometimes, we may use memblock to reserve some memory for special usage.
And we want to know what kind of memory it is.  So we need a way to

In hotplug environment, we want to reserve hotpluggable memory so the
kernel won't be able to use it.  And when the system is up, we have to
free these hotpluggable memory to buddy.  So we need to mark these
memory first.

In order to do so, we need to mark out these special memory in memblock.
In this patch, we introduce a new "flags" member into memblock_region:

   struct memblock_region {
           phys_addr_t base;
           phys_addr_t size;
           unsigned long flags;		/* This is new. */
   #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
           int nid;
   #endif
   };

This patch does the following things:
1) Add "flags" member to memblock_region.
2) Modify the following APIs' prototype:
	memblock_add_region()
	memblock_insert_region()
3) Add memblock_reserve_region() to support reserve memory with flags, and keep
   memblock_reserve()'s prototype unmodified.
4) Modify other APIs to support flags, but keep their prototype unmodified.

The idea is from Wen Congyang <wency@cn.fujitsu.com> and Liu Jiang <jiang.liu@huawei.com>.
Suggested-by: NWen Congyang <wency@cn.fujitsu.com>
Suggested-by: NLiu Jiang <jiang.liu@huawei.com>
Signed-off-by: NTang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Rafael J . Wysocki" <rjw@sisk.pl>
Cc: Chen Tang <imtangchen@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

66a20757

mm: add overcommit_kbytes sysctl variable · 49f0ce5f

由 Jerome Marchand 提交于 1月 21, 2014

Some applications that run on HPC clusters are designed around the
availability of RAM and the overcommit ratio is fine tuned to get the
maximum usage of memory without swapping.  With growing memory, the
1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
for these workload (on a 2TB machine it represents no less than 20GB).

This patch adds the new overcommit_kbytes sysctl variable that allow a
much finer grain.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: NJerome Marchand <jmarchan@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

49f0ce5f

mm, show_mem: remove SHOW_MEM_FILTER_PAGE_COUNT · aec6a888

由 Mel Gorman 提交于 1月 21, 2014

Commit 4b59e6c4 ("mm, show_mem: suppress page counts in
non-blockable contexts") introduced SHOW_MEM_FILTER_PAGE_COUNT to
suppress PFN walks on large memory machines.  Commit c78e9363 ("mm:
do not walk all of system memory during show_mem") avoided a PFN walk in
the generic show_mem helper which removes the requirement for
SHOW_MEM_FILTER_PAGE_COUNT in that case.

This patch removes PFN walkers from the arch-specific implementations
that report on a per-node or per-zone granularity.  ARM and unicore32
still do a PFN walk as they report memory usage on each bank which is a
much finer granularity where the debugging information may still be of
use.  As the remaining arches doing PFN walks have relatively small
amounts of memory, this patch simply removes SHOW_MEM_FILTER_PAGE_COUNT.

[akpm@linux-foundation.org: fix parisc]
Signed-off-by: NMel Gorman <mgorman@suse.de>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Bottomley <jejb@parisc-linux.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

aec6a888

mm, mempolicy: remove unneeded functions for UMA configs · d80be7c7

由 David Rientjes 提交于 1月 21, 2014

Mempolicies only exist for CONFIG_NUMA configurations.  Therefore, a
certain class of functions are unneeded in configurations where
CONFIG_NUMA is disabled such as functions that duplicate existing
mempolicies, lookup existing policies, set certain mempolicy traits, or
test mempolicies for certain attributes.

Remove the unneeded functions so that any future callers get a compile-
time error and protect their code with CONFIG_NUMA as required.
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d80be7c7

mm: create a separate slab for page->ptl allocation · b35f1819

由 Kirill A. Shutemov 提交于 1月 21, 2014

If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64
is 72 bytes.  For page->ptl they will be allocated from kmalloc-96 slab,
so we loose 24 on each.  An average system can easily allocate few tens
thousands of page->ptl and overhead is significant.

Let's create a separate slab for page->ptl allocation to solve this.

To make sure that it really works this time, some numbers from my test
machine (just booted, no load):

Before:
  # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
  kmalloc-96         31987  32190    128   30    1 : tunables  120   60    8 : slabdata   1073   1073     92
After:
  # grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
  page->ptl          27516  28143     72   53    1 : tunables  120   60    8 : slabdata    531    531      9
  kmalloc-96          3853   5280    128   30    1 : tunables  120   60    8 : slabdata    176    176      0

Note that the patch is useful not only for debug case, but also for
PREEMPT_RT, where spinlock_t is always bloated.
Signed-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b35f1819

mm: get rid of unnecessary pageblock scanning in setup_zone_migrate_reserve · 943dca1a

由 Yasuaki Ishimatsu 提交于 1月 21, 2014

Yasuaki Ishimatsu reported memory hot-add spent more than 5 _hours_ on
9TB memory machine since onlining memory sections is too slow.  And we
found out setup_zone_migrate_reserve spent >90% of the time.

The problem is, setup_zone_migrate_reserve scans all pageblocks
unconditionally, but it is only necessary if the number of reserved
block was reduced (i.e.  memory hot remove).

Moreover, maximum MIGRATE_RESERVE per zone is currently 2.  It means
that the number of reserved pageblocks is almost always unchanged.

This patch adds zone->nr_migrate_reserve_block to maintain the number of
MIGRATE_RESERVE pageblocks and it reduces the overhead of
setup_zone_migrate_reserve dramatically.  The following table shows time
of onlining a memory section.

  Amount of memory     | 128GB | 192GB | 256GB|
  ---------------------------------------------
  linux-3.12           |  23.9 |  31.4 | 44.5 |
  This patch           |   8.3 |   8.3 |  8.6 |
  Mel's proposal patch |  10.9 |  19.2 | 31.3 |
  ---------------------------------------------
                                   (millisecond)

  128GB : 4 nodes and each node has 32GB of memory
  192GB : 6 nodes and each node has 32GB of memory
  256GB : 8 nodes and each node has 32GB of memory

  (*1) Mel proposed his idea by the following threads.
       https://lkml.org/lkml/2013/10/30/272

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Tested-by: NYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

943dca1a

mm: thp: turn compound_head() into BUG_ON(!PageTail) in get_huge_page_tail() · 5eaf1a9e

由 Oleg Nesterov 提交于 1月 21, 2014

get_huge_page_tail()->compound_head() looks confusing.  Every caller
must check PageTail(page), otherwise atomic_inc(&page->_mapcount) is
simply wrong if this page is compound-trans-head.
Signed-off-by: NOleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: NAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5eaf1a9e

mm: tail page refcounting optimization for slab and hugetlbfs · 44518d2b

由 Andrea Arcangeli 提交于 1月 21, 2014

This skips the _mapcount mangling for slab and hugetlbfs pages.

The main trouble in doing this is to guarantee that PageSlab and
PageHeadHuge remains constant for all get_page/put_page run on the tail
of slab or hugetlbfs compound pages.  Otherwise if they're set during
get_page but not set during put_page, the _mapcount of the tail page
would underflow.

PageHeadHuge will remain true until the compound page is released and
enters the buddy allocator so it won't risk to change even if the tail
page is the last reference left on the page.

PG_slab instead is cleared before the slab frees the head page with
put_page, so if the tail pin is released after the slab freed the page,
we would have a problem.  But in the slab case the tail pin cannot be
the last reference left on the page.  This is because the slab code is
free to reuse the compound page after a kfree/kmem_cache_free without
having to check if there's any tail pin left.  In turn all tail pins
must be always released while the head is still pinned by the slab code
and so we know PG_slab will be still set too.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Reviewed-by: NKhalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

44518d2b

mm: thp: optimize compound_trans_huge · ca641514

由 Andrea Arcangeli 提交于 1月 21, 2014

Currently we don't clobber page_tail->first_page during split_huge_page,
so compound_trans_head can be set to compound_head without adverse
effects, and this mostly optimizes away a smp_rmb.

It looks worthwhile to keep around the implementation that doesn't relay
on page_tail->first_page not to be clobbered, because it would be
necessary if we'll decide to enforce page->private to zero at all times
whenever PG_private is not set, also for anonymous pages.  For anonymous
pages enforcing such an invariant doesn't matter as anonymous pages
don't use page->private so we can get away with this microoptimization.
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ca641514

mm: hugetlbfs: Add some VM_BUG_ON()s to catch non-hugetlbfs pages · 0e147aed

由 Dave Hansen 提交于 1月 21, 2014

Dave Jiang reported that he was seeing oopses when running NUMA systems
and default_hugepagesz=1G.  I traced the issue down to
migrate_page_copy() trying to use the same code for hugetlb pages and
transparent hugepages.  It should not have been trying to pass thp pages
in there.

So, add some VM_BUG_ON()s for the next hapless VM developer that tries
the same thing.
Signed-off-by: NDave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: NDave Jiang <dave.jiang@intel.com>
Acked-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0e147aed

mm: Make {,set}page_address() static inline if WANT_PAGE_VIRTUAL · f92f455f

由 Geert Uytterhoeven 提交于 1月 21, 2014

{,set}page_address() are macros if WANT_PAGE_VIRTUAL.  If
!WANT_PAGE_VIRTUAL, they're plain C functions.

If someone calls them with a void *, this pointer is auto-converted to
struct page * if !WANT_PAGE_VIRTUAL, but causes a build failure on
architectures using WANT_PAGE_VIRTUAL (arc, m68k and sparc64):

  drivers/md/bcache/bset.c: In function `__btree_sort':
  drivers/md/bcache/bset.c:1190: warning: dereferencing `void *' pointer
  drivers/md/bcache/bset.c:1190: error: request for member `virtual' in something not a structure or union

Convert them to static inline functions to fix this.  There are already
plenty of users of struct page members inside <linux/mm.h>, so there's
no reason to keep them as macros.
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Acked-by: NMichael S. Tsirkin <mst@redhat.com>
Tested-by: NGuenter Roeck <linux@roeck-us.net>
Tested-by: NDavid Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f92f455f

posix_acl: uninlining · 0afaa120

由 Andrew Morton 提交于 1月 21, 2014

Uninline vast tracts of nested inline functions in
include/linux/posix_acl.h.

This reduces the text+data+bss size of x86_64 allyesconfig vmlinux by
8026 bytes.

The patch also regularises the positioning of the EXPORT_SYMBOLs in
posix_acl.c.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: NBenny Halevy <bhalevy@primarydata.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0afaa120

fsnotify: remove .should_send_event callback · 83c4c4b0

由 Jan Kara 提交于 1月 21, 2014

After removing event structure creation from the generic layer there is
no reason for separate .should_send_event and .handle_event callbacks.
So just remove the first one.
Signed-off-by: NJan Kara <jack@suse.cz>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

83c4c4b0

fsnotify: do not share events between notification groups · 7053aee2

由 Jan Kara 提交于 1月 21, 2014

Currently fsnotify framework creates one event structure for each
notification event and links this event into all interested notification
groups. This is done so that we save memory when several notification
groups are interested in the event. However the need for event
structure shared between inotify & fanotify bloats the event structure
so the result is often higher memory consumption.

Another problem is that fsnotify framework keeps path references with
outstanding events so that fanotify can return open file descriptors
with its events. This has the undesirable effect that filesystem cannot
be unmounted while there are outstanding events - a regression for
inotify compared to a situation before it was converted to fsnotify
framework. For fanotify this problem is hard to avoid and users of
fanotify should kind of expect this behavior when they ask for file
descriptors from notified files.

This patch changes fsnotify and its users to create separate event
structure for each group. This allows for much simpler code (~400 lines
removed by this patch) and also smaller event structures. For example
on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
additional space for file name, additional 24 bytes for second and each
subsequent group linking the event, and additional 32 bytes for each
inotify group for private data. After the conversion inotify event
consumes 48 bytes plus space for file name which is considerably less
memory unless file names are long and there are several groups
interested in the events (both of which are uncommon). Fanotify event
fits in 56 bytes after the conversion (fanotify doesn't care about file
names so its events don't have to have it allocated). A win unless
there are four or more fanotify groups interested in the event.

The conversion also solves the problem with unmount when only inotify is
used as we don't have to grab path references for inotify events.

[hughd@google.com: fanotify: fix corruption preventing startup]
Signed-off-by: NJan Kara <jack@suse.cz>
Reviewed-by: NChristoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NHugh Dickins <hughd@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7053aee2

dma-debug: introduce debug_dma_assert_idle() · 0abdd7a8

由 Dan Williams 提交于 1月 21, 2014

Record actively mapped pages and provide an api for asserting a given
page is dma inactive before execution proceeds.  Placing
debug_dma_assert_idle() in cow_user_page() flagged the violation of the
dma-api in the NET_DMA implementation (see commit 77873803 "net_dma:
mark broken").

The implementation includes the capability to count, in a limited way,
repeat mappings of the same page that occur without an intervening
unmap.  This 'overlap' counter is limited to the few bits of tag space
in a radix tree.  This mechanism is added to mitigate false negative
cases where, for example, a page is dma mapped twice and
debug_dma_assert_idle() is called after the page is un-mapped once.
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0abdd7a8

21 1月, 2014 10 次提交

mfd: Cleanup mfd-mcp-sa11x0.h header · 7ff6d7a0

由 Sachin Kamat 提交于 12月 30, 2013

Commit a1fd844c ("ARM: sa1100: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up. While at it also change the header
file protection macros appropriately.
Signed-off-by: NSachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

7ff6d7a0

mfd: Represent correct filenames in file headers · f876a975

由 Laszlo Papp 提交于 12月 19, 2013

The original author(s) probably copy/pasted these headers from the
existing public header files.
Signed-off-by: NLaszlo Papp <lpapp@kde.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

f876a975

mfd: mc13xxx: Remove duplicate mc13xxx_get_flags() declaration · 03b1e302

由 Alexander Shiyan 提交于 12月 14, 2013

mc13xxx_get_flags() declaration given twice.
This patch removes this duplicate.
Signed-off-by: NAlexander Shiyan <shc_work@mail.ru>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

03b1e302

mfd: ssbi: Constify buffer in ssbi_write · 5eec14cc

由 Stephen Boyd 提交于 12月 10, 2013

In preparation for passing a const pointer directly to
ssbi_write() from the regmap APIs.
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

5eec14cc

mfd: ssbi: Remove platform data structs and hide ssbi type enum · bae911a0

由 Stephen Boyd 提交于 12月 10, 2013

The ssbi driver assumes that the device is DT based. Remove the
platform data structs that will never be used and hide the enum
in the only C file that uses it.
Signed-off-by: NStephen Boyd <sboyd@codeaurora.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

bae911a0

mfd: tps6586x: Add version detection · e0a3da80

由 Stefan Agner 提交于 12月 06, 2013

Use the VERSIONCRC to determine the exact device version. According to
the datasheet this register can be used as device identifier. The
identification is needed since some tps6586x regulators use a different
voltage table.
Signed-off-by: NStefan Agner <stefan@agner.ch>
Reviewed-by: NThierry Reding <treding@nvidia.com>
Acked-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

e0a3da80

mfd: Add LP3943 MFD driver · 470eca47

由 Milo Kim 提交于 12月 06, 2013

LP3943 has 16 output pins which can be used as GPIO expander and PWM generator.

* Regmap I2C interface for R/W LP3943 registers

* Atomic operations for output pin assignment
  The driver should check whether requested pin is available or not.
  If the pin is already used, pin request returns as a failure.
  A driver data, 'pin_used' is checked when gpio_request() and
  pwm_request() are called. If the pin is available, then pin_used is set.
  And it is cleared when gpio_free() and pwm_free().

* Device tree support
  Compatible strings for GPIO and PWM driver.
  LP3943 platform data is PWM related, so parsing the device tree is
  implemented in the PWM driver.
Signed-off-by: NMilo Kim <milo.kim@ti.com>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

470eca47

mfd/pinctrl: Delete platform data header · 97b583f3

由 Linus Walleij 提交于 12月 03, 2013

This deletes the special AB8500 GPIO platform data passing
header and merges the few remaining contents down into the
abx500 pinctrl driver which handles the abx500 GPIO device.
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

97b583f3

mfd: ab8500: Delete all GPIO platform data instances · 45e84223

由 Linus Walleij 提交于 12月 03, 2013

This deletes all instances where the AB8500 GPIO platform
data is passed around. It is completely unused in the kernel
now, so it does not hurt anyone.
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

45e84223

mfd: max14577: Add max14577 MFD driver core · 3008ddbe

由 Chanwoo Choi 提交于 11月 22, 2013

This patch adds max14577 core/irq driver to support MUIC(Micro USB IC)
device and charger device and support irq domain method to control
internal interrupt of max14577 device. Also, this patch supports DT
binding with max14577_i2c_parse_dt().

The MAXIM 14577 chip contains Micro-USB Interface Circuit and Li+ Battery
Charger. It contains accessory and USB charger detection logic. It supports
USB 2.0 Hi-Speed, UART and stereo audio signals over Micro-USB connector.

The battery charger is compliant with the USB Battery Charging Specification
Revision 1.1. It has also SFOUT LDO output for powering USB devices.
Reviewed-by: NMark Brown <broonie@linaro.org>
Signed-off-by: NChanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: NKrzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: NKyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: NLee Jones <lee.jones@linaro.org>

3008ddbe

18 1月, 2014 1 次提交

kernfs: add struct dentry declaration in kernfs.h · 917f56ca

由 Tejun Heo 提交于 1月 17, 2014

Hello, Greg.

Two misc fixes for kernfs.

Thanks.
------- 8< -------
struct dentry is used in kernfs.h but its declaration was missing,
leading to compilation errors unless its declaration gets pulled in in
some other way.  Add the declaration.
Signed-off-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

917f56ca

16 1月, 2014 2 次提交

pinctrl: Add void * to pinctrl_pin_desc · a30d5421

由 Sherman Yin 提交于 12月 20, 2013

drv_data is added to the pinctrl_pin_desc for drivers to define additional
driver-specific per-pin data.
Signed-off-by: NSherman Yin <syin@broadcom.com>
Reviewed-by: NChristian Daudt <bcm@fixthebug.org>
Reviewed-by: NMatt Porter <matt.porter@linaro.org>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

a30d5421

i2c: pnx: Use devm_*() functions · d1ccc125

由 Jingoo Han 提交于 1月 14, 2014

Use devm_*() functions to make cleanup paths simpler,
and remove redundant return value check of platform_get_resource()
because the value is checked by devm_ioremap_resource().
Signed-off-by: NJingoo Han <jg1.han@samsung.com>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>

d1ccc125

15 1月, 2014 6 次提交

crash_dump: fix compilation error (on MIPS at least) · 5a610fcc

由 Qais Yousef 提交于 1月 14, 2014

  In file included from kernel/crash_dump.c:2:0:
  include/linux/crash_dump.h:22:27: error: unknown type name `pgprot_t'

when CONFIG_CRASH_DUMP=y

The error was traced back to commit 9cb21813 ("vmcore: introduce
remap_oldmem_pfn_range()")

include <asm/pgtable.h> to get the missing definition
Signed-off-by: NQais Yousef <qais.yousef@imgtec.com>
Reviewed-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: NVivek Goyal <vgoyal@redhat.com>
Cc: <stable@vger.kernel.org>	[3.12+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5a610fcc

hwmon: (sht15) add include guard · ffcc3b2a

由 Vivien Didelot 提交于 1月 10, 2014

Add include guard to include/linux/platform_data/sht15.h to prevent
multiple inclusion.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>

ffcc3b2a

hwmon: (max197) add include guard · 4cb6409b

由 Vivien Didelot 提交于 1月 10, 2014

Add include guard to include/linux/platform_data/max197.h to prevent
multiple inclusion.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>

4cb6409b

hwmon: (s3c) Trivial cleanup in hwmon-s3c.h · 2ac1dfc5

由 Sachin Kamat 提交于 12月 27, 2013

Commit 436d42c6 ("ARM: samsung: move platform_data definitions")
moved the file to the current location but forgot to remove the pointer
to its previous location. Clean it up. While at it also change the header
file protection macros appropriately.
Signed-off-by: NSachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>

2ac1dfc5

dma: Indicate residue granularity in dma_slave_caps · 50720563

由 Lars-Peter Clausen 提交于 1月 11, 2014

This patch adds a new field to the dma_slave_caps struct which indicates the
granularity with which the driver is able to update the residue field of the
dma_tx_state struct. Making this information available to dmaengine users allows
them to make better decisions on how to operate. E.g. for audio certain features
like wakeup less operation or timer based scheduling only make sense and work
correctly if the reported residue is fine-grained enough.

Right now four different levels of granularity are supported:
	* DESCRIPTOR: The DMA channel is only able to tell whether a descriptor has
	  been completed or not, which means residue reporting is not supported by
	  this channel. The residue field of the dma_tx_state field will always be
	  0.
	* SEGMENT: The DMA channel updates the residue field after each successfully
	  completed segment of the transfer (For cyclic transfers this is after each
	  period). This is typically implemented by having the hardware generate an
	  interrupt after each transferred segment and then the drivers updates the
	  outstanding residue by the size of the segment. Another possibility is if
	  the hardware supports SG and the segment descriptor has a field which gets
	  set after the segment has been completed. The driver then counts the
	  number of segments without the flag set to compute the residue.
	* BURST: The DMA channel updates the residue field after each transferred
	  burst. This is typically only supported if the hardware has a progress
	  register of some sort (E.g. a register with the current read/write address
	  or a register with the amount of bursts/beats/bytes that have been
	  transferred or still need to be transferred).
Signed-off-by: NLars-Peter Clausen <lars@metafoo.de>
Acked-by: NVinod Koul <vinod.koul@intel.com>
Signed-off-by: NMark Brown <broonie@linaro.org>

50720563

i2c: Re-instate body of i2c_parent_is_i2c_adapter() · 2fac2b89

由 Stephen Warren 提交于 1月 13, 2014

The body of i2c_parent_is_i2c_adapter() is currently guarded by
I2C_MUX. It should be CONFIG_I2C_MUX instead.

Among potentially other problems, this resulted in i2c_lock_adapter()
only locking I2C mux child adapters, and not the parent adapter. In
turn, this could allow inter-mingling of mux child selection and I2C
transactions, which could result in I2C transactions being directed to
the wrong I2C bus, and possibly even switching between busses in the
middle of a transaction.

One concrete issue caused by this bug was corrupted HDMI EDID reads
during boot on the NVIDIA Tegra Seaboard system, although this only
became apparent in recent linux-next, when the boot timing was changed
just enough to trigger the race condition.

Fixes: 3923172b ("i2c: reduce parent checking to a NOOP in non-I2C_MUX case")
Cc: Phil Carmody <phil.carmody@partner.samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: NStephen Warren <swarren@nvidia.com>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>

2fac2b89

14 1月, 2014 1 次提交

ARM: S3C[24|64]xx: move includes back under <mach/> scope · b0161caa

由 Linus Walleij 提交于 1月 14, 2014

When refactoring and breaking out the includes for the
machine-specific GPIO configuration, two files were created
in <linux/platform_data/gpio-samsung-s3c[24|64]xx.h>, but as
that namespace shall be used for defining data exchanged
between machines and drivers, using it for these broad macros
and config settings is wrong.

Move the headers back into the machine-local
<mach/gpio-samsung.h> file and think about the next step.
Reported-by: NArnd Bergmann <arnd@arndb.de>
Cc: Tomasz Figa <tomasz.figa@gmail.com>
Cc: Sylwester Nawrocki <sylvester.nawrocki@gmail.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: linux-samsung-soc@vger.kernel.org
Acked-by: NMark Brown <broonie@linaro.org>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NHeiko Stuebner <heiko@sntech.de>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

b0161caa