提交 · 6911a5281430cf6897376487698504620f454791 · openanolis / cloud-kernel

30 7月, 2016 1 次提交

sparc64: Trim page tables for 8M hugepages · 7bc3777c

由 Nitin Gupta 提交于 7月 29, 2016

For PMD aligned (8M) hugepages, we currently allocate
all four page table levels which is wasteful. We now
allocate till PMD level only which saves memory usage
from page tables.

Also, when freeing page table for 8M hugepage backed region,
make sure we don't try to access non-existent PTE level.

Orabug: 22630259
Signed-off-by: NNitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7bc3777c

29 7月, 2016 10 次提交

x86/power/64: Fix hibernation return address corruption · 4ce827b4

由 Josh Poimboeuf 提交于 7月 28, 2016

In kernel bug 150021, a kernel panic was reported when restoring a
hibernate image.  Only a picture of the oops was reported, so I can't
paste the whole thing here.  But here are the most interesting parts:

  kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
  BUG: unable to handle kernel paging request at ffff8804615cfd78
  ...
  RIP: ffff8804615cfd78
  RSP: ffff8804615f0000
  RBP: ffff8804615cfdc0
  ...
  Call Trace:
   do_signal+0x23
   exit_to_usermode_loop+0x64
   ...

The RIP is on the same page as RBP, so it apparently started executing
on the stack.

The bug was bisected to commit ef0f3ed5 (x86/asm/power: Create
stack frames in hibernate_asm_64.S), which in retrospect seems quite
dangerous, since that code saves and restores the stack pointer from a
global variable ('saved_context').

There are a lot of moving parts in the hibernate save and restore paths,
so I don't know exactly what caused the panic.  Presumably, a FRAME_END
was executed without the corresponding FRAME_BEGIN, or vice versa.  That
would corrupt the return address on the stack and would be consistent
with the details of the above panic.

[ rjw: One major problem is that by the time the FRAME_BEGIN in
  restore_registers() is executed, the stack pointer value may not
  be valid any more.  Namely, the stack area pointed to by it
  previously may have been overwritten by some image memory contents
  and that page frame may now be used for whatever different purpose
  it had been allocated for before hibernation.  In that case, the
  FRAME_BEGIN will corrupt that memory. ]

Instead of doing the frame pointer save/restore around the bounds of the
affected functions, just do it around the call to swsusp_save().

That has the same effect of ensuring that if swsusp_save() sleeps, the
frame pointers will be correct.  It's also a much more obviously safe
way to do it than the original patch.  And objtool still doesn't report
any warnings.

Fixes: ef0f3ed5 (x86/asm/power: Create stack frames in hibernate_asm_64.S)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Reported-by: NAndre Reinke <andre.reinke@mailbox.org>
Tested-by: NAndre Reinke <andre.reinke@mailbox.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Acked-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>

4ce827b4

avr32: off by one in at32_init_pio() · 55f1cf83

由 Dan Carpenter 提交于 7月 13, 2016

The pio_dev[] array has MAX_NR_PIO_DEVICES elements so the > should be
>=.

Fixes: 5f97f7f9 ('[PATCH] avr32 architecture')
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>

55f1cf83

avr32: fixup code style in unistd.h and syscall_table.S · 6ad4a21b

由 Hans-Christian Noren Egtvedt 提交于 5月 29, 2016

This patch swaps the mix of tabs and space for alignment of comment
after code to use spaces only.

Also document why recvmmsg was defined twice in the syscall_table.S
table, but only once in unistd.h. In short, wired in the table by
generic arch patch, but forgotten in unistd.h (review slip).

6ad4a21b

avr32: wire up preadv2 and pwritev2 syscalls · 389ce5a9

由 Hans-Christian Noren Egtvedt 提交于 5月 29, 2016

This patch wires up the new preadv2 and pwritev2 syscall on AVR32.

On AVR32, all parameters beyond the 5th are passed on the stack. System
calls don't use the stack -- they borrow a callee-saved register
instead. This means that syscalls that take 6 parameters must be called
through a stub that pushes the last parameter on the stack.
Signed-off-by: NHans-Christian Noren Egtvedt <egtvedt@samfundet.no>

389ce5a9

sparc64 mm: Fix base TSB sizing when hugetlb pages are used · af1b1a9b

由 Mike Kravetz 提交于 7月 15, 2016

do_sparc64_fault() calculates both the base and huge page RSS sizes and
uses this information in calls to tsb_grow(). The calculation for base
page TSB size is not correct if the task uses hugetlb pages. hugetlb
pages are not accounted for in RSS, therefore the call to get_mm_rss(mm)
does not include hugetlb pages. However, the number of pages based on
huge_pte_count (which does include hugetlb pages) is subtracted from
this value. This will result in an artificially small and often negative
RSS calculation. The base TSB size is then often set to max_tsb_size
as the passed RSS is unsigned, so a negative value looks really big.

THP pages are also accounted for in huge_pte_count, and THP pages are
accounted for in RSS so the calculation in do_sparc64_fault() is correct
if a task only uses THP pages.

A single huge_pte_count is not sufficient for TSB sizing if both hugetlb
and THP pages can be used. Instead of a single counter, use two: one
for hugetlb and one for THP.
Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af1b1a9b

arm64:acpi: fix the acpi alignment exception when 'mem=' specified · cb0a6502

由 Dennis Chen 提交于 7月 28, 2016

When booting an ACPI enabled kernel with 'mem=x', there is the
possibility that ACPI data regions from the firmware will lie above the
memory limit.  Ordinarily these will be removed by
memblock_enforce_memory_limit(.).

Unfortunately, this means that these regions will then be mapped by
acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
accessess will then provoke alignment faults.

In this patch we adopt memblock_mem_limit_remove_map instead, and this
preserves these ACPI data regions (marked NOMAP) thus ensuring that
these regions are not mapped as device memory.

For example, below is an alignment exception observed on ARM platform
when booting the kernel with 'acpi=on mem=8G':

  ...
  Unable to handle kernel paging request at virtual address ffff0000080521e7
  pgd = ffff000008aa0000
  [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
  Internal error: Oops: 96000021 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
  Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
  task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
  PC is at acpi_ns_lookup+0x520/0x734
  LR is at acpi_ns_lookup+0x4a4/0x734
  pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
  sp : ffff800001efb8b0
  x29: ffff800001efb8c0 x28: 000000000000001b
  x27: 0000000000000001 x26: 0000000000000000
  x25: ffff800001efb9e8 x24: ffff000008a10000
  x23: 0000000000000001 x22: 0000000000000001
  x21: ffff000008724000 x20: 000000000000001b
  x19: ffff0000080521e7 x18: 000000000000000d
  x17: 00000000000038ff x16: 0000000000000002
  x15: 0000000000000007 x14: 0000000000007fff
  x13: ffffff0000000000 x12: 0000000000000018
  x11: 000000001fffd200 x10: 00000000ffffff76
  x9 : 000000000000005f x8 : ffff000008725fa8
  x7 : ffff000008a8df70 x6 : ffff000008a8df70
  x5 : ffff000008a8d000 x4 : 0000000000000010
  x3 : 0000000000000010 x2 : 000000000000000c
  x1 : 0000000000000006 x0 : 0000000000000000
  ...
    acpi_ns_lookup+0x520/0x734
    acpi_ds_load1_begin_op+0x174/0x4fc
    acpi_ps_build_named_op+0xf8/0x220
    acpi_ps_create_op+0x208/0x33c
    acpi_ps_parse_loop+0x204/0x838
    acpi_ps_parse_aml+0x1bc/0x42c
    acpi_ns_one_complete_parse+0x1e8/0x22c
    acpi_ns_parse_table+0x8c/0x128
    acpi_ns_load_table+0xc0/0x1e8
    acpi_tb_load_namespace+0xf8/0x2e8
    acpi_load_tables+0x7c/0x110
    acpi_init+0x90/0x2c0
    do_one_initcall+0x38/0x12c
    kernel_init_freeable+0x148/0x1ec
    kernel_init+0x10/0xec
    ret_from_fork+0x10/0x40
  Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
  ---[ end trace 03381e5eb0a24de4 ]---
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

With 'efi=debug', we can see those ACPI regions loaded by firmware on
that board as:

  efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*

Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.comAcked-by: NSteve Capper <steve.capper@arm.com>
Signed-off-by: NDennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cb0a6502

mm: move most file-based accounting to the node · 11fb9989

由 Mel Gorman 提交于 7月 28, 2016

There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

11fb9989

mm: move page mapped accounting to the node · 50658e2e

由 Mel Gorman 提交于 7月 28, 2016

Reclaim makes decisions based on the number of pages that are mapped but
it's mixing node and zone information.  Account NR_FILE_MAPPED and
NR_ANON_PAGES pages on the node.

Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

50658e2e

mm, vmscan: move LRU lists to node · 599d0c95

由 Mel Gorman 提交于 7月 28, 2016

This moves the LRU lists from the zone to the node and related data such
as counters, tracing, congestion tracking and writeback tracking.

Unfortunately, due to reclaim and compaction retry logic, it is
necessary to account for the number of LRU pages on both zone and node
logic. Most reclaim logic is based on the node counters but the retry
logic uses the zone counters which do not distinguish inactive and
active sizes. It would be possible to leave the LRU counters on a
per-zone basis but it's a heavier calculation across multiple cache
lines that is much more frequent than the retry checks.

Other than the LRU counters, this is mostly a mechanical patch but note
that it introduces a number of anomalies. For example, the scans are
per-zone but using per-node counters. We also mark a node as congested
when a zone is congested. This causes weird problems that are fixed
later but is easier to review.

In the event that there is excessive overhead on 32-bit systems due to
the nodes being on LRU then there are two potential solutions

1. Long-term isolation of highmem pages when reclaim is lowmem

When pages are skipped, they are immediately added back onto the LRU
list. If lowmem reclaim persisted for long periods of time, the same
highmem pages get continually scanned. The idea would be that lowmem
keeps those pages on a separate list until a reclaim for highmem pages
arrives that splices the highmem pages back onto the LRU. It potentially
could be implemented similar to the UNEVICTABLE list.

That would reduce the skip rate with the potential corner case is that
highmem pages have to be scanned and reclaimed to free lowmem slab pages.

2. Linear scan lowmem pages if the initial LRU shrink fails

This will break LRU ordering but may be preferable and faster during
memory pressure than skipping LRU pages.

Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.netSigned-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

599d0c95

ARC: mm: don't loose PTE_SPECIAL in pte_modify() · 3925a16a

由 Vineet Gupta 提交于 7月 28, 2016

LTP madvise05 was generating mm splat

| [ARCLinux]# /sd/ltp/testcases/bin/madvise05
| BUG: Bad page map in process madvise05  pte:80e08211 pmd:9f7d4000
| page:9fdcfc90 count:1 mapcount:-1 mapping:  (null) index:0x0 flags: 0x404(referenced|reserved)
| page dumped because: bad pte
| addr:200b8000 vm_flags:00000070 anon_vma:  (null) mapping:  (null) index:1005c
| file:  (null) fault:  (null) mmap:  (null) readpage:  (null)
| CPU: 2 PID: 6707 Comm: madvise05

And for newer kernels, the system was rendered unusable afterwards.

The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
set to identify the special zero page wired to the pte).
When pte was finally unmapped, special casing for zero page was not
done, and instead it was treated as a "normal" page, tripping on the
map counts etc.

This fixes ARC STAR 9001053308

Cc: <stable@vger.kernel.org>
Signed-off-by: NVineet Gupta <vgupta@synopsys.com>

3925a16a

28 7月, 2016 4 次提交

sparc32: off by ones in BUG_ON() · fa160828

由 Dan Carpenter 提交于 7月 15, 2016

Smatch complains that these tests are off by one, which is true but not
life threatening.

	arch/sparc/kernel/irq_32.c:169 irq_link()
	error: buffer overflow 'irq_map' 384 <= 384
Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fa160828

powerpc/mm: Parenthesise IS_ENABLED() in if condition · fbef66f0

由 Stephen Rothwell 提交于 7月 28, 2016

Currently IS_ENABLED() produces an expression surrounded by parentheses,
which allows this code to compile, generating eg:

    else if (1 || 0)
        hpte_init_native();

However a change to the macro in the kbuild tree will break this in
future by removing the parentheses.

Fixes: 7353644f ("powerpc/mm: Fix build break when PPC_NATIVE=n")
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

fbef66f0

sparc: Don't leak context bits into thread->fault_address · 4f6deb8c

由 David S. Miller 提交于 7月 27, 2016

On pre-Niagara systems, we fetch the fault address on data TLB
exceptions from the TLB_TAG_ACCESS register.  But this register also
contains the context ID assosciated with the fault in the low 13 bits
of the register value.

This propagates into current_thread_info()->fault_address and can
cause trouble later on.

So clear the low 13-bits out of the TLB_TAG_ACCESS value in the cases
where it matters.
Reported-by: NMikulas Patocka <mpatocka@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f6deb8c

Disable "maybe-uninitialized" warning globally · 6e8d666e

由 Linus Torvalds 提交于 7月 27, 2016

Several build configurations had already disabled this warning because
it generates a lot of false positives. But some had not, and it was
still enabled for "allmodconfig" builds, for example.

Looking at the warnings produced, every single one I looked at was a
false positive, and the warnings are frequent enough (and big enough)
that they can easily hide real problems that you don't notice in the
noise generated by -Wmaybe-uninitialized.

The warning is good in theory, but this is a classic case of a warning
that causes more problems than the warning can solve.

If gcc gets better at avoiding false positives, we may be able to
re-enable this warning. But as is, we're better off without it, and I
want to be able to see the *real* warnings.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6e8d666e

27 7月, 2016 14 次提交

x86/asm, x86/microcode: Add __PAGE_OFFSET_BASE define on 32-bit · 4a1a8e1b

由 Borislav Petkov 提交于 7月 27, 2016

... in order to avoid #ifdeffery in code computing the ASLR randomization
offset. Remove that #ifdeffery in the microcode loader.
Suggested-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nicolai Stange <nicstange@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160727120939.GA18911@nazgul.tnicSigned-off-by: NIngo Molnar <mingo@kernel.org>

4a1a8e1b

arm64: arm: Fix-up the removal of the arm64 regs_query_register_name() prototype · fd6380b7

由 Catalin Marinas 提交于 7月 27, 2016

Commit 0a8ea52c ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API
feature") inadvertently removed the arch/arm prototype instead of the
arm64 one introduced by the original patch. There should not be any
bisection issues since this function is not called from anywhere else
(it could as well be removed from arch/arm at some point).

Fixes: 0a8ea52c ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

fd6380b7

arm64: Only select ARM64_MODULE_PLTS if MODULES=y · b9c220b5

由 Catalin Marinas 提交于 7月 26, 2016

Selecting CONFIG_RANDOMIZE_BASE=y and CONFIG_MODULES=n fails to build
the module PLTs support:

  CC      arch/arm64/kernel/module-plts.o
/work/Linux/linux-2.6-aarch64/arch/arm64/kernel/module-plts.c: In function ‘module_emit_plt_entry’:
/work/Linux/linux-2.6-aarch64/arch/arm64/kernel/module-plts.c:32:49: error: dereferencing pointer to incomplete type ‘struct module’

This patch selects ARM64_MODULE_PLTS conditionally only if MODULES is
enabled.

Fixes: f80fb3a3 ("arm64: add support for kernel ASLR")
Cc: <stable@vger.kernel.org> # 4.6+
Reported-by: NJeff Vander Stoep <jeffv@google.com>
Acked-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

b9c220b5

mm: do not pass mm_struct into handle_mm_fault · dcddffd4

由 Kirill A. Shutemov 提交于 7月 26, 2016

We always have vma->vm_mm around.

Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.comSigned-off-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

dcddffd4

arch: x86: charge page tables to kmemcg · 3e79ec7d

由 Vladimir Davydov 提交于 7月 26, 2016

Page tables can bite a relatively big chunk off system memory and their
allocations are easy to trigger from userspace, so they should be
accounted to kmemcg.

This patch marks page table allocations as __GFP_ACCOUNT for x86.  Note
we must not charge allocations of kernel page tables, because they can
be shared among processes from different cgroups so accounting them to a
particular one can pin other cgroups for indefinitely long.  So we clear
__GFP_ACCOUNT flag if a page table is allocated for the kernel.

Link: http://lkml.kernel.org/r/7d5c54f6a2bcbe76f03171689440003d87e6c742.1464079538.git.vdavydov@virtuozzo.comSigned-off-by: NVladimir Davydov <vdavydov@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3e79ec7d

mm/mmu_gather: track page size with mmu gather and force flush if page size change · e77b0852

由 Aneesh Kumar K.V 提交于 7月 26, 2016

This allows an arch which needs to do special handing with respect to
different page size when flushing tlb to implement the same in mmu
gather.

Link: http://lkml.kernel.org/r/1465049193-22197-3-git-send-email-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e77b0852

mm: change the interface for __tlb_remove_page() · e9d55e15

由 Aneesh Kumar K.V 提交于 7月 26, 2016

This updates the generic and arch specific implementation to return true
if we need to do a tlb flush. That means if a __tlb_remove_page
indicate a flush is needed, the page we try to remove need to be tracked
and added again after the flush. We need to track it because we have
already update the pte to none and we can't just loop back.

This change is done to enable us to do a tlb_flush when we try to flush
a range that consists of different page sizes. For architectures like
ppc64, we can do a range based tlb flush and we need to track page size
for that. When we try to remove a huge page, we will force a tlb flush
and starts a new mmu gather.

[aneesh.kumar@linux.vnet.ibm.com: mm-change-the-interface-for-__tlb_remove_page-v3]
Link: http://lkml.kernel.org/r/1465049193-22197-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1464860389-29019-2-git-send-email-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e9d55e15

powerpc/mm: check for irq disabled() only if DEBUG_VM is enabled · 9af3f56b

由 Aneesh Kumar K.V 提交于 7月 26, 2016

We don't need to check this always. The idea here is to capture the
wrong usage of find_linux_pte_or_hugepte and we can do that by
occasionally running with DEBUG_VM enabled.

Link: http://lkml.kernel.org/r/1464692688-6612-2-git-send-email-aneesh.kumar@linux.vnet.ibm.comSigned-off-by: NAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: NAnshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9af3f56b

m32r: add __ucmpdi2 to fix build failure · a44ce523

由 Sudip Mukherjee 提交于 7月 26, 2016

We are having build failure with m32r and the error message being:

  ERROR: "__ucmpdi2" [lib/842/842_decompress.ko] undefined!
  ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined!
  ERROR: "__ucmpdi2" [drivers/scsi/sd_mod.ko] undefined!
  ERROR: "__ucmpdi2" [drivers/media/i2c/adv7842.ko] undefined!
  ERROR: "__ucmpdi2" [drivers/md/bcache/bcache.ko] undefined!
  ERROR: "__ucmpdi2" [drivers/iio/imu/inv_mpu6050/inv-mpu6050.ko] undefined!

__ucmpdi2 is introduced to m32r architecture taking example from other
architectures like h8300, microblaze, mips.

Link: http://lkml.kernel.org/r/1465509213-4280-1-git-send-email-sudipm.mukherjee@gmail.comSigned-off-by: NSudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a44ce523

kbuild: abort build on bad stack protector flag · c965b105

由 Kees Cook 提交于 7月 26, 2016

Before, the stack protector flag was sanity checked before .config had
been reprocessed.  This meant the build couldn't be aborted early, and
only a warning could be emitted followed later by the compiler blowing
up with an unknown flag.  This has caused a lot of confusion over time,
so this splits the flag selection from sanity checking and performs the
sanity checking after the make has been restarted from a reprocessed
.config, so builds can be aborted as early as possible now.

Additionally moves the x86-specific sanity check to the same location,
since it suffered from the same warn-then-wait-for-compiler-failure
problem.

Link: http://lkml.kernel.org/r/20160712223043.GA11664@www.outflux.netSigned-off-by: NKees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c965b105

arm: get rid of superfluous __GFP_REPEAT · 397b080b

由 Michal Hocko 提交于 7月 26, 2016

__GFP_REPEAT has a rather weak semantic but since it has been introduced
around 2.6.12 it has been ignored for low order allocations.

PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this
flag is for more than order-2.  This means that this flag has never been
actually useful here because it has always been used only for
PAGE_ALLOC_COSTLY requests.

Link: http://lkml.kernel.org/r/1464599699-30131-5-git-send-email-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

397b080b

xtensa: Partially Revert "xtensa: Remove unnecessary of_platform_populate with default match table" · e973f4ec

由 Rob Herring 提交于 7月 26, 2016

This partially reverts commit 69d99e6c keeping only the main
purpose of the original commit which is the removal of
of_platform_populate() call. The moving of of_clk_init() caused changes
in the initialization order breaking booting.

Fixes: 69d99e6c ("xtensa: Remove unnecessary of_platform_populate with default match table")
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Tested-by: NMax Filippov <jcmvbkbc@gmail.com>
Signed-off-by: NRob Herring <robh@kernel.org>

e973f4ec

x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y · efaad554

由 Borislav Petkov 提交于 7月 26, 2016

CONFIG_RANDOMIZE_MEMORY=y randomizes the physical memmap and thus the
address where the initrd is located. Therefore, we need to add the
offset KASLR put us to in order to find the initrd again on the AP path.

In the future, we will get rid of the initrd address caching and query
the address on both the BSP and AP paths but that would need more work.

Thanks to Nicolai Stange for the good bisection and debugging work.
Reported-and-tested-by: NNicolai Stange <nicstange@gmail.com>
Signed-off-by: NBorislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160726095138.3470-1-bp@alien8.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

efaad554

xtensa: Fix build error due to missing include file · d02014b2

由 Guenter Roeck 提交于 7月 23, 2016

Commit 69d99e6c ("xtensa: Remove unnecessary of_platform_populate
with default match table") dropped various include files from
arch/xtensa/kernel/setup.c. This results in the following build error.

arch/xtensa/kernel/setup.c: In function ‘xtensa_dt_io_area’:
arch/xtensa/kernel/setup.c:213:2: error:
	implicit declaration of function ‘of_read_ulong’

Fixes: 69d99e6c ("xtensa: Remove unnecessary of_platform_populate with default match table")
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Acked-by: NMax Filippov <jcmvbkbc@gmail.com>
Signed-off-by: NRob Herring <robh@kernel.org>

d02014b2

26 7月, 2016 9 次提交

xen: add static initialization of steal_clock op to xen_time_ops · d34c30cc

由 Juergen Gross 提交于 7月 26, 2016

pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NDavid Vrabel <david.vrabel@citrix.com>

d34c30cc

dtb: xgene: Add MDIO node · 8e694cd2

由 Iyappan Subramanian 提交于 7月 25, 2016

Added mdio node for mdio driver.  Also added phy-handle
reference to the ethernet nodes.

Removed unused clock node from storm sgenet1.
Signed-off-by: NIyappan Subramanian <isubramanian@apm.com>
Tested-by: NFushen Chen <fchen@apm.com>
Tested-by: NToan Le <toanle@apm.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8e694cd2

powerpc: Improve comment explaining why we modify VRSAVE · dd570237

由 Anton Blanchard 提交于 5月 20, 2016

The comment explaining why we modify VRSAVE is misleading, glibc
does rely on the behaviour. Update the comment.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Reviewed-by: NCyril Bur <cyrilbur@gmail.com>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

dd570237

powerpc/mm: Drop unused externs for hpte_init_beat[_v3]() · 1a1cee84

由 Michael Ellerman 提交于 7月 25, 2016

We removed the BEAT support in 2015 in commit bf4981a0 ("powerpc:
Remove the celleb support"). These externs are unused since then.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

1a1cee84

powerpc/mm: Rename hpte_init_lpar() and move the fallback to a header · 6364e84e

由 Michael Ellerman 提交于 7月 26, 2016

hpte_init_lpar() is part of the pseries platform, so name it as such.

Move the fallback implementation for when PSERIES=n into the header,
dropping the weak implementation. The panic() is now handled by the
calling code.
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

6364e84e

powerpc/mm: Fix build break when PPC_NATIVE=n · 7353644f

由 Michael Ellerman 提交于 7月 25, 2016

The recent commit to rework the hash MMU setup broke the build when
CONFIG_PPC_NATIVE=n. Fix it by adding an IS_ENABLED() check before
calling hpte_init_native().

Removing the else clause opens the possibility that we don't set any
ops, which would probably lead to a strange crash later. So add a check
that we correctly initialised at least one member of the struct.

Fixes: 166dd7d3 ("powerpc/64: Move MMU backend selection out of platform code")
Reported-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NStephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

7353644f

arm64: mm: run pgtable_page_ctor() on non-swapper translation table pages · 1378dc3d

由 Ard Biesheuvel 提交于 7月 22, 2016

The kernel page table creation routines are accessible to other subsystems
(e.g., EFI) via the create_pgd_mapping() entry point, which allows mappings
to be created that are not covered by init_mm.

Since generic code such as apply_to_page_range() may expect translation
table pages that are not associated with init_mm to be covered by fully
constructed struct pages, add a call to pgtable_page_ctor() in the alloc
function used by create_pgd_mapping. Since it is no longer used by
create_mapping_late(), also update the name of this function to better
reflect its purpose.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NLaura Abbott <labbott@redhat.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

1378dc3d

arm64: mm: make create_mapping_late() non-allocating · 258a1605

由 Ard Biesheuvel 提交于 7月 22, 2016

The only purpose served by create_mapping_late() is to remap the already
mapped .text and .rodata kernel segments with read-only permissions. Since
we no longer allow block mappings to be split or merged,
create_mapping_late() should not pass an allocation function pointer into
__create_pgd_mapping(). So pass NULL instead.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: NLaura Abbott <labbott@redhat.com>
Tested-by: NSudeep Holla <sudeep.holla@arm.com>
Acked-by: NMark Rutland <mark.rutland@arm.com>
Signed-off-by: NCatalin Marinas <catalin.marinas@arm.com>

258a1605

tile: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO · cdf8b463

由 James Hogan 提交于 7月 25, 2016

AT_VECTOR_SIZE_ARCH should be defined with the maximum number of
NEW_AUX_ENT entries that ARCH_DLINFO can contain, but it wasn't defined
for tile at all even though ARCH_DLINFO will contain one NEW_AUX_ENT for
the VDSO address.

This shouldn't be a problem as AT_VECTOR_SIZE_BASE includes space for
AT_BASE_PLATFORM which tile doesn't use, but lets define it now and add
the comment above ARCH_DLINFO as found in several other architectures to
remind future modifiers of ARCH_DLINFO to keep AT_VECTOR_SIZE_ARCH up to
date.

Fixes: 4a556f4f ("tile: implement gettimeofday() via vDSO")
Signed-off-by: NJames Hogan <james.hogan@imgtec.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: NChris Metcalf <cmetcalf@mellanox.com>

cdf8b463

25 7月, 2016 2 次提交

x86: Make the vdso2c compiler use the host architecture headers · d51306f1

由 Stephen Rothwell 提交于 7月 23, 2016

To be clear: this is a ppc64le hosted, x86_64 target cross build.
Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
Acked-by: NAndy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160723150845.3af8e452@canb.auug.org.auSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

d51306f1

MIPS: ath79: Add missing include file · 20ff3ada

由 Guenter Roeck 提交于 7月 23, 2016

Commit ddd0ce87 ("mips: Remove unnecessary of_platform_populate with
default match table") dropped the include of linux/clk-provider.h from
arch/mips/ath79/setup.c. This results in the following build error.

arch/mips/ath79/setup.c: In function 'ath79_of_plat_time_init':
arch/mips/ath79/setup.c:232:2: error:
	implicit declaration of function 'of_clk_init'

Fixes: ddd0ce87 ("mips: Remove unnecessary of_platform_populate with default match table")
Cc: Rob Herring <robh@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NRob Herring <robh@kernel.org>

20ff3ada

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功