Commits · 69101b203590adb9a39ecad7b6789cca5eed2638 · openanolis / cloud-kernel

20 Feb, 2016 1 commit

ARM: dts: am43x-epos-evm: Add the am438 compatible string · 69101b20

Authored by Keerthy on Feb 19, 2016

The SoCs on am43x-epos-evm are named am438x.
Hence add the compatibility string and remove the am4372 string.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>

24 Jan, 2016 2 commits

dt/bindings: Add bindings for PIC32/MZDA platforms · 9b9c2cd4

Authored by Joshua Henderson on Jan 13, 2016

This adds support for the Microchip PIC32 platform along with the
specific variant PIC32MZDA on a PIC32MZDA Starter Kit.
Signed-off-by: Joshua Henderson <joshua.henderson@microchip.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12096/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

dt/bindings: Add bindings for PIC32 interrupt controller · edf2194d

Authored by Cristian Birsan on Jan 13, 2016

Document the devicetree bindings for the interrupt controller on
Microchip PIC32 class devices.
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com>
Signed-off-by: Joshua Henderson <joshua.henderson@microchip.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12093/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

21 Jan, 2016 7 commits

mm: memcontrol: basic memory statistics in cgroup2 memory controller · 587d9f72

Authored by Johannes Weiner on Jan 20, 2016

Provide a cgroup2 memory.stat that provides statistics on LRU memory
and fault event counters. More consumers and breakdowns will follow.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation: cgroup: add memory.swap.{current,max} description · 3e24b19d

Authored by Vladimir Davydov on Jan 20, 2016

The rationale for a separate swap counter is given by Johannes Weiner.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcontrol: allow to disable kmem accounting for cgroup2 · 04823c83

Authored by Vladimir Davydov on Jan 20, 2016

Kmem accounting might incur overhead that some users can't put up with.
Besides, the implementation is still considered unstable.  So let's
provide a way to disable it for those users who aren't happy with it.

To disable kmem accounting for cgroup2, pass cgroup.memory=nokmem at
boot time.
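
As a rough illustration of how one might check whether a boot command line carries this option, the Python sketch below scans a command-line string for cgroup.memory= arguments. The helper name cgroup_memory_options and the sample command line are hypothetical, and treating the value as a comma-separated token list is an assumption, not something stated in this commit message.

```python
def cgroup_memory_options(cmdline):
    """Collect the tokens of every cgroup.memory= argument in a command line.

    Assumption: the value is a comma-separated list of tokens such as
    "nokmem"; this mirrors common kernel parameter style but is not
    confirmed by the commit message.
    """
    options = set()
    for arg in cmdline.split():
        key, sep, value = arg.partition("=")
        if sep and key == "cgroup.memory":
            options.update(tok for tok in value.split(",") if tok)
    return options


# Hypothetical boot command line used only for illustration.
example = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro cgroup.memory=nokmem quiet"
print("nokmem" in cgroup_memory_options(example))
```

On a running system the same helper could be fed the contents of /proc/cmdline.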
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dma-mapping: always provide the dma_map_ops based implementation · e1c7e324

Authored by Christoph Hellwig on Jan 20, 2016

Move the generic implementation to <linux/dma-mapping.h> now that all
architectures support it and remove the HAVE_DMA_ATTR Kconfig symbol now
that everyone supports them.

[valentinrothberg@gmail.com: remove leftovers in Kconfig]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

UBSAN: run-time undefined behavior sanity checker · c6d30853

Authored by Andrey Ryabinin on Jan 20, 2016

UBSAN uses compile-time instrumentation to catch undefined behavior
(UB).  The compiler inserts code that performs certain kinds of checks
before operations that could cause UB.  If a check fails (i.e.  UB is
detected), a __ubsan_handle_* function is called to print an error message.

So most of the work is done by the compiler.  This patch just implements
the ubsan handlers that print the errors.

GCC has had this capability since 4.9.x [1] (see the -fsanitize=undefined
option and its suboptions).
However, GCC 5.x has more checkers implemented [2].
Article [3] has a bit more detail about UBSAN in GCC.

[1] - https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Debugging-Options.html
[2] - https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html
[3] - http://developerblog.redhat.com/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

Issues which UBSAN has found thus far are:

Found bugs:

 * out-of-bounds access - 97840cb6 ("netfilter: nfnetlink: fix
   insufficient validation in nfnetlink_bind")

undefined shifts:

 * d48458d4 ("jbd2: use a better hash function for the revoke
   table")

 * 10632008 ("clockevents: Prevent shift out of bounds")

 * 'x << -1' shift in ext4 -
   http://lkml.kernel.org/r/<5444EF21.8020501@samsung.com>

 * undefined rol32(0) -
   http://lkml.kernel.org/r/<1449198241-20654-1-git-send-email-sasha.levin@oracle.com>

 * undefined dirty_ratelimit calculation -
   http://lkml.kernel.org/r/<566594E2.3050306@odin.com>

 * undefined rounddown_pow_of_two(0) -
   http://lkml.kernel.org/r/<1449156616-11474-1-git-send-email-sasha.levin@oracle.com>

 * [WONTFIX] undefined shift in __bpf_prog_run -
   http://lkml.kernel.org/r/<CACT4Y+ZxoR3UjLgcNdUm4fECLMx2VdtfrENMtRRCdgHB2n0bJA@mail.gmail.com>

   WONTFIX here because it should be fixed in bpf program, not in kernel.

signed overflows:

 * 32a8df4e ("sched: Fix odd values in effective_load()
   calculations")

 * mul overflow in ntp -
   http://lkml.kernel.org/r/<1449175608-1146-1-git-send-email-sasha.levin@oracle.com>

 * incorrect conversion into rtc_time in rtc_time64_to_tm() -
   http://lkml.kernel.org/r/<1449187944-11730-1-git-send-email-sasha.levin@oracle.com>

 * unvalidated timespec in io_getevents() -
   http://lkml.kernel.org/r/<CACT4Y+bBxVYLQ6LtOKrKtnLthqLHcw-BMp3aqP3mjdAvr9FULQ@mail.gmail.com>

 * [NOTABUG] signed overflow in ktime_add_safe() -
   http://lkml.kernel.org/r/<CACT4Y+aJ4muRnWxsUe1CMnA6P8nooO33kwG-c8YZg=0Xc8rJqw@mail.gmail.com>

[akpm@linux-foundation.org: fix unused local warning]
[akpm@linux-foundation.org: fix __int128 build woes]
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Yury Gribov <y.gribov@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sysctl: enable strict writes · 41662f5c

Authored by Kees Cook on Jan 20, 2016

SYSCTL_WRITES_WARN was added in commit f4aacea2 ("sysctl: allow for
strict write position handling"), and released in v3.16 in August of
2014.  Since then I can find only 1 instance of non-zero offset
writing[1], and it was fixed immediately in CRIU[2].  As such, it
appears safe to flip this to the strict state now.

[1] https://www.google.com/search?q="when%20file%20position%20was%20not%200"
[2] http://lists.openvz.org/pipermail/criu/2015-April/019819.html
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Documentation/filesystems/vfat.txt: update the limitation for fat fallocate · 28016128

Authored by Namjae Jeon on Jan 20, 2016

Update the limitation for fat fallocate.
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

20 Jan, 2016 4 commits

clk: h8300: Remove "sh73a0-" part from compatible value · c4eb32b1

Authored by Geert Uytterhoeven on Jun 29, 2015

Drop the bogus "sh73a0-" part (accidentally copied from shmobile?) from
the compatible value.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>

pipe: limit the per-user amount of pages allocated in pipes · 759c0114

Authored by Willy Tarreau on Jan 18, 2016

On not-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limits are controlled by two new sysctls: pipe-user-pages-soft and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).
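
The arithmetic in the paragraph above can be reproduced with a short Python sketch. The constant and function names are hypothetical; the figures (1024 FDs per process, 64 kB default pipes, 4 kB single-page pipes, 256 processes) are the ones quoted in the message, and the sketch follows the message's simplification of counting one pipe per FD.

```python
KB = 1024
MB = 1024 * KB

PAGE_SIZE = 4 * KB           # single-page pipe once the soft limit is exceeded
DEFAULT_PIPE_SIZE = 64 * KB  # default pipe size quoted in the message
FDS_PER_PROCESS = 1024       # default number of FDs per process
PROCESSES = 256              # example process count from the message


def worst_case_pipe_bytes(processes, fds_per_process, full_size_pipes):
    """Memory used when only `full_size_pipes` pipes get the default size
    and every remaining pipe is limited to a single page."""
    total_pipes = processes * fds_per_process
    small_pipes = total_pipes - full_size_pipes
    return full_size_pipes * DEFAULT_PIPE_SIZE + small_pipes * PAGE_SIZE


# The soft limit is reached after 1024 default-sized pipes.
soft_limit_bytes = FDS_PER_PROCESS * DEFAULT_PIPE_SIZE
total_bytes = worst_case_pipe_bytes(PROCESSES, FDS_PER_PROCESS, FDS_PER_PROCESS)

print(soft_limit_bytes // MB, "MB before pipes shrink to one page")
print(total_bytes // MB, "MB allocated for the user in this scenario")
```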

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

MIPS: Add IEEE Std 754 conformance mode selection · 503943e0

Authored by Maciej W. Rozycki on Nov 13, 2015

Add an `ieee754=' kernel parameter to control IEEE Std 754 conformance
mode.

Use separate flags copied from the respective CPU feature flags, and
adjusted according to the conformance mode selected, to make binaries
requesting individual NaN encoding modes accepted or rejected as needed.
Update the initial setting for FCSR and, in the full FPU emulation mode,
its read-only mask accordingly.  Accept the mode selection requested for
legacy processors as well.

As with the EF_MIPS_NAN2008 ELF file header flag adjust both ABS2008 and
NAN2008 bits at the same time, to match the choice made for hardware
currently implemented.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Fortune <Matthew.Fortune@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11481/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Documentation: DT: net: add docs for ralink/mediatek SoC ethernet binding · 663148e4

Authored by John Crispin on Jan 03, 2016

Add three files. ralink,rt2880-net.txt describes the actual frame engine
and the other two describe the switch frontend bindings.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Michael Lee <igvtee@gmail.com>
Cc: devicetree@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-mediatek@lists.infradead.org
Cc: John Crispin <blogic@openwrt.org>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Michael Lee <igvtee@gmail.com>
Cc: steven.liu@mediatek.com
Cc: Fred.Chang@mediatek.com
Patchwork: https://patchwork.linux-mips.org/patch/11970/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

18 Jan, 2016 1 commit

dmaengine: rcar-dmac: Document SoC specific bindings · 6bf64103

Authored by Simon Horman on Nov 12, 2015

In general Renesas hardware is not documented to the extent where the
relationship between IP blocks on different SoCs can be assumed although
they may appear to operate the same way. Furthermore the documentation
typically does not specify a version for individual IP blocks. For these
reasons a convention of using the SoC name in place of a version and
providing SoC-specific compat strings has been adopted.

Although not universally liked this convention is used in the bindings for
most drivers for Renesas hardware. The purpose of this patch is to
update the Renesas R-Car DMA Controller driver to follow this convention.

Cc: devicetree@vger.kernel.org
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>

17 Jan, 2016 1 commit

printk-formats.txt: remove unimplemented %pT · 64c734be

Authored by Rasmus Villemoes on Jan 15, 2016

%pT for task->comm has been proposed (several times, I think), but is
not actually implemented.  Remove it from printk-formats.txt and add it
back if/when it gets implemented.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

16 Jan, 2016 3 commits

thp: update documentation · a46e6376

Authored by Kirill A. Shutemov on Jan 15, 2016

The patch updates Documentation/vm/transhuge.txt to reflect changes in
THP design.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, thp: remove infrastructure for handling splitting PMDs · 4b471e88

Authored by Kirill A. Shutemov on Jan 15, 2016

With new refcounting we don't need to mark PMDs splitting.  Let's drop
code to handle this.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dts: hisi: fixes no syscon fault when init mdio · b70ce2ab

Authored by yankejian on Jan 13, 2016

When Linux starts up, we get the log below:
"Hi-HNS_MDIO 803c0000.mdio: no syscon hisilicon,peri-c-subctrl
mdio_bus mdio@803c0000: mdio sys ctl reg has not maped"

The driver source code treats the subctrl as a syscon, but the dts does not.
This causes the fault above, so this patch adds the syscon info to the dts
files to fix it.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jan, 2016 10 commits

Documentation/filesystems: describe the shared memory usage/accounting · 0bc126d4

Authored by Rodrigo Freire on Jan 14, 2016

The Shared Memory accounting support is present in Kernel since commit
4b02108a ("mm: oom analysis: add shmem vmstat") and in userland
free(1) since 2014.  This patch updates the Documentation to reflect
this change.
Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: memcontrol: account socket memory in unified hierarchy memory controller · f7e1cb6e

Authored by Johannes Weiner on Jan 14, 2016

Socket memory can be a significant share of overall memory consumed by
common workloads. In order to provide reasonable resource isolation in
the unified hierarchy, this type of memory needs to be included in the
tracking/accounting of a cgroup under active memory resource control.

Overhead is only incurred when a non-root control group is created AND
the memory controller is instructed to track and account the memory
footprint of that group. cgroup.memory=nosocket can be specified on the
boot commandline to override any runtime configuration and forcibly
exclude socket memory from active memory resource control.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: mmap: add new /proc tunable for mmap_base ASLR · d07e2259

Authored by Daniel Cashman on Jan 14, 2016

Address Space Layout Randomization (ASLR) provides a barrier to
exploitation of user-space processes in the presence of security
vulnerabilities by making it more difficult to find desired code/data
which could help an attack.  This is done by adding a random offset to
the location of regions in the process address space, with a greater
range of potential offset values corresponding to better protection/a
larger search-space for brute force, but also to greater potential for
fragmentation.

The offset added to the mmap_base address, which provides the basis for
the majority of the mappings for a process, is set once on process exec
in arch_pick_mmap_layout() and is done via hard-coded per-arch values,
which reflect, hopefully, the best compromise for all systems.  The
trade-off between increased entropy in the offset value generation and
the corresponding increased variability in address space fragmentation
is not absolute, however, and some platforms may tolerate higher amounts
of entropy.  This patch introduces both new Kconfig values and a sysctl
interface which may be used to change the amount of entropy used for
offset generation on a system.

The direct motivation for this change was in response to the
libstagefright vulnerabilities that affected Android, specifically to
information provided by Google's project zero at:

  http://googleprojectzero.blogspot.com/2015/09/stagefrightened.html

The attack presented therein, by Google's project zero, specifically
targeted the limited randomness used to generate the offset added to the
mmap_base address in order to craft a brute-force-based attack.
Concretely, the attack was against the mediaserver process, which was
limited to respawning every 5 seconds, on an arm device.  The hard-coded
8 bits used resulted in an average expected success rate of defeating
the mmap ASLR after just over 10 minutes (128 tries at 5 seconds a
piece).  With this patch, and an accompanying increase in the entropy
value to 16 bits, the same attack would take an average expected time of
over 45 hours (32768 tries), which makes it both less feasible and more
likely to be noticed.
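
The timing estimate above follows from simple arithmetic, sketched here in Python. The function name is hypothetical; the model (on average half of the 2^bits possible offsets must be tried, one attempt per 5-second respawn) is the one implied by the figures in the message.

```python
def expected_brute_force_seconds(entropy_bits, seconds_per_try=5):
    """Average time to guess a random offset with `entropy_bits` of entropy.

    On average half of the 2**entropy_bits candidates are tried, and each
    try costs one respawn interval.
    """
    expected_tries = 2 ** entropy_bits // 2
    return expected_tries * seconds_per_try


for bits in (8, 16):
    seconds = expected_brute_force_seconds(bits)
    print(f"{bits} bits: {seconds / 60:.1f} minutes = {seconds / 3600:.1f} hours")
```

With 8 bits this gives 128 tries, a little over 10 minutes; with 16 bits it gives 32768 tries, a little over 45 hours.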

The introduced Kconfig and sysctl options are limited by per-arch
minimum and maximum values, the minimum of which was chosen to match the
current hard-coded value and the maximum of which was chosen so as to
give the greatest flexibility without generating an invalid mmap_base
address, generally 3-4 bits less than the number of bits in the
user-space accessible virtual address space.

When deciding whether or not to change the default value, a system
developer should consider that mmap_base address could be placed
anywhere up to 2^(value) bits away from the non-randomized location,
which would introduce variable-sized areas above and below the mmap_base
address such that the maximum vm_area_struct size may be reduced,
preventing very large allocations.

This patch (of 4):

ASLR only uses as few as 8 bits to generate the random offset for the
mmap base address on 32 bit architectures.  This value was chosen to
prevent a poorly chosen value from dividing the address space in such a
way as to prevent large allocations.  This may not be an issue on all
platforms.  Allow the specification of a minimum number of bits so that
platforms desiring greater ASLR protection may determine where to place
the trade-off.
Signed-off-by: Daniel Cashman <dcashman@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Jeff Vander Stoep <jeffv@google.com>
Cc: Nick Kralevich <nnk@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hector Marco-Gisbert <hecmargi@upv.es>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, procfs: breakdown RSS for anon, shmem and file in /proc/pid/status · 8cee852e

Authored by Jerome Marchand on Jan 14, 2016

There are several shortcomings with the accounting of shared memory
(SysV shm, shared anonymous mapping, mapping of a tmpfs file).  The
values in /proc/<pid>/status and <...>/statm don't allow one to distinguish
between shmem memory and a shared mapping to a regular file, even though
their implications on memory usage are quite different: during reclaim,
file mapping can be dropped or written back on disk, while shmem needs a
place in swap.

Also, to distinguish the memory occupied by anonymous and file mappings,
one has to read the /proc/pid/statm file, which has a field for the file
mappings (again, including shmem) and total memory occupied by these
mappings (i.e.  equivalent to VmRSS in the <...>/status file).  Getting
the value for anonymous mappings only is thus not exactly user-friendly
(the statm file is intended to be rather efficiently machine-readable).

To address both of these shortcomings, this patch adds a breakdown of
VmRSS in /proc/<pid>/status via new fields RssAnon, RssFile and
RssShmem, making use of the previous preparatory patch.  These fields
tell the user the memory occupied by private anonymous pages, mapped
regular files and shmem, respectively.  Other existing fields in /status
and /statm files are left without change.  The /statm file can be
extended in the future, if there's a need for that.

Example (part of) /proc/pid/status output including the new Rss* fields:

VmPeak:  2001008 kB
VmSize:  2001004 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:      5108 kB
VmRSS:      5108 kB
RssAnon:              92 kB
RssFile:            1324 kB
RssShmem:           3692 kB
VmData:      192 kB
VmStk:       136 kB
VmExe:         4 kB
VmLib:      1784 kB
VmPTE:      3928 kB
VmPMD:        20 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
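
As a rough sketch of how the new fields can be consumed, the Python snippet below parses a few lines copied from the example output above and checks that the three Rss* fields add up to VmRSS. The helper name parse_status_kb is hypothetical, and on a live system the counters should be treated as a snapshot rather than an exact identity.

```python
# Lines taken from the example /proc/pid/status output in the message.
SAMPLE = """\
VmHWM:      5108 kB
VmRSS:      5108 kB
RssAnon:              92 kB
RssFile:            1324 kB
RssShmem:           3692 kB
"""


def parse_status_kb(text):
    """Return {field: value_in_kB} for every 'Name:  <number> kB' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name] = int(parts[0])
    return fields


fields = parse_status_kb(SAMPLE)
breakdown = fields["RssAnon"] + fields["RssFile"] + fields["RssShmem"]
print(breakdown, fields["VmRSS"])
```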

[vbabka@suse.cz: forward-porting, tweak changelog]
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, proc: account for shmem swap in /proc/pid/smaps · c261e7d9

Authored by Vlastimil Babka on Jan 14, 2016

Currently, /proc/pid/smaps will always show "Swap: 0 kB" for
shmem-backed mappings, even if the mapped portion does contain pages
that were swapped out.  This is because unlike private anonymous
mappings, shmem does not change pte to swap entry, but pte_none when
swapping the page out.  In the smaps page walk, such page thus looks
like it was never faulted in.

This patch changes smaps_pte_entry() to determine the swap status for
such pte_none entries for shmem mappings, similarly to how
mincore_page() does it.  Swapped out shmem pages are thus accounted for.
For private mappings of tmpfs files that COWed some of the pages, the
swapped-out status of the original shmem pages is naturally ignored.  If some
of the private copies were also swapped out, they are accounted via their
page table swap entries, so the resulting reported swap usage is then a
sum of both swapped out private copies, and swapped out shmem pages that
were not COWed.  No double accounting can thus happen.
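
From user space, the per-mapping "Swap:" values described above can be totalled with a small Python sketch like the one below. The function name total_swap_kb and the sample smaps text (addresses, sizes, and the /dev/shm/file path) are made up for illustration; only the "Swap: X kB" line format comes from the message.

```python
import re

# Matches "Swap:   <n> kB" but not "SwapPss: ...", which has a different key.
SWAP_LINE = re.compile(r"^Swap:\s+(\d+)\s+kB$")


def total_swap_kb(smaps_text):
    """Sum the 'Swap:' lines over all mappings in smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        match = SWAP_LINE.match(line.strip())
        if match:
            total += int(match.group(1))
    return total


# Hypothetical excerpt: one shmem-backed mapping and one private mapping.
SAMPLE = """\
7f0000000000-7f0000200000 rw-s 00000000 00:14 12345 /dev/shm/file
Size:               2048 kB
Rss:                 512 kB
Swap:                256 kB
SwapPss:               0 kB
7f0000200000-7f0000400000 rw-p 00000000 00:00 0
Size:               2048 kB
Swap:                 64 kB
"""

print(total_swap_kb(SAMPLE))
```

For a real process the text would come from reading /proc/<pid>/smaps, which requires permission to inspect that process.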

The accounting is arguably still not as precise as for private anonymous
mappings, since now we will count also pages that the process in
question never accessed, but another process populated them and then let
them become swapped out.  I believe it is still less confusing and
subtle than not showing any swap usage by shmem mappings at all.
The swapped-out counter might be of interest to users who would like to prevent
future swapins during performance-critical operation and pre-fault
them at their convenience.  Especially for larger swapped out regions
the cost of swapin is much higher than a fresh page allocation.  So a
differentiation between pte_none vs.  swapped out is important for those
usecases.

One downside of this patch is that it makes /proc/pid/smaps more
expensive for shmem mappings, as we consult the radix tree for each
pte_none entry, so the overall complexity is O(n*log(n)).  I have
measured this on a process that creates a 2GB mapping and dirties single
pages with a stride of 2MB, and timed how long it takes to cat
/proc/pid/smaps of this process 100 times.

Private anonymous mapping:

real    0m0.949s
user    0m0.116s
sys     0m0.348s

Mapping of a /dev/shm/file:

real    0m3.831s
user    0m0.180s
sys     0m3.212s

The difference is rather substantial, so the next patch will reduce the
cost for shared or read-only mappings.

In a less controlled experiment, I've gathered pids of processes on my
desktop that have either '/dev/shm/*' or 'SYSV*' in smaps.  This
included the Chrome browser and some KDE processes.  Again, I've run cat
/proc/pid/smaps on each 100 times.

Before this patch:

real    0m9.050s
user    0m0.518s
sys     0m8.066s

After this patch:

real    0m9.221s
user    0m0.541s
sys     0m8.187s

This suggests low impact on average systems.

Note that this patch doesn't attempt to adjust the SwapPss field for
shmem mappings, which would need extra work to determine who else could
have the pages mapped.  Thus the value stays zero except for COWed
swapped out pages in a shmem mapping, which are accounted as usual.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, documentation: clarify /proc/pid/status VmSwap limitations for shmem · bf9683d6

Authored by Vlastimil Babka on Jan 14, 2016

This series is based on Jerome Marchand's [1] so let me quote the first
paragraph from there:

There are several shortcomings with the accounting of shared memory
(sysV shm, shared anonymous mapping, mapping to a tmpfs file).  The
values in /proc/<pid>/status and statm don't allow to distinguish
between shmem memory and a shared mapping to a regular file, even though
their implications on memory usage are quite different: at reclaim, file
mapping can be dropped or written back on disk while shmem needs a place
in swap.  As for shmem pages that are swapped-out or in swap cache, they
aren't accounted at all.

The original motivation for myself is that a customer found (IMHO
rightfully) confusing that e.g.  top output for process swap usage is
unreliable with respect to swapped out shmem pages, which are not
accounted for.

The fundamental difference between private anonymous and shmem pages is
that the latter has PTE's converted to pte_none, and not swapents.  As
such, they are not accounted to the number of swapents visible e.g.  in
/proc/pid/status VmSwap row.  It might be theoretically possible to use
swapents when swapping out shmem (without extra cost, as one has to
change all mappers anyway), and on swap in only convert the swapent for
the faulting process, leaving swapents in other processes until they
also fault (so again no extra cost).  But I don't know how many
assumptions this would break, and it would be too disruptive a change for
a relatively small benefit.

Instead, my approach is to document the limitation of VmSwap, and
provide means to determine the swap usage for shmem areas for those who
are interested and willing to pay the price, using /proc/pid/smaps.
Because outside of ipcs, I don't think it's currently possible to
determine the usage at all.  The previous patchset [1] did introduce new
shmem-specific fields into smaps output, and functions to determine the
values.  I take a simpler approach, noting that smaps output already has
a "Swap: X kB" line, where currently X == 0 always for shmem areas.  I
think we can just consider this a bug and provide the proper value by
consulting the radix tree, as e.g.  mincore_page() does.  In the patch
changelog I explain why this is also not perfect (and cannot be without
swapents), but still arguably much better than showing a 0.

The last two patches are adapted from Jerome's patchset and provide a
VmRSS breakdown to RssAnon, RssFile and RssShm in /proc/pid/status.
Hugh noted that this is a welcome addition, and I agree that it might
help e.g.  debugging process memory usage at an albeit non-zero, but still
rather low cost of an extra per-mm counter and some page flag checks.

[1] http://lwn.net/Articles/611966/

This patch (of 6):

The documentation for /proc/pid/status does not mention that the value
of VmSwap counts only swapped out anonymous private pages, and not
swapped out pages of the underlying shmem objects (for shmem mappings).
This is not obvious, so document this limitation.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bf9683d6
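
As an illustration of the smaps-based approach described above, the following is a minimal sketch that totals the per-mapping "Swap: X kB" lines of /proc/pid/smaps and reads VmSwap from /proc/pid/status for comparison. The helper names sum_smaps_field and read_status_field are made up for this example and are not part of the patch series. On kernels without this series, the Swap lines for shmem mappings are expected to read 0.

```python
import re


def sum_smaps_field(smaps_text, field="Swap"):
    """Sum one per-mapping field (reported in kB) over smaps-formatted text."""
    # Anchor on the exact field name so "Swap:" does not also match "SwapPss:".
    pattern = re.compile(
        rf"^{re.escape(field)}:\s+(\d+)\s+kB[ \t]*$", re.MULTILINE
    )
    return sum(int(match.group(1)) for match in pattern.finditer(smaps_text))


def read_status_field(status_text, field="VmSwap"):
    """Return one kB-valued field from status-formatted text, or None if absent."""
    match = re.search(
        rf"^{re.escape(field)}:\s+(\d+)\s+kB[ \t]*$", status_text, re.MULTILINE
    )
    return int(match.group(1)) if match else None
```

Passing the contents of /proc/<pid>/smaps to sum_smaps_field and the contents of /proc/<pid>/status to read_status_field gives the two numbers to compare. A smaps total larger than VmSwap would be consistent with swapped-out shmem pages that VmSwap does not count.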

Make sure that highmem pages are not added to symlink page cache · e8ecde25

Committed by Al Viro on Jan 14, 2016

inode_nohighmem() is sufficient to make sure that page_get_link()
won't try to allocate a highmem page.  Moreover, it is sufficient
to make sure that page_symlink/__page_symlink won't do the same
thing.  However, any filesystem that manually preseeds the symlink's
page cache upon symlink(2) needs to make sure that the page it
inserts there won't be a highmem one.

Fortunately, only nfs and shmem have run afoul of that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

e8ecde25

thermal: add description for integral_cutoff unit · ec3fc58b

Committed by Leo Yan on Jan 13, 2016

Add a more explicit description of the unit of integral_cutoff, which is
used by the power allocator governor.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

ec3fc58b

Documentation: update libhugetlbfs site url · ceec86ec

Committed by SeongJae Park on Jan 13, 2016

The site for libhugetlbfs has moved from sourceforge to github. This
commit updates the old url.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

ceec86ec

Documentation: Explain pci=conf1,conf2 more verbosely · afd8c084

Committed by Borislav Petkov on Jan 13, 2016

People complained that setting the PCI config space access mechanism
through "pci=conf1" or "pci=conf2" on the command line is not really
documented. Yeah, can you blame them? Look at what we have now.

So try to improve the situation a bit by explaining what those "conf1"
and "conf2" things actually mean.

See http://wiki.osdev.org/PCI for more info.
Suggested-by: Eric Morton <Eric.Morton@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
[jc: Added the above URL to the document too]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

afd8c084
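
As background for what "conf1" refers to, here is a small sketch of how PCI configuration mechanism #1 forms the 32-bit value written to the CONFIG_ADDRESS I/O port (0xCF8) before the register is accessed through CONFIG_DATA (0xCFC). This is general PCI background rather than text from the patch, and the function name conf1_address is invented for this example. The sketch only computes the address and performs no port I/O.

```python
CONFIG_ADDRESS_PORT = 0xCF8  # I/O port that receives the address word
CONFIG_DATA_PORT = 0xCFC     # I/O port through which the register is accessed


def conf1_address(bus, device, function, offset):
    """Build the CONFIG_ADDRESS word for PCI configuration mechanism #1."""
    if not 0 <= bus <= 0xFF:
        raise ValueError("bus must be in 0..255")
    if not 0 <= device <= 0x1F:
        raise ValueError("device must be in 0..31")
    if not 0 <= function <= 0x7:
        raise ValueError("function must be in 0..7")
    if not 0 <= offset <= 0xFF:
        raise ValueError("offset must be in 0..255")
    # Bit 31 enables the access; the register offset is dword-aligned.
    return (
        0x80000000
        | (bus << 16)
        | (device << 11)
        | (function << 8)
        | (offset & 0xFC)
    )
```

For example, conf1_address(0, 0, 0, 0) selects the vendor/device ID register of bus 0, device 0, function 0.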

14 Jan, 2016 4 commits

mfd: arizona: Add device tree binding documentation for new clock driver · 0b819951

Committed by Charles Keepax on Jan 08, 2016

Specify the device tree binding for the input clocks to Arizona devices.
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

0b819951

dt-bindings: regulator/mfd: Reorganize S2MPA01 bindings · 5d1d147f

Committed by Krzysztof Kozlowski on Dec 04, 2015

The mfd/s2mpa01.txt duplicates some of the information about bindings
with old mfd/s2mps11.txt. Now common part exists entirely in
mfd/samsung,sec-core.txt so:
 - add company prefix to file name (regulator/samsung,s2mpa01.txt),
 - remove duplicated information,
 - reorganize the contents to match style of
   regulator/samsung,s2mps11.txt.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

5d1d147f

dt-bindings: regulator/mfd: Reorganize S5M8767 bindings · 27383ca9

Committed by Krzysztof Kozlowski on Dec 04, 2015

The regulator/s5m8767-regulator.txt duplicates some of the information
about bindings with old mfd/s2mps11.txt. Now common part exists entirely
in mfd/samsung,sec-core.txt so:
 - add company prefix to file name (regulator/samsung,s5m8767.txt),
 - remove duplicated information,
 - reorganize the contents to match style of
   regulator/samsung,s2mps11.txt.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

27383ca9

dt-bindings: regulator/clock/mfd: Reorganize S2MPS-family bindings · a13c7c51

Committed by Krzysztof Kozlowski on Dec 04, 2015

Bindings for Samsung S2M and S5M family PMICs are in a mess. They are
spread over different files and subdirectories in an inconsistent way.
The devices and respective drivers for them share a lot in common so
everything could be organized in a more readable way.

Reorganize the S2MPS11/13/14/15 Device Tree bindings to match the
drivers for this family of devices:
 - move mfd/s2mps11.txt to mfd/samsung,sec-core.txt for the main MFD
   driver (common for entire family),
 - split clock block to clock/samsung,s2mps11.txt,
 - split regulator block to regulator/samsung,s2mps11.txt.
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Michael Turquette <mturquette@baylibre.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>

a13c7c51

13 Jan, 2016 2 commits

Input: gpio-keys - allow setting input device name in DT · c4dc5f8c

Committed by Laxman Dewangan on Jan 12, 2016

Allow specifying the name of the input device via a device tree property.
This helps userspace code to get the name and perform proper event to key
mapping in some cases (for example, on Android).
Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

c4dc5f8c

asm-generic: implement virt_xxx memory barriers · 6a65d263

Committed by Michael S. Tsirkin on Dec 27, 2015

Guests running within virtual machines might be affected by SMP effects even if
the guest itself is compiled without SMP support.  This is an artifact of
interfacing with an SMP host while running an UP kernel.  Using mandatory
barriers for this use-case would be possible but is often suboptimal.

In particular, virtio uses a bunch of confusing ifdefs to work around
this, while xen just uses the mandatory barriers.

To better handle this case, low-level virt_mb() etc macros are made available.
These are implemented trivially using the low-level __smp_xxx macros,
the purpose of these wrappers is to annotate those specific cases.

These have the same effect as smp_mb() etc when SMP is enabled, but generate
identical code for SMP and non-SMP systems. For example, virtual machine guests
should use virt_mb() rather than smp_mb() when synchronizing against a
(possibly SMP) host.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

6a65d263

12 Jan, 2016 5 commits

cgroup: rename cgroup documentations · 6255c46f

Committed by Tejun Heo on Jan 11, 2016

cgroup-legacy may be too loaded.  Rename the docs so that they're
postfixed with v1 and v2.

* s/cgroup-legacy/cgroup-v1/
* s/cgroup.txt/cgroup-v2.txt/
Signed-off-by: Tejun Heo <tj@kernel.org>

6255c46f

DMA-API: fix confusing sentence in Documentation/DMA-API.txt · 000afe89

Committed by Masahiro Yamada on Jan 04, 2016

Change the phrase "handed off to the driver" to "handed off to the
device" as in the paragraph below.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

000afe89

Documentation: translations: update linux cross reference link · feb4d888

Committed by SeongJae Park on Jan 01, 2016

The old link to source code cross reference does not work now. Though
the link was updated by commit 1d12554f
("Documentation: HOWTO: update code cross reference link"), a few
obsolete links remain. This commit updates them.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

feb4d888

Documentation: fix typo in CodingStyle · 9a2885e6

Committed by Manuel Pégourié-Gonnard on Dec 28, 2015

Simple typo: "it" for "is".
Signed-off-by: Manuel Pégourié-Gonnard <mpg@elzevir.fr>
Cc: Trivial Patch Monkey <trivial@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

9a2885e6

f2fs: detect idle time depending on user behavior · d0239e1b

Committed by Jaegeuk Kim on Jan 08, 2016

This patch records the last time that the user requested filesystem operations.
This information is later used to detect whether the system is idle or not.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

d0239e1b
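
The idea in this commit message, remembering when the user last issued a request and treating the system as idle once enough time has passed, can be sketched roughly as follows. This is a hypothetical illustration only: the class IdleTracker, its methods, and the idle_interval_s parameter are assumptions made for this example and do not reflect the actual f2fs code.

```python
import time


class IdleTracker:
    """Track the time of the last user request and report idleness."""

    def __init__(self, idle_interval_s, clock=time.monotonic):
        self._interval = idle_interval_s  # hypothetical idle threshold, in seconds
        self._clock = clock               # injectable clock, handy for testing
        self._last_request = clock()

    def note_request(self):
        """Record that a user-requested operation has just happened."""
        self._last_request = self._clock()

    def is_idle(self):
        """Return True once the interval has elapsed since the last request."""
        return self._clock() - self._last_request >= self._interval
```

Background work could check is_idle() before running, and every user-facing operation would call note_request().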
