提交 · aa069a996951f3e2e38437ef0316685a5893fc7e · openeuler / Kernel

09 10月, 2018 2 次提交

KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization · aa069a99

由 Paul Mackerras 提交于 9月 21, 2018

With this, userspace can enable a KVM-HV guest to run nested guests
under it.

The administrator can control whether any nested guests can be run;
setting the "nested" module parameter to false prevents any guests
becoming nested hypervisors (that is, any attempt to enable the nested
capability on a guest will fail).  Guests which are already nested
hypervisors will continue to be so.
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>

aa069a99

KVM: PPC: Book3S HV: Add one-reg interface to virtual PTCR register · 30323418

由 Paul Mackerras 提交于 10月 08, 2018

This adds a one-reg register identifier which can be used to read and
set the virtual PTCR for the guest. This register identifies the
address and size of the virtual partition table for the guest, which
contains information about the nested guests under this guest.

Migrating this value is the only extra requirement for migrating a
guest which has nested guests (assuming of course that the destination
host supports nested virtualization in the kvm-hv module).
Reviewed-by: NDavid Gibson <david@gibson.dropbear.id.au>
Signed-off-by: NPaul Mackerras <paulus@ozlabs.org>
Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>

30323418

28 9月, 2018 1 次提交

s390: doc: detailed specifications for AP virtualization · 492a6be1

由 Tony Krowiak 提交于 9月 25, 2018

This patch provides documentation describing the AP architecture and
design concepts behind the virtualization of AP devices. It also
includes an example of how to configure AP devices for exclusive
use of KVM guests.
Signed-off-by: NTony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: NHalil Pasic <pasic@linux.ibm.com>
Message-Id: <20180925231641.4954-27-akrowiak@linux.vnet.ibm.com>
Signed-off-by: NChristian Borntraeger <borntraeger@de.ibm.com>

492a6be1

20 9月, 2018 1 次提交

KVM: x86: Control guest reads of MSR_PLATFORM_INFO · 6fbbde9a

由 Drew Schmitt 提交于 8月 20, 2018

Add KVM_CAP_MSR_PLATFORM_INFO so that userspace can disable guest access
to reads of MSR_PLATFORM_INFO.

Disabling access to reads of this MSR gives userspace the control to "expose"
this platform-dependent information to guests in a clear way. As it exists
today, guests that read this MSR would get unpopulated information if userspace
hadn't already set it (and prior to this patch series, only the CPUID faulting
information could have been populated). This existing interface could be
confusing if guests don't handle the potential for incorrect/incomplete
information gracefully (e.g. zero reported for base frequency).
Signed-off-by: NDrew Schmitt <dasch@google.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

6fbbde9a

17 9月, 2018 2 次提交

ARM: dts: at91: add new compatibility string for macb on sama5d3 · 321cc359

由 Nicolas Ferre 提交于 9月 14, 2018

We need this new compatibility string as we experienced different behavior
for this 10/100Mbits/s macb interface on this particular SoC.
Backward compatibility is preserved as we keep the alternative strings.
Signed-off-by: NNicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

321cc359

Code of Conduct: Let's revamp it. · 8a104f8b

由 Greg Kroah-Hartman 提交于 9月 15, 2018

The Code of Conflict is not achieving its implicit goal of fostering
civility and the spirit of 'be excellent to each other'.  Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel.

Here is a Code of Conduct statement for the wider kernel.  It is based
on the Contributor Covenant as described at www.contributor-covenant.org

From this point forward, we should abide by these rules in order to help
make the kernel community a welcoming environment to participate in.
Signed-off-by: NChris Mason <clm@fb.com>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>
Signed-off-by: NOlof Johansson <olof@lxom.net>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8a104f8b

14 9月, 2018 1 次提交

xen/balloon: add runtime control for scrubbing ballooned out pages · 197ecb38

由 Marek Marczykowski-Górecki 提交于 9月 07, 2018

Scrubbing pages on initial balloon down can take some time, especially
in nested virtualization case (nested EPT is slow). When HVM/PVH guest is
started with memory= significantly lower than maxmem=, all the extra
pages will be scrubbed before returning to Xen. But since most of them
weren't used at all at that point, Xen needs to populate them first
(from populate-on-demand pool). In nested virt case (Xen inside KVM)
this slows down the guest boot by 15-30s with just 1.5GB needed to be
returned to Xen.

Add runtime parameter to enable/disable it, to allow initially disabling
scrubbing, then enable it back during boot (for example in initramfs).
Such usage relies on assumption that a) most pages ballooned out during
initial boot weren't used at all, and b) even if they were, very few
secrets are in the guest at that time (before any serious userspace
kicks in).
Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
enabled by default), controlling default value for the new runtime
switch.
Signed-off-by: NMarek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>

197ecb38

12 9月, 2018 1 次提交

KVM: s390: Make huge pages unavailable in ucontrol VMs · 40ebdb8e

由 Janosch Frank 提交于 8月 01, 2018

We currently do not notify all gmaps when using gmap_pmdp_xchg(), due
to locking constraints. This makes ucontrol VMs, which is the only VM
type that creates multiple gmaps, incompatible with huge pages. Also
we would need to hold the guest_table_lock of all gmaps that have this
vmaddr maped to synchronize access to the pmd.

ucontrol VMs are rather exotic and creating a new locking concept is
no easy task. Hence we return EINVAL when trying to active
KVM_CAP_S390_HPAGE_1M and report it as being not available when
checking for it.

Fixes: a4499382 ("KVM: s390: Add huge page enablement control")
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>
Reviewed-by: NDavid Hildenbrand <david@redhat.com>
Reviewed-by: NClaudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20180801112508.138159-1-frankja@linux.ibm.com>
Signed-off-by: NJanosch Frank <frankja@linux.ibm.com>

40ebdb8e

10 9月, 2018 1 次提交

x86/doc: Fix Documentation/x86/earlyprintk.txt · 07e846ba

由 Randy Dunlap 提交于 8月 05, 2018

Fix a few issues in Documentation/x86/earlyprintk.txt:

- correct typos, punctuation, missing word, wrong word
- change product name from Netchip to NetChip
- expand where to add "earlyprintk=dbg"
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Link: http://lkml.kernel.org/r/d0c40ac3-7659-6374-dbda-23d3d2577f30@infradead.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

07e846ba

07 9月, 2018 1 次提交

dm raid: bump target version, update comments and documentation · 5380c05b

由 Heinz Mauelshagen 提交于 9月 06, 2018

Bump target version to reflect the documented fixes are available.
Also fix some code comments (typos and clarity).
Signed-off-by: NHeinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: NMike Snitzer <snitzer@redhat.com>

5380c05b

03 9月, 2018 3 次提交

dt-bindings: imx-lpi2c: Remove mx8dv compatible entry · f6eb8934

由 Fabio Estevam 提交于 8月 31, 2018

mx8dv never entered into production and there is no other place
in the kernel referring to this SoC, so remove it from the
dt bindings documentation.
Signed-off-by: NFabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>

f6eb8934

dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle · 10d7fac4

由 Tony Lindgren 提交于 8月 29, 2018

The current cpsw usage for cpsw-phy-sel is undocumented but is used for
all the boards using cpsw. And cpsw-phy-sel is not really a child of
the cpsw device, it lives in the system control module instead.

Let's document the existing usage, and improve it a bit where we prefer
to use a phandle instead of a child device for it. That way we can
properly describe the hardware in dts files for things like genpd.

Cc: devicetree@vger.kernel.org
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: NTony Lindgren <tony@atomide.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

10d7fac4

kconfig: do not require pkg-config on make {menu,n}config · fd65465b

由 Masahiro Yamada 提交于 8月 31, 2018

Meelis Roos reported a {menu,n}config regression:
 "I have libncurses devel package installed in the default system
  location (as do 99%+ on actual developers probably) and in this
  case, pkg-config is useless.  pkg-config is needed only when
  libraries and headers are installed in non-default locations but
  it is bad to require installation of pkg-config on all the machines
  where make menuconfig would be possibly run."

For {menu,n}config, do not use pkg-config if it is not installed.
For {g,x}config, keep checking pkg-config since we really rely on it
for finding the installation paths of the required packages.

Fixes: 4ab3b801 ("kconfig: check for pkg-config on make {menu,n,g,x}config")
Reported-by: NMeelis Roos <mroos@linux.ee>
Signed-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: NMeelis Roos <mroos@linux.ee>
Tested-by: NRandy Dunlap <rdunlap@infradead.org>

fd65465b

02 9月, 2018 1 次提交

random: make CPU trust a boot parameter · 9b254366

由 Kees Cook 提交于 8月 27, 2018

Instead of forcing a distro or other system builder to choose
at build time whether the CPU is trusted for CRNG seeding via
CONFIG_RANDOM_TRUST_CPU, provide a boot-time parameter for end users to
control the choice. The CONFIG will set the default state instead.
Signed-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NTheodore Ts'o <tytso@mit.edu>

9b254366

31 8月, 2018 1 次提交

i2c: refactor function to release a DMA safe buffer · 82fe39a6

由 Wolfram Sang 提交于 8月 24, 2018

a) rename to 'put' instead of 'release' to match 'get' when obtaining
   the buffer
b) change the argument order to have the buffer as first argument
c) add a new argument telling the function if the message was
   transferred. This allows the function to be used also in cases
   where setting up DMA failed, so the buffer needs to be freed without
   syncing to the message buffer.

Also convert the only user.
Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: NNiklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>

82fe39a6

30 8月, 2018 4 次提交

vfs: add the fadvise() file operation · 45cd0faa

由 Amir Goldstein 提交于 8月 27, 2018

This is going to be used by overlayfs and possibly useful
for other filesystems.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

45cd0faa

Documentation/filesystems: update documentation of file_operations · 17ef445f

由 Amir Goldstein 提交于 8月 27, 2018

...to kernel 4.18.
Signed-off-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NMiklos Szeredi <mszeredi@redhat.com>

17ef445f

sh_eth: Add R7S9210 support · 6e0bb04d

由 Chris Brandt 提交于 8月 27, 2018

Add support for the R7S9210 which is part of the RZ/A2 series.
Signed-off-by: NChris Brandt <chris.brandt@renesas.com>
Acked-by: NRob Herring <robh@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6e0bb04d

dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support · 74081c9f

由 Fabrizio Castro 提交于 8月 14, 2018

RZ/G2M (R8A774A1) watchdog implementation is compatible with R-Car
Gen3, therefore add relevant documentation.
Signed-off-by: NFabrizio Castro <fabrizio.castro@bp.renesas.com>
Reviewed-by: NBiju Das <biju.das@bp.renesas.com>
Reviewed-by: NRob Herring <robh@kernel.org>
Reviewed-by: NSimon Horman <horms+renesas@verge.net.au>
Reviewed-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NWim Van Sebroeck <wim@linux-watchdog.org>

74081c9f

29 8月, 2018 3 次提交

Documentation/arm64/sve: Couple of improvements and typos · afce0cc9

由 Julien Grall 提交于 8月 14, 2018

  - Fix mismatch between SVE registers (Z) and FPSIMD register (V)
  - Don't prefix the path for [3] with Linux to stay consistent with
    [1] and [2].
Signed-off-by: NJulien Grall <julien.grall@arm.com>
Signed-off-by: NWill Deacon <will.deacon@arm.com>

afce0cc9

xen: export device state to sysfs · 076e2ced

由 Joe Jin 提交于 8月 28, 2018

Export device state to sysfs to allow for easier get device state.
Signed-off-by: NJoe Jin <joe.jin@oracle.com>
Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com>

076e2ced

dt-bindings: riscv,cpu-intc: Cleanups from a missed review · 11f65ad1

由 Palmer Dabbelt 提交于 8月 20, 2018

I managed to miss one of Rob's code reviews on the mailing list
<http://lists.infradead.org/pipermail/linux-riscv/2018-August/001139.html>.
The patch has already been merged, so I'm submitting a fixup.

Sorry!

Fixes: b67bc7cb ("dt-bindings: interrupt-controller: RISC-V local interrupt controller")
Cc: Rob Herring <robh@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Karsten Merker <merker@debian.org>
Signed-off-by: NPalmer Dabbelt <palmer@sifive.com>

11f65ad1

28 8月, 2018 2 次提交

scsi: documentation: add scsi_mod.use_blk_mq to scsi-parameters · a7ccd92c

由 John Pittman 提交于 8月 23, 2018

Kernel line argument scsi_mod.use_blk_mq is missing from file
Documentation/scsi/scsi-parameters.txt.  Add this option, providing
mention of config setting and format.

[mkp: clarified where to look]
Signed-off-by: NJohn Pittman <jpittman@redhat.com>
Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>

a7ccd92c

xen/blkback: don't keep persistent grants too long · 973e5405

由 Juergen Gross 提交于 8月 13, 2018

Persistent grants are allocated until a threshold per ring is being
reached. Those grants won't be freed until the ring is being destroyed
meaning there will be resources kept busy which might no longer be
used.

Instead of freeing only persistent grants until the threshold is
reached add a timestamp and remove all persistent grants not having
been in use for a minute.
Signed-off-by: NJuergen Gross <jgross@suse.com>
Reviewed-by: NRoger Pau Monné <roger.pau@citrix.com>
Signed-off-by: NKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>

973e5405

27 8月, 2018 1 次提交

hwmon: (ina2xx) fix sysfs shunt resistor read access · 3ad86700

由 Lothar Felten 提交于 8月 14, 2018

fix the sysfs shunt resistor read access: return the shunt resistor
value, not the calibration register contents.

update email address
Signed-off-by: NLothar Felten <lothar.felten@gmail.com>
Signed-off-by: NGuenter Roeck <linux@roeck-us.net>

3ad86700

24 8月, 2018 9 次提交

arm64: dts: Fix various entry-method properties to reflect documentation · e9880240

由 Amit Kucheria 提交于 8月 23, 2018

The idle-states binding documentation[1] mentions that the
'entry-method' property is required on 64-bit platforms and must be
set to "psci".

commit a13f18f5 ("Documentation: arm: Fix typo in the idle-states
bindings examples") attempted to fix this earlier but clearly more is
needed.

Fix the cpu-capacity.txt documentation that uses the incorrect value so
we don't get copy-paste errors like these. Clarify the language in
idle-states.txt by removing the reference to the psci bindings that
might be causing this confusion.

Finally, fix devicetrees of various boards to reflect current
documentation.

[1] Documentation/devicetree/bindings/arm/idle-states.txt (see
idle-states node)
Signed-off-by: NAmit Kucheria <amit.kucheria@linaro.org>
Acked-by: NSudeep Holla <sudeep.holla@arm.com>
Acked-by: NLi Yang <leoyang.li@nxp.com>
Signed-off-by: NOlof Johansson <olof@lixom.net>

e9880240

i2c: ocores: update my email address · 5d3a01a2

由 Peter Korsgaard 提交于 8月 22, 2018

The old @sunsite.dk address is no longer active, so update the references.
Signed-off-by: NPeter Korsgaard <peter@korsgaard.com>
Signed-off-by: NWolfram Sang <wsa@the-dreams.de>

5d3a01a2

treewide: convert ISO_8859-1 text comments to utf-8 · 3723c632

由 Arnd Bergmann 提交于 8月 23, 2018

Almost all files in the kernel are either plain text or UTF-8 encoded.  A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.deSigned-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: Simon Horman <horms@verge.net.au>			[IPVS portion]
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>	[IIO]
Acked-by: Michael Ellerman <mpe@ellerman.id.au>			[powerpc]
Acked-by: NRob Herring <robh@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3723c632

docs/core-api: mm-api: add section about GFP flags · 038a07a5

由 Mike Rapoport 提交于 8月 23, 2018

Link: http://lkml.kernel.org/r/1532626360-16650-8-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

038a07a5

docs/core-api: split memory management API to a separate file · 41f35b39

由 Mike Rapoport 提交于 8月 23, 2018

This is basically copy-paste of the memory management section from
kernel-api.rst with some minor adjustments:

* The "User Space Memory Access" is moved to the beginning
* The get_user_pages_fast reference is now a part of "User Space Memory
  Access"
* And, of course, headings are adjusted with section being promoted to
  chapters

Link: http://lkml.kernel.org/r/1532626360-16650-6-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

41f35b39

docs/core-api: move *{str,mem}dup* to "String Manipulation" · 15956176

由 Mike Rapoport 提交于 8月 23, 2018

The string and memory duplication routines fit better to the "String
Manipulation" section than to "The SLAB Cache".

Link: http://lkml.kernel.org/r/1532626360-16650-5-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

15956176

docs/core-api: kill trailing whitespace in kernel-api.rst · 7463f650

由 Mike Rapoport 提交于 8月 23, 2018

Link: http://lkml.kernel.org/r/1532626360-16650-4-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7463f650

treewide: correct "differenciate" and "instanciate" typos · 3cc97bea

由 Finn Thain 提交于 8月 23, 2018

Also add these typos to spelling.txt so checkpatch.pl will look for them.

Link: http://lkml.kernel.org/r/88af06b9de34d870cb0afc46cfd24e0458be2575.1529471371.git.fthain@telegraphics.com.auSigned-off-by: NFinn Thain <fthain@telegraphics.com.au>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3cc97bea

namei: allow restricted O_CREAT of FIFOs and regular files · 30aba665

由 Salvatore Mesoraca 提交于 8月 23, 2018

Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag.  The purpose
is to make data spoofing attacks harder.  This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection.  This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.

This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:

CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489

This list is not meant to be complete.  It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector.  In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.

[s.mesoraca16@gmail.com: fix bug reported by Dan Carpenter]
  Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda
  Link: http://lkml.kernel.org/r/1524829819-11275-1-git-send-email-s.mesoraca16@gmail.com
[keescook@chromium.org: drop pr_warn_ratelimited() in favor of audit changes in the future]
[keescook@chromium.org: adjust commit subjet]
Link: http://lkml.kernel.org/r/20180416175918.GA13494@beastSigned-off-by: NSalvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: NKees Cook <keescook@chromium.org>
Suggested-by: NSolar Designer <solar@openwall.com>
Suggested-by: NKees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

30aba665

23 8月, 2018 6 次提交

ipc: reorganize initialization of kern_ipc_perm.seq · e2652ae6

由 Manfred Spraul 提交于 8月 21, 2018

ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
(within ipc_idr_alloc()).

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
may see an uninitialized value.

The patch moves the initialization of kern_ipc_perm.seq before the calls
of idr_alloc().

Notes:
1) This patch has a user space visible side effect:
If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
if semget()/msgget()/shmget() fails in the final step of adding the id
to the rhash tree, then .._next_id is cleared. Before the patch, is
remained unmodified.

There is no change of the behavior after a successful ..get() call: It
always clears .._next_id, there is no impact to non checkpoint/restore
code as that code does not use .._next_id.

2) The patch correctly documents that after a call to ipc_idr_alloc(),
the full tear-down sequence must be used. The callers of ipc_addid()
do not fullfill that, i.e. more bugfixes are required.

The patch is a squash of a patch from Dmitry and my own changes.

Link: http://lkml.kernel.org/r/20180712185241.4017-3-manfred@colorfullife.com
Reported-by: syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com
Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e2652ae6

kernel/hung_task.c: allow to set checking interval separately from timeout · a2e51445

由 Dmitry Vyukov 提交于 8月 21, 2018

Currently task hung checking interval is equal to timeout, as the result
hung is detected anywhere between timeout and 2*timeout.  This is fine for
most interactive environments, but this hurts automated testing setups
(syzbot).  In an automated setup we need to strictly order CPU lockup <
RCU stall < workqueue lockup < task hung < silent loss, so that RCU stall
is not detected as task hung and task hung is not detected as silent
machine loss.  The large variance in task hung detection timeout requires
setting silent machine loss timeout to a very large value (e.g.  if task
hung is 3 mins, then silent loss need to be set to ~7 mins).  The
additional 3 minutes significantly reduce testing efficiency because
usually we crash kernel within a minute, and this can add hours to bug
localization process as it needs to do dozens of tests.

Allow setting checking interval separately from timeout.  This allows to
set timeout to, say, 3 minutes, but checking interval to 10 secs.

The interval is controlled via a new hung_task_check_interval_secs sysctl,
similar to the existing hung_task_timeout_secs sysctl.  The default value
of 0 results in the current behavior: checking interval is equal to
timeout.

[akpm@linux-foundation.org: update hung_task_timeout_max's comment]
Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.comSigned-off-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a2e51445

/proc/meminfo: add percpu populated pages count · 7e8a6304

由 Dennis Zhou (Facebook) 提交于 8月 21, 2018

Currently, percpu memory only exposes allocation and utilization
information via debugfs.  This more or less is only really useful for
understanding the fragmentation and allocation information at a per-chunk
level with a few global counters.  This is also gated behind a config.
BPF and cgroup, for example, have seen an increase in use causing
increased use of percpu memory.  Let's make it easier for someone to
identify how much memory is being used.

This patch adds the "Percpu" stat to meminfo to more easily look up how
much percpu memory is in use.  This number includes the cost for all
allocated backing pages and not just insight at the per a unit, per chunk
level.  Metadata is excluded.  I think excluding metadata is fair because
the backing memory scales with the numbere of cpus and can quickly
outweigh the metadata.  It also makes this calculation light.

Link: http://lkml.kernel.org/r/20180807184723.74919-1-dennisszhou@gmail.comSigned-off-by: NDennis Zhou <dennisszhou@gmail.com>
Acked-by: NTejun Heo <tj@kernel.org>
Acked-by: NRoman Gushchin <guro@fb.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Acked-by: NDavid Rientjes <rientjes@google.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7e8a6304

mm, oom: introduce memory.oom.group · 3d8b38eb

由 Roman Gushchin 提交于 8月 21, 2018

For some workloads an intervention from the OOM killer can be painful.
Killing a random task can bring the workload into an inconsistent state.

Historically, there are two common solutions for this
problem:
1) enabling panic_on_oom,
2) using a userspace daemon to monitor OOMs and kill
   all outstanding processes.

Both approaches have their downsides: rebooting on each OOM is an obvious
waste of capacity, and handling all in userspace is tricky and requires a
userspace agent, which will monitor all cgroups for OOMs.

In most cases an in-kernel after-OOM cleaning-up mechanism can eliminate
the necessity of enabling panic_on_oom.  Also, it can simplify the cgroup
management for userspace applications.

This commit introduces a new knob for cgroup v2 memory controller:
memory.oom.group.  The knob determines whether the cgroup should be
treated as an indivisible workload by the OOM killer.  If set, all tasks
belonging to the cgroup or to its descendants (if the memory cgroup is not
a leaf cgroup) are killed together or not at all.

To determine which cgroup has to be killed, we do traverse the cgroup
hierarchy from the victim task's cgroup up to the OOMing cgroup (or root)
and looking for the highest-level cgroup with memory.oom.group set.

Tasks with the OOM protection (oom_score_adj set to -1000) are treated as
an exception and are never killed.

This patch doesn't change the OOM victim selection algorithm.

Link: http://lkml.kernel.org/r/20180802003201.817-4-guro@fb.comSigned-off-by: NRoman Gushchin <guro@fb.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

3d8b38eb

Documentation/sysctl/vm.txt: update __vm_enough_memory()'s path · 85f237a5

由 juviliu 提交于 8月 21, 2018

__vm_enough_memory has moved to mm/util.c.

Link: http://lkml.kernel.org/r/E18EDF4A4FA4A04BBFA824B6D7699E532A7E5913@EXMBX-SZMAIL013.tencent.comSigned-off-by: NJuvi Liu <juviliu@tencent.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

85f237a5

mm: clarify CONFIG_PAGE_POISONING and usage · 8c9a134c

由 Kees Cook 提交于 8月 21, 2018

The Kconfig text for CONFIG_PAGE_POISONING doesn't mention that it has to
be enabled explicitly. This updates the documentation for that and adds a
note about CONFIG_PAGE_POISONING to the "page_poison" command line docs.
While here, change description of CONFIG_PAGE_POISONING_ZERO too, as it's
not "random" data, but rather the fixed debugging value that would be used
when not zeroing. Additionally removes a stray "bool" in the Kconfig.

Link: http://lkml.kernel.org/r/20180725223832.GA43733@beastSigned-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8c9a134c

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功