提交 · 828cad8ea05d194d8a9452e0793261c2024c23a2 · OpenHarmony / kernel_linux

16 2月, 2017 2 次提交

genirq/msi: Add stubs for get_cached_msi_msg/pci_write_msi_msg · 2f44e29c

由 Arnd Bergmann 提交于 2月 14, 2017

A bug fix to the MSIx handling in vfio added references to functions
that may not be defined if MSI is disabled in the kernel, resulting in
this link error:

drivers/built-in.o: In function `vfio_msi_set_vector_signal':
:(.text+0x450808): undefined reference to `get_cached_msi_msg'
:(.text+0x45080c): undefined reference to `write_msi_msg'

As suggested by Alex Williamson, add stub implementations for
get_cached_msi_msg() and pci_write_msi_msg().

In case this bugfix gets backported, please note that the #ifdef
has changed over time, originally both functions were implemented
in drivers/pci/msi.c and controlled by CONFIG_PCI_MSI, while nowadays
get_cached_msi_msg() is part of the generic MSI support and can be
used without PCI.

Fixes: b8f02af0 ("vfio/pci: Restore MSIx message prior to enabling")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>
Link: http://lkml.kernel.org/r/1413190208.4202.34.camel@ul30vt.home
Link: http://lkml.kernel.org/r/20170214215343.3307861-1-arnd@arndb.deSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2f44e29c

rhashtable: Revert nested table changes. · bf3f14d6

由 David S. Miller 提交于 2月 15, 2017

This reverts commits:

6a254780
9dbbfb0a
40137906

It's too risky to put in this late in the release
cycle.  We'll put these changes into the next merge
window instead.
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

bf3f14d6

14 2月, 2017 1 次提交

rhashtable: Add nested tables · 40137906

由 Herbert Xu 提交于 2月 11, 2017

This patch adds code that handles GFP_ATOMIC kmalloc failure on
insertion.  As we cannot use vmalloc, we solve it by making our
hash table nested.  That is, we allocate single pages at each level
and reach our desired table size by nesting them.

When a nested table is created, only a single page is allocated
at the top-level.  Lower levels are allocated on demand during
insertion.  Therefore for each insertion to succeed, only two
(non-consecutive) pages are needed.

After a nested table is created, a rehash will be scheduled in
order to switch to a vmalloced table as soon as possible.  Also,
the rehash code will never rehash into a nested table.  If we
detect a nested table during a rehash, the rehash will be aborted
and a new rehash will be scheduled.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40137906

13 2月, 2017 1 次提交

bpf: introduce BPF_F_ALLOW_OVERRIDE flag · 7f677633

由 Alexei Starovoitov 提交于 2月 10, 2017

If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
to the given cgroup the descendent cgroup will be able to override
effective bpf program that was inherited from this cgroup.
By default it's not passed, therefore override is disallowed.

Examples:
1.
prog X attached to /A with default
prog Y fails to attach to /A/B and /A/B/C
Everything under /A runs prog X

2.
prog X attached to /A with allow_override.
prog Y fails to attach to /A/B with default (non-override)
prog M attached to /A/B with allow_override.
Everything under /A/B runs prog M only.

3.
prog X attached to /A with allow_override.
prog Y fails to attach to /A with default.
The user has to detach first to switch the mode.

In the future this behavior may be extended with a chain of
non-overridable programs.

Also fix the bug where detach from cgroup where nothing is attached
was not throwing error. Return ENOENT in such case.

Add several testcases and adjust libbpf.

Fixes: 30070984 ("cgroup: add support for eBPF programs")
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NTejun Heo <tj@kernel.org>
Acked-by: NDaniel Mack <daniel@zonque.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7f677633

11 2月, 2017 1 次提交

mtd: name the mtd device with an optional label property · 28309572

由 Cédric Le Goater 提交于 2月 09, 2017

This can be used to easily identify a specific chip on a system with
multiple chips.
Suggested-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: NCédric Le Goater <clg@kaod.org>
Reviewed-by: NMarek Vasut <marek.vasut@gmail.com>
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>

28309572

10 2月, 2017 6 次提交

irqdesc: Add a resource managed version of irq_alloc_descs() · 2b5e7730

由 Bartosz Golaszewski 提交于 2月 10, 2017

Add a devres flavor of __devm_irq_alloc_descs() and corresponding
helper macros.
Signed-off-by: NBartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-doc@vger.kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/1486729403-21132-1-git-send-email-bgolaszewski@baylibre.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

2b5e7730

mtd: spi-nor: rename SPINOR_OP_* macros of the 4-byte address op codes · 902cc69a

由 Cyrille Pitchen 提交于 10月 27, 2016

This patch renames the SPINOR_OP_* macros of the 4-byte address
instruction set so the new names all share a common pattern: the 4-byte
address name is built from the 3-byte address name appending the "_4B"
suffix.

The patch also introduces new op codes to support other SPI protocols such
as SPI 1-4-4 and SPI 1-2-2.

This is a transitional patch and will help a later patch of spi-nor.c
to automate the translation from the 3-byte address op codes into their
4-byte address version.
Signed-off-by: NCyrille Pitchen <cyrille.pitchen@atmel.com>
Acked-by: NMark Brown <broonie@kernel.org>
Acked-by: NMarek Vasut <marek.vasut@gmail.com>

902cc69a

mtd: spi-nor: Add support for S3AN spi-nor devices · e99ca98f

由 Ricardo Ribalda 提交于 12月 02, 2016

Xilinx Spartan-3AN FPGAs contain an In-System Flash where they keep
their configuration data and (optionally) some user data.

The protocol of this flash follows most of the spi-nor standard. With
the following differences:

- Page size might not be a power of two.
- The address calculation (default addressing mode).
- The spi nor commands used.

Protocol is described on Xilinx User Guide UG333
Signed-off-by: NRicardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Reviewed-by: NMarek Vasut <marek.vasut@gmail.com>
Signed-off-by: NCyrille Pitchen <cyrille.pitchen@atmel.com>

e99ca98f

time: Remove CONFIG_TIMER_STATS · dfb4357d

由 Kees Cook 提交于 2月 08, 2017

Currently CONFIG_TIMER_STATS exposes process information across namespaces:

kernel/time/timer_list.c print_timer():

        SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);

/proc/timer_list:

 #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570

Given that the tracer can give the same information, this patch entirely
removes CONFIG_TIMER_STATS.
Suggested-by: NThomas Gleixner <tglx@linutronix.de>
Signed-off-by: NKees Cook <keescook@chromium.org>
Acked-by: NJohn Stultz <john.stultz@linaro.org>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: linux-doc@vger.kernel.org
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Xing Gao <xgao01@email.wm.edu>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jessica Frazelle <me@jessfraz.com>
Cc: kernel-hardening@lists.openwall.com
Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: Arjan van de Ven <arjan@linux.intel.com>
Link: http://lkml.kernel.org/r/20170208192659.GA32582@beastSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

dfb4357d

perf/core: Allow kernel filters on CPU events · 6ce77bfd

由 Alexander Shishkin 提交于 1月 26, 2017

While supporting file-based address filters for CPU events requires some
extra context switch handling, kernel address filters are easy, since the
kernel mapping is preserved across address spaces. It is also useful as
it permits tracing scheduling paths of the kernel.

This patch allows setting up kernel filters for CPU events.
Signed-off-by: NAlexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170126094057.13805-4-alexander.shishkin@linux.intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

6ce77bfd

mtd: Add partition device node to mtd partition devices · 42e9401b

由 Sascha Hauer 提交于 2月 09, 2017

The user visible change here is that mtd partitions get an of_node link
in sysfs.
Signed-off-by: NSascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>

42e9401b

09 2月, 2017 6 次提交

cpumask: use nr_cpumask_bits for parsing functions · 4d59b6cc

由 Tejun Heo 提交于 2月 08, 2017

Commit 513e3d2d ("cpumask: always use nr_cpu_ids in formatting and
parsing functions") converted both cpumask printing and parsing
functions to use nr_cpu_ids instead of nr_cpumask_bits.  While this was
okay for the printing functions as it just picked one of the two output
formats that we were alternating between depending on a kernel config,
doing the same for parsing wasn't okay.

nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS.  We can always use
nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can
be more efficient to use NR_CPUS when we can get away with it.
Converting the printing functions to nr_cpu_ids makes sense because it
affects how the masks get presented to userspace and doesn't break
anything; however, using nr_cpu_ids for parsing functions can
incorrectly leave the higher bits uninitialized while reading in these
masks from userland.  As all testing and comparison functions use
nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks
can erroneously yield false negative results.

This made the taskstats interface incorrectly return -EINVAL even when
the inputs were correct.

Fix it by restoring the parse functions to use nr_cpumask_bits instead
of nr_cpu_ids.

Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org
Fixes: 513e3d2d ("cpumask: always use nr_cpu_ids in formatting and parsing functions")
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NMartin Steigerwald <martin.steigerwald@teamix.de>
Debugged-by: NBen Hutchings <ben.hutchings@codethink.co.uk>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4d59b6cc

mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers · 0911d004

由 Jan Kara 提交于 2月 08, 2017

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this).  However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted.

Fix Lustre and block_page_mkwrite_ret() used by other filesystems
(notably GFS2) to return VM_FAULT_NOPAGE instead which results in
bailing out from the fault code, the CPU then retries the access, and we
fault again effectively doing what the handler wanted.

Link: http://lkml.kernel.org/r/20170203150729.15863-1-jack@suse.czSigned-off-by: NJan Kara <jack@suse.cz>
Reported-by: NAl Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: NJinshan Xiong <jinshan.xiong@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0911d004

mtd: nand: Add max_bb_per_die and blocks_per_die fields to nand_chip · ceb374eb

由 Zach Brown 提交于 1月 10, 2017

The fields max_bb_per_die and blocks_per_die are useful determining the
number of bad blocks a MTD needs to allocate. How they are set will
depend on if the chip is ONFI, JEDEC or a full-id entry in the nand_ids
table.
Signed-off-by: NZach Brown <zach.brown@ni.com>
Acked-by: NBoris Brezillon <boris.brezillon@free-electron.com>
Acked-by: NBrian Norris <computersforpeace@gmail.com>
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>

ceb374eb

mtd: introduce function max_bad_blocks · 6080ef6e

由 Jeff Westfahl 提交于 1月 10, 2017

If implemented, 'max_bad_blocks' returns the maximum number of bad
blocks to reserve for a MTD. An implementation for NAND is coming soon.
Signed-off-by: NJeff Westfahl <jeff.westfahl@ni.com>
Signed-off-by: NZach Brown <zach.brown@ni.com>
Acked-by: NBoris Brezillon <boris.brezillon@free-electron.com>
Acked-by: NBrian Norris <computersforpeace@gmail.com>
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>

6080ef6e

mtd: bcm47xxsflash: use platform_(set|get)_drvdata · be5e5099

由 Rafał Miłecki 提交于 1月 16, 2017

We have generic place & helpers for storing platform driver data so
there is no reason for using custom priv pointer.

This allows cleaning up struct bcma_sflash from unneeded fields.
Signed-off-by: NRafał Miłecki <rafal@milecki.pl>
Acked-by: NKalle Valo <kvalo@codeaurora.org>
Acked-by: NBoris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: NBrian Norris <computersforpeace@gmail.com>

be5e5099

net: introduce device min_header_len · 217e6fa2

由 Willem de Bruijn 提交于 2月 07, 2017

The stack must not pass packets to device drivers that are shorter
than the minimum link layer header length.

Previously, packet sockets would drop packets smaller than or equal
to dev->hard_header_len, but this has false positives. Zero length
payload is used over Ethernet. Other link layer protocols support
variable length headers. Support for validation of these protocols
removed the min length check for all protocols.

Introduce an explicit dev->min_header_len parameter and drop all
packets below this value. Initially, set it to non-zero only for
Ethernet and loopback. Other protocols can follow in a patch to
net-next.

Fixes: 9ed988cd ("packet: validate variable length ll headers")
Reported-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

217e6fa2

08 2月, 2017 4 次提交

irqchip/gic-v3: Remove duplicate definition of GICD_TYPER_LPIS · e3c484b1

由 Alim Akhtar 提交于 1月 04, 2017

GICD_TYPER_LPIS macro is defined twice in this file. This patch removes the
duplicate entry.

Fixes: f5c1434c ("irqchip: GICv3: rework redistributor structure")
Signed-off-by: NAlim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

e3c484b1

irqchip/gic-v3-its: Rename MAPVI to MAPTI · 6a25ad3a

由 Marc Zyngier 提交于 12月 20, 2016

Back in the days when the GICv3/v4 architecture was drafted,
the command to an event to an LPI number was called MAPVI.
Later on, and to avoid confusion with the GICv4 command VMAPI,
it was renamed MAPTI. We've carried the old name for a long
time, but it gets in the way of people reading the code in
the light of the public architecture specification.

Just repaint all the references and kill the old definition.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

6a25ad3a

irqchip/gic-v3-its: Drop deprecated GITS_BASER_TYPE_CPU · 4f46de9d

由 Marc Zyngier 提交于 12月 20, 2016

During the development of the GICv3/v4 architecture, it was
envisaged to have a CPU table, though the use for it was
never completely clear (the collection table serves that role
pretty well). It ended being dropped before the specification
was published, though it lived on in the driver.

In order to avoid people scratching their head too much, let's do
the same in the kernel.
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

4f46de9d

clockevents: Add a clkevt-of mechanism like clksrc-of · 376bc271

由 Daniel Lezcano 提交于 4月 19, 2016

The current code uses the CLOCKSOURCE_OF_DECLARE macro to fill the clksrc
table with a t-uple (name, init_function).

Unfortunately it ends up to the clockevent and the clocksource being
both initialized with this macro. It is not a problem by itself but there
is not a clear distinction between a clockevent and a clocksource in the
code initialization path. Somebody can argue there are the same IP block
and the same DT node. But conceptually from the software side, there are
two distincts entities and as is they should be initialized separetely.
Some drivers which do not have a clocksource end up by using the
CLOCKSOURCE_OF_DECLARE macro to declare a clockevent.

Another result is the fuzzy organization in the clocksource directory,
where the clockevents are implemented in the same file than the
clocksources or file labelled timer-something implementing a clocksource.

This patch provides another macro to specifically declare a clockevent in
the same way than the clocksource and gives the opportunity to write two
separate drivers, one for the clocksource and another for the clockevents.

Hopefully, that can help to do some housework in the directory, perhaps
split the drivers in to entities, for example:
	- clksrc-rockchip.c
	- clkevt-rockchip.c

Also, it gives the possibility to declare clocksources separately in the
DT and then use a clocksource from IP block while while clockevents are
used from another IP block.
Signed-off-by: NDaniel Lezcano <daniel.lezcano@linaro.org>

376bc271

07 2月, 2017 3 次提交

delayacct: Include <uapi/linux/taskstats.h> · 4025819d

由 Ingo Molnar 提交于 2月 01, 2017

include/linux/delayacct.h relies on 'struct taskstats' but does
not include the header that defines it.

This worked so far because files that included <linux/taskstats.h>
also happened to include other headers that included
uapi/linux/taskstats.h.

Fix it.

Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: NIngo Molnar <mingo@kernel.org>

4025819d

efi: Get and store the secure boot status · de8cb458

由 David Howells 提交于 2月 06, 2017

Get the firmware's secure-boot status in the kernel boot wrapper and stash
it somewhere that the main kernel image can find.

The efi_get_secureboot() function is extracted from the ARM stub and (a)
generalised so that it can be called from x86 and (b) made to use
efi_call_runtime() so that it can be run in mixed-mode.

For x86, it is stored in boot_params and can be overridden by the boot
loader or kexec.  This allows secure-boot mode to be passed on to a new
kernel.
Suggested-by: NLukas Wunner <lukas@wunner.de>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1486380166-31868-5-git-send-email-ard.biesheuvel@linaro.org
[ Small readability edits. ]
Signed-off-by: NIngo Molnar <mingo@kernel.org>

de8cb458

efi: Add SHIM and image security database GUID definitions · e58910cd

由 Josh Boyer 提交于 2月 06, 2017

Add the definitions for shim and image security database, both of which
are used widely in various Linux distros.
Signed-off-by: NJosh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1486380166-31868-4-git-send-email-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

e58910cd

06 2月, 2017 2 次提交

mtd: nand: Add Winbond manufacturer id · a4077ce5

由 Andrey Jr. Melnikov 提交于 12月 08, 2016

Add WINBOND manufacturer id.
Signed-off-by: NAndrey Jr. Melnikov <temnota.am@gmail.com>
Signed-off-by: NBoris Brezillon <boris.brezillon@free-electrons.com>

a4077ce5

mtd: nand: ifc: Fix location of eccstat registers for IFC V1.0 · 65644147

由 Mark Marshall 提交于 1月 26, 2017

The commit 7a654172 ("mtd/ifc: Add support for IFC controller
version 2.0") added support for version 2.0 of the IFC controller.
The version 2.0 controller has the ECC status registers at a different
location to the previous versions.

Correct the fsl_ifc_nand structure so that the ECC status can be read
from the correct location for both version 1.0 and 2.0 of the controller.

Cc: stable@vger.kernel.org
Fixes: 7a654172 ("mtd/ifc: Add support for IFC controller version 2.0")
Signed-off-by: NMark Marshall <mark.marshall@omicronenergy.com>
Signed-off-by: NBoris Brezillon <boris.brezillon@free-electrons.com>

65644147

04 2月, 2017 5 次提交

tick/broadcast: Reduce lock cacheline contention · 668802c2

由 Waiman Long 提交于 1月 30, 2017

It was observed that on an Intel x86 system without the ARAT (Always
running APIC timer) feature and with fairly large number of CPUs as
well as CPUs coming in and out of intel_idle frequently, the lock
contention on the tick_broadcast_lock can become significant.

To reduce contention, the lock is put into its own cacheline and all
the cpumask_var_t variables are put into the __read_mostly section.

Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
16-core 32-thread Nehalam system, the performance number improved
from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
a 4.9.6 kernel. This is a 3.4% improvement.
Signed-off-by: NWaiman Long <longman@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.comSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

668802c2

base/memory, hotplug: fix a kernel oops in show_valid_zones() · a96dfddb

由 Toshi Kani 提交于 2月 03, 2017

Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().

 BUG: unable to handle kernel paging request at ffffea017a000000
 IP: show_valid_zones+0x6f/0x160

This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB.  [1] An example of such
systems is desribed below.  0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.

 BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable

Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range.  show_valid_zones() then proceeds with the valid range.

[1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'

Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.comSigned-off-by: NToshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a96dfddb

log2: make order_base_2() behave correctly on const input value zero · 29905b52

由 Ard Biesheuvel 提交于 2月 02, 2017

The function order_base_2() is defined (according to the comment block)
as returning zero on input zero, but subsequently passes the input into
roundup_pow_of_two(), which is explicitly undefined for input zero.

This has gone unnoticed until now, but optimization passes in GCC 7 may
produce constant folded function instances where a constant value of
zero is passed into order_base_2(), resulting in link errors against the
deliberately undefined '____ilog2_NaN'.

So update order_base_2() to adhere to its own documented interface.

[ See

     http://marc.info/?l=linux-kernel&m=147672952517795&w=2

  and follow-up discussion for more background. The gcc "optimization
  pass" is really just broken, but now the GCC trunk problem seems to
  have escaped out of just specially built daily images, so we need to
  work around it in mainline.    - Linus ]
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

29905b52

module: unify absolute krctab definitions for 32-bit and 64-bit · 4b9eee96

由 Ard Biesheuvel 提交于 2月 03, 2017

The previous patch introduced a separate inline asm version of the
krcrctab declaration template for use with 64-bit architectures, which
cannot refer to ELF symbols using 32-bit quantities.

This declaration should be equivalent to the C one for 32-bit
architectures, but just in case - unify them in a separate patch, which
can simply be dropped if it turns out to break anything.
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4b9eee96

modversions: treat symbol CRCs as 32 bit quantities · 71810db2

由 Ard Biesheuvel 提交于 2月 03, 2017

The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.

This has a couple of downsides:

 - Given that the CRCs are treated as memory addresses, we waste 4 bytes
   for each CRC on 64 bit architectures,

 - On architectures that support runtime relocation, a R_<arch>_RELATIVE
   relocation entry is emitted for each CRC value, which identifies it
   as a quantity that requires fixing up based on the actual runtime
   load offset of the kernel. This results in corrupted CRCs unless we
   explicitly undo the fixup (and this is currently being handled in the
   core module code)

 - Such runtime relocation entries take up 24 bytes of __init space
   each, resulting in a x8 overhead in [uncompressed] kernel size for
   CRCs.

Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset.  Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant.  Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.

So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff).  To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.

Note that this mostly reverts commit d4703aef ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

71810db2

03 2月, 2017 1 次提交

ACPI: Add support for ResourceSource/IRQ domain mapping · d44fa3d4

由 Agustin Vega-Frias 提交于 2月 02, 2017

ACPI extended IRQ resources may contain a ResourceSource to specify
an alternate interrupt controller. Introduce acpi_irq_get and use it
to implement ResourceSource/IRQ domain mapping.

The new API is similar to of_irq_get and allows re-initialization
of a platform resource from the ACPI extended IRQ resource, and
provides proper behavior for probe deferral when the domain is not
yet present when called.
Acked-by: NRafael J. Wysocki <rafael@kernel.org>
Acked-by: NLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: NHanjun Guo <hanjun.guo@linaro.org>
Tested-by: NHanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: NAgustin Vega-Frias <agustinv@codeaurora.org>
Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>

d44fa3d4

02 2月, 2017 1 次提交

net: fix ndo_features_check/ndo_fix_features comment ordering · 1a2a1444

由 Dimitris Michailidis 提交于 1月 31, 2017

Commit cdba756f ("net: move ndo_features_check() close to
ndo_start_xmit()") inadvertently moved the doc comment for
.ndo_fix_features instead of .ndo_features_check. Fix the comment
ordering.

Fixes: cdba756f ("net: move ndo_features_check() close to ndo_start_xmit()")
Signed-off-by: NDimitris Michailidis <dmichail@google.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1a2a1444

01 2月, 2017 7 次提交

sched/rt: Show the 'sched_rr_timeslice' SCHED_RR timeslice tuning knob in milliseconds · 975e155e

由 Shile Zhang 提交于 1月 28, 2017

We added the 'sched_rr_timeslice_ms' SCHED_RR tuning knob in this commit:

  ce0dbbbb ("sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice")

... which name suggests to users that it's in milliseconds, while in reality
it's being set in milliseconds but the result is shown in jiffies.

This is obviously confusing when HZ is not 1000, it makes it appear like the
value set failed, such as HZ=100:

  root# echo 100 > /proc/sys/kernel/sched_rr_timeslice_ms
  root# cat /proc/sys/kernel/sched_rr_timeslice_ms
  10

Fix this to be milliseconds all around.
Signed-off-by: NShile Zhang <shile.zhang@nokia.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1485612049-20923-1-git-send-email-shile.zhang@nokia.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

975e155e

sched/clock: Add dummy clear_sched_clock_stable() stub function · 733ce725

由 Paul Gortmaker 提交于 1月 27, 2017

In commit:

  acb04058 ("sched/clock: Fix hotplug crash")

the PARISC code gained a call to this function.

However the prototype for it is within a CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
 #ifdef/#endif.  That, combined with this:

  arch/parisc/Kconfig:    select HAVE_UNSTABLE_SCHED_CLOCK if SMP

means that PARISC can have it either enabled or disabled, resulting
in the following build fail:

  arch/parisc/kernel/setup.c:180:2: error: implicit declaration of
  function 'clear_sched_clock_stable' [-Werror=implicit-function-declaration]

Add a no-op stub for the non-SMP case to prevent this.
Signed-off-by: NPaul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: acb04058 ("sched/clock: Fix hotplug crash")
Link: http://lkml.kernel.org/r/20170127173816.22733-1-paul.gortmaker@windriver.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

733ce725

sched/cputime: Remove generic asm headers · b672592f

由 Frederic Weisbecker 提交于 1月 31, 2017

cputime_t is now only used by two architectures:

	* powerpc (when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y)
	* s390

And since the core doesn't use it anymore, we don't need any arch support
from the others. So we can remove their stub implementations.

A final cleanup would be to provide an efficient pure arch
implementation of cputime_to_nsec() for s390 and powerpc and finally
remove include/linux/cputime.h .
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-36-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

b672592f

sched/cputime: Remove unused nsec_to_cputime() · 934ad473

由 Frederic Weisbecker 提交于 1月 31, 2017

It's unused now.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-35-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

934ad473

sched/cputime: Push time to account_system_time() in nsecs · fb8b049c

由 Frederic Weisbecker 提交于 1月 31, 2017

This is one more step toward converting cputime accounting to pure nsecs.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-25-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

fb8b049c

sched/cputime: Push time to account_idle_time() in nsecs · 18b43a9b

由 Frederic Weisbecker 提交于 1月 31, 2017

This is one more step toward converting cputime accounting to pure nsecs.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-24-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

18b43a9b

sched/cputime: Push time to account_steal_time() in nsecs · be9095ed

由 Frederic Weisbecker 提交于 1月 31, 2017

This is one more step toward converting cputime accounting to pure nsecs.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Link: http://lkml.kernel.org/r/1485832191-26889-23-git-send-email-fweisbec@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

be9095ed

OpenHarmony / kernel_linux 上一次同步 4 年多

OpenHarmony / kernel_linux
上一次同步 4 年多