1. 11 January 2017 (1 commit)
  2. 30 December 2016 (1 commit)
    • mm: optimize PageWaiters bit use for unlock_page() · b91e1302
      By Linus Torvalds
      In commit 62906027 ("mm: add PageWaiters indicating tasks are
      waiting for a page bit") Nick Piggin made our page locking no longer
      unconditionally touch the hashed page waitqueue, which not only helps
      performance in general, but is particularly helpful on NUMA machines
      where the hashed wait queues can bounce around a lot.
      
      However, the "clear lock bit atomically and then test the waiters bit"
      sequence turns out to be much more expensive than it needs to be,
      because you get a nasty stall when trying to access the same word that
      just got updated atomically.
      
      On architectures where locking is done with LL/SC, this would be trivial
      to fix with a new primitive that clears one bit and tests another
      atomically, but that ends up not working on x86, where the only atomic
      operations that return the result end up being cmpxchg and xadd.  The
      atomic bit operations return the old value of the same bit we changed,
      not the value of an unrelated bit.
      
      On x86, we could put the lock bit in the high bit of the byte, and use
      "xadd" with that bit (where the overflow ends up not touching other
      bits), and look at the other bits of the result.  However, an even
      simpler model is to just use a regular atomic "and" to clear the lock
      bit, and then the sign bit in eflags will indicate the resulting state
      of the unrelated bit #7.
      
      So by moving the PageWaiters bit up to bit #7, we can atomically clear
      the lock bit and test the waiters bit on x86 too.  And on architectures
      with LL/SC (all the usual RISC suspects), the particular bit doesn't
      matter, so they are fine with this approach too.
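
      As a standalone illustration of that x86 idea (purely illustrative, not
      the kernel's actual code; it needs GCC 6+ or clang for the "=@ccs"
      flag-output constraint, and hard-codes the waiters bit at bit #7):

        #include <stdbool.h>
        #include <stdio.h>

        /* Atomically clear bit @nr in the low byte of @addr and report
         * whether bit #7 of that byte is set afterwards, read straight
         * from the sign flag that "lock andb" leaves behind. */
        static inline bool clear_bit_test_sign(long nr, volatile unsigned long *addr)
        {
                bool negative;
                __asm__ volatile("lock andb %2,%0"
                                 : "+m" (*(volatile char *)addr), "=@ccs" (negative)
                                 : "ir" ((char) ~(1 << nr))
                                 : "memory");
                return negative;
        }

        int main(void)
        {
                volatile unsigned long word = (1UL << 7) | (1UL << 0); /* waiters + lock */
                printf("waiters pending: %d\n", clear_bit_test_sign(0, &word));
                return 0;
        }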
      
      This avoids the extra access to the same atomic word, and thus avoids
      the costly stall at page unlock time.
      
      The only downside is that the interface ends up being a bit odd and
      specialized: clear a bit in a byte, and test the sign bit.  Nick doesn't
      love the resulting name of the new primitive, but I'd rather make the
      name be descriptive and very clear about the limitation imposed by
      trying to work across all relevant architectures than make it be some
      generic thing that doesn't make the odd semantics explicit.
      
      So this introduces the new architecture primitive
      
          clear_bit_unlock_is_negative_byte();
      
      and adds the trivial implementation for x86.  We have a generic
      non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
      combination) which can be overridden by any architecture that can do
      better.  According to Nick, Power has the same hiccup x86 has, for
      example, but some other architectures may not even care.
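
      A kernel-style sketch of that generic fallback, assuming the waiters bit
      sits at bit #7 as described above (clear_bit_unlock() and test_bit() are
      the kernel's existing bitops):

        /* Non-optimized fallback: the clear and the test are two separate
         * accesses to the same word, which is exactly the extra cost the
         * architecture-specific versions avoid. */
        #ifndef clear_bit_unlock_is_negative_byte
        static inline bool clear_bit_unlock_is_negative_byte(long nr,
                                                             volatile unsigned long *mem)
        {
                clear_bit_unlock(nr, mem);      /* release the lock bit */
                return test_bit(7, mem);        /* then test the waiters bit */
        }
        #endif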
      
      All these optimizations mean that my page locking stress-test (which is
      just executing a lot of small short-lived shell scripts: "make test" in
      the git source tree) no longer makes our page locking look horribly bad.
      Before all these optimizations, the unlock_page() costs alone were just
      over 3% of all CPU overhead on "make test".  After this, it's down to
      0.66%, about a quarter of the cost it used to be.
      
      (The difference on NUMA is bigger, but there this micro-optimization is
      likely less noticeable, since the big issue on NUMA was not the accesses
      to 'struct page', but the waitqueue accesses that were already removed
      by Nick's earlier commit).
      Acked-by: Nick Piggin <npiggin@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 28 December 2016 (2 commits)
  4. 27 December 2016 (1 commit)
    • mm: Invalidate DAX radix tree entries only if appropriate · c6dcf52c
      By Jan Kara
      Currently invalidate_inode_pages2_range() and invalidate_mapping_pages()
      just delete all exceptional radix tree entries they find. For DAX this
      is not desirable as we track cache dirtiness in these entries and when
      they are evicted, we may not flush caches although it is necessary. This
      can manifest, for example, when we write to the same block both via mmap
      and via write(2) (to different offsets), and fsync(2) then does not
      properly flush CPU caches because the modification via write(2) was the
      last one.
      
      Create appropriate DAX functions to handle invalidation of DAX entries
      for invalidate_inode_pages2_range() and invalidate_mapping_pages() and
      wire them up into the corresponding mm functions.
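
      A simplified sketch of the rule those new DAX functions implement,
      against the 4.9-era radix tree API (the helper name is illustrative, and
      the real code also handles entry locking):

        /* Delete a DAX radix tree entry only if it is clean; a dirty or
         * to-be-written entry still owes a CPU cache flush and must stay. */
        static int dax_try_invalidate_entry(struct address_space *mapping,
                                            pgoff_t index)
        {
                int ret = 0;

                spin_lock_irq(&mapping->tree_lock);
                if (radix_tree_lookup(&mapping->page_tree, index) &&
                    !radix_tree_tag_get(&mapping->page_tree, index,
                                        PAGECACHE_TAG_DIRTY) &&
                    !radix_tree_tag_get(&mapping->page_tree, index,
                                        PAGECACHE_TAG_TOWRITE)) {
                        radix_tree_delete(&mapping->page_tree, index);
                        ret = 1;
                }
                spin_unlock_irq(&mapping->tree_lock);
                return ret;
        }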
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  5. 26 December 2016 (5 commits)
    • mm: add PageWaiters indicating tasks are waiting for a page bit · 62906027
      By Nicholas Piggin
      Add a new page flag, PageWaiters, to indicate the page waitqueue has
      tasks waiting. This can be tested rather than testing waitqueue_active
      which requires another cacheline load.
      
      This bit is always set when the page has tasks on page_waitqueue(page),
      and is set and cleared under the waitqueue lock. It may be set when
      there are no tasks on the waitqueue, which will cause a harmless extra
      wakeup check that will clear the bit.
      
      The generic bit-waitqueue infrastructure is no longer used for pages.
      Instead, waitqueues are used directly with a custom key type. The
      generic code was not flexible enough to have PageWaiters manipulation
      under the waitqueue lock (which simplifies concurrency).
      
      This improves the performance of page lock intensive microbenchmarks by
      2-3%.
      
      Putting two bits in the same word opens the opportunity to remove the
      memory barrier between clearing the lock bit and testing the waiters
      bit, after some work on the arch primitives (e.g., ensuring memory
      operand widths match and cover both bits).
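
      Sketched roughly, the unlock path then becomes (simplified from the
      description above; note the barrier between clearing the lock bit and
      testing the waiters bit that the follow-up arch work can remove):

        void unlock_page(struct page *page)
        {
                page = compound_head(page);
                clear_bit_unlock(PG_locked, &page->flags);
                smp_mb__after_atomic();        /* order clear vs. waiters test */
                if (PageWaiters(page))         /* only then touch the waitqueue */
                        wake_up_page_bit(page, PG_locked);
        }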
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked · 6326fec1
      By Nicholas Piggin
      A page is not added to the swap cache without being swap backed,
      so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache.
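
      In code, the aliasing amounts to roughly this sketch (PageSwapBacked()
      must be checked first, since PG_owner_priv_1 only means "swap cache" for
      swap-backed pages):

        #define PG_swapcache PG_owner_priv_1   /* only valid with PG_swapbacked */

        static inline int PageSwapCache(struct page *page)
        {
                return PageSwapBacked(page) &&
                       test_bit(PG_swapcache, &page->flags);
        }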
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
      Cc: Steven Whitehouse <swhiteho@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andreas Gruenbacher <agruenba@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ktime: Get rid of ktime_equal() · 1f3a8e49
      By Thomas Gleixner
      No point in going through loops and hoops instead of just comparing the
      values.
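
      An illustrative call-site conversion (variable names are hypothetical):

        /* Before: */
        if (ktime_equal(next, expires))
                return;
        /* After: ktime_t is a scalar, so compare it directly */
        if (next == expires)
                return;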
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
    • ktime: Cleanup ktime_set() usage · 8b0e1953
      By Thomas Gleixner
      ktime_set(S,N) was required for the timespec storage type and is still
      useful for situations where a seconds and a nanoseconds part of a time
      value need to be converted. For anything where the seconds argument is
      0, this is pointless and can be replaced with a simple assignment.
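
      An illustrative before/after, assuming a ktime_t variable kt:

        /* Before: the seconds argument is 0 */
        kt = ktime_set(0, 10 * NSEC_PER_MSEC);
        /* After: a plain assignment of the nanoseconds value suffices */
        kt = 10 * NSEC_PER_MSEC;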
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
    • ktime: Get rid of the union · 2456e855
      By Thomas Gleixner
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64-bit machines and in an endianness-optimized
      timespec variant on 32-bit machines. The Y2038 cleanup removed the
      timespec variant and switched everything to scalar nanoseconds. The
      union remained, but became completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
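
      The change boils down to the following sketch of the type definition:

        /* Before: a union with a single remaining member */
        union ktime {
                s64 tv64;
        };
        typedef union ktime ktime_t;

        /* After: plain scalar nanoseconds */
        typedef s64 ktime_t;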
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
  6. 25 December 2016 (9 commits)
  7. 24 December 2016 (1 commit)
  8. 23 December 2016 (2 commits)
  9. 21 December 2016 (5 commits)
    • ACPI / osl: Remove deprecated acpi_get_table_with_size()/early_acpi_os_unmap_memory() · 8d3523fb
      By Lv Zheng
      Since all users have been cleaned up, remove the two deprecated APIs.
      As a Linux variable rather than an ACPICA variable, acpi_gbl_permanent_mmap
      is renamed to acpi_permanent_mmap to have a consistent coding style across
      entire Linux ACPI subsystem.
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel · 174cc718
      By Lv Zheng
      
      ACPICA commit cac6790954d4d752a083e6122220b8a22febcd07
      
      This patch backports Linux acpi_get_table_with_size() and
      early_acpi_os_unmap_memory() into ACPICA upstream to reduce divergences.
      
      The two APIs have been used by Linux as table management APIs for a long
      time; they contain hidden logic that requires the tables mapped during
      the early stage to be unmapped before the early stage ends.
      
      During the early stage, tables are handled by the following sequence:
       acpi_get_table_with_size();
       parse the table
       early_acpi_os_unmap_memory();
      During the late stage, tables are handled by the following sequence:
       acpi_get_table();
       parse the table
      Linux uses acpi_gbl_permanent_mmap to distinguish the early stage and the
      late stage.
      
      The reasoning for introducing acpi_get_table_with_size() was this: ACPICA
      remembers the early mapped pointer in acpi_get_table(), and Linux cannot
      prevent ACPICA from using that stale early mapping during the late stage,
      because ACPICA provides no API acting as an inverse of acpi_get_table()
      to forget the early mapped pointer.
      
      But how can ACPICA satisfy the early/late stage requirement? Inside
      ACPICA, tables are guaranteed to remain in the "INSTALLED" state during
      the early stage, and are carefully not transitioned to the "VALIDATED"
      state until the late stage. So the same logic is in fact already
      implemented inside ACPICA, just in a different way. The only gap is that
      the feature is not exposed to OSPMs as an accessible external API.
      
      It then is possible to fix the gap by providing an inverse of
      acpi_get_table() from ACPICA, so that the two Linux sequences can be
      combined:
       acpi_get_table();
       parse the table
       acpi_put_table();
      To ease integration with the current Linux code, acpi_get_table() and
      acpi_put_table() are implemented in a usage-count based style:
       1. When the usage count of the table is increased from 0 to 1, table is
          mapped and .Pointer is set with the mapping address (VALIDATED);
       2. When the usage count of the table is decreased from 1 to 0, .Pointer
          is unset and the mapping address is unmapped (INVALIDATED).
      This allows us to deploy the new APIs to Linux with minimal effort, by
      just invoking acpi_get_table() in acpi_get_table_with_size() and
      acpi_put_table() in early_acpi_os_unmap_memory(). Lv Zheng.
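
      A minimal sketch of that usage-count model, with hypothetical names
      (struct table_desc, map_table() and unmap_table() are illustrative
      stand-ins, not ACPICA's actual symbols):

        struct table_desc {
                unsigned int validation_count;
                void *pointer;          /* mapped address, valid while count > 0 */
                unsigned long address;  /* physical address of the table */
                size_t length;
        };

        int table_get(struct table_desc *t, void **out)
        {
                if (t->validation_count == 0) {
                        t->pointer = map_table(t->address, t->length);
                        if (!t->pointer)
                                return -1;      /* stays INSTALLED */
                }
                t->validation_count++;          /* 0 -> 1 made it VALIDATED */
                *out = t->pointer;
                return 0;
        }

        void table_put(struct table_desc *t)
        {
                if (t->validation_count == 0)
                        return;                 /* unbalanced put, ignore */
                if (--t->validation_count == 0) {
                        unmap_table(t->pointer, t->length);
                        t->pointer = NULL;      /* INVALIDATED */
                }
        }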
      
      Link: https://github.com/acpica/acpica/commit/cac67909
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Bob Moore <robert.moore@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    • dt: bindings: net: use boolean dt properties for eee broken modes · 308d3165
      By Jerome Brunet
      The patches regarding eee-broken-modes were merged before all the people
      involved could reach an agreement on the best way to move forward.
      
      While we agreed on having a DT property to mark particular modes as
      broken, the value used for eee-broken-modes mapped the phy register in a
      very direct way. Because of this, the concern was that it could be used
      to implement configuration policies instead of describing broken hardware.
      
      In the end, having a boolean property for each mode seems to be preferred
      over one bit field value mapping the register (too) directly.
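
      As a hedged sketch, a PHY driver could collect such boolean properties
      into a mask like this (the property names follow the binding under
      discussion; the exact list and the mapping to MDIO_EEE_* bits from
      linux/mdio.h are illustrative):

        static u32 eee_broken_modes_from_dt(struct device_node *np)
        {
                u32 broken = 0;

                if (of_property_read_bool(np, "eee-broken-100tx"))
                        broken |= MDIO_EEE_100TX;
                if (of_property_read_bool(np, "eee-broken-1000t"))
                        broken |= MDIO_EEE_1000T;
                if (of_property_read_bool(np, "eee-broken-10gt"))
                        broken |= MDIO_EEE_10GT;

                return broken;
        }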
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ratelimit: fix WARN_ON_RATELIMIT return value · 1b011e2f
      By Jiri Slaby
      The macro is to be used similarly to WARN_ON, as in:
      
        if (WARN_ON_RATELIMIT(condition, state))
      	do_something();
      
      One would expect only 'condition' to affect the 'if', but
      WARN_ON_RATELIMIT internally does only:
      
        WARN_ON((condition) && __ratelimit(state))
      
      So the 'if' is affected by the ratelimiting state too.  Fix this by
      returning 'condition' in any case.
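
      The shape of the fix, as a statement-expression sketch:

        /* Before: the return value is gated by __ratelimit() too */
        #define WARN_ON_RATELIMIT(condition, state)             \
                WARN_ON((condition) && __ratelimit(state))

        /* After: the splat is still ratelimited, but 'condition' is
         * always handed back to the caller */
        #define WARN_ON_RATELIMIT(condition, state)     ({      \
                bool __rtn_cond = !!(condition);                \
                WARN_ON(__rtn_cond && __ratelimit(state));      \
                __rtn_cond;                                     \
        })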
      
      Note that nobody uses WARN_ON_RATELIMIT yet, so there is nothing to
      worry about.  But I was about to use it and was a bit surprised.
      
      Link: http://lkml.kernel.org/r/20161215093224.23126-1-jslaby@suse.cz
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ima: on soft reboot, save the measurement list · 7b8589cc
      By Mimi Zohar
      The TPM PCRs are only reset on a hard reboot.  In order to validate a
      TPM's quote after a soft reboot (e.g., kexec -e), the IMA measurement
      list of the running kernel must be saved and restored on boot.
      
      This patch uses the kexec buffer passing mechanism to pass the
      serialized IMA binary_runtime_measurements to the next kernel.
      
      Link: http://lkml.kernel.org/r/1480554346-29071-7-git-send-email-zohar@linux.vnet.ibm.com
      Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
      Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
      Cc: Andreas Steffen <andreas.steffen@strongswan.org>
      Cc: Josh Sklar <sklar@linux.vnet.ibm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 20 December 2016 (2 commits)
  11. 19 December 2016 (1 commit)
  12. 18 December 2016 (7 commits)
  13. 17 December 2016 (2 commits)
  14. 16 December 2016 (1 commit)