Commits · 48bda43eabb8d086204f543cf8bbad696b8c6391 · openanolis / cloud-kernel

14 May, 2018 3 commits

softirq/s390: Move default mutators of overwritten softirq mask to s390 · 48bda43e

Authored by Frederic Weisbecker on May 08, 2018

s390 is now the last architecture that entirely overwrites
local_softirq_pending() and uses the according default definitions of
set_softirq_pending() and or_softirq_pending().

Just move these to s390 to debloat the generic code complexity.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-12-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

48bda43e

softirq/core: Consolidate default local_softirq_pending() implementations · 0fd7d862

Authored by Frederic Weisbecker on May 08, 2018

Consolidate and optimize default softirq mask API implementations.
Per-CPU operations are expected to be faster and a few architectures
already rely on them to implement local_softirq_pending() and related
accessors/mutators. Those will be migrated to the new generic code.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-6-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

0fd7d862
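
The accessor/mutator trio named in the softirq commits above (local_softirq_pending(), set_softirq_pending(), or_softirq_pending()) can be modelled in user space. This is a minimal sketch, assuming a thread-local variable as a stand-in for per-CPU storage. Every name ending in `_model` is hypothetical and is not the kernel's API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU softirq pending mask: one copy per thread. */
static _Thread_local uint32_t softirq_pending_model;

/* Accessor: read the calling thread's pending mask. */
static inline uint32_t local_softirq_pending_model(void)
{
	return softirq_pending_model;
}

/* Mutator: overwrite the whole mask. */
static inline void set_softirq_pending_model(uint32_t mask)
{
	softirq_pending_model = mask;
}

/* Mutator: raise additional bits without clearing the ones already set. */
static inline void or_softirq_pending_model(uint32_t mask)
{
	softirq_pending_model |= mask;
}
```

The point of the model is only that all three operations act on the caller's own copy of the mask, which is the property the per-CPU conversion relies on.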

softirq/core: Turn default irq_cpustat_t to standard per-cpu · 0f6f47ba

Authored by Frederic Weisbecker on May 08, 2018

In order to optimize and consolidate softirq mask accesses, let's
convert the default irq_cpustat_t implementation to per-CPU standard API.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/1525786706-22846-5-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

0f6f47ba

13 May, 2018 5 commits

irqchip/gic-v3: Add support for Message Based Interrupts as an MSI controller · 50528752

Authored by Marc Zyngier on May 08, 2018

GICv3 offers the possibility to signal SPIs using a pair of doorbells
(SETSPI, CLRSPI) under the name of Message Based Interrupts (MBI).
They can be used as either traditional (edge) MSIs, or the more exotic
level-triggered flavour.

Let's implement support for platform MSI, which is the original intent
for this feature.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-8-marc.zyngier@arm.com

50528752

irqdomain: Let irq_find_host default to DOMAIN_BUS_WIRED · 64619343

Authored by Marc Zyngier on May 08, 2018

At the beginning of times, irq_find_host() was simple. Each device node
implemented at most one irq domain, and we were happy. Over time, things
have become more complex, and we now have nodes implementing a plurality
of domains, tagged by "bus_token".

Crucially, users of irq_find_host() all expect the most basic domain
to be returned, and not any other domain such as a bus-specific MSI
domain.

So let's change irq_find_host() to first look for a DOMAIN_BUS_WIRED
domain, and only if this fails fall back to DOMAIN_BUS_ANY. Note that
this is consistent with what irq_create_fwspec_mapping is already
doing, see 530cbe10 ("irqdomain: Allow domain lookup with
DOMAIN_BUS_WIRED token").
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-6-marc.zyngier@arm.com

64619343
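
The lookup order described for irq_find_host() above (try the wired domain first, then fall back to any domain) can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: the domain table, the struct and all `_model` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bus tokens modelled on the names in the commit message. */
enum bus_token_model { BUS_ANY_MODEL = 0, BUS_WIRED_MODEL, BUS_MSI_MODEL };

struct domain_model {
	const void *node;            /* firmware node that owns the domain */
	enum bus_token_model token;  /* tag distinguishing domains on one node */
};

/* Return the first domain on 'node' matching 'token'; BUS_ANY_MODEL matches all. */
static const struct domain_model *
find_matching_model(const struct domain_model *d, size_t n, const void *node,
		    enum bus_token_model token)
{
	for (size_t i = 0; i < n; i++) {
		if (d[i].node != node)
			continue;
		if (token == BUS_ANY_MODEL || d[i].token == token)
			return &d[i];
	}
	return NULL;
}

/* Prefer the wired domain; fall back to any domain only if none exists. */
static const struct domain_model *
find_host_model(const struct domain_model *d, size_t n, const void *node)
{
	const struct domain_model *found;

	found = find_matching_model(d, n, node, BUS_WIRED_MODEL);
	if (!found)
		found = find_matching_model(d, n, node, BUS_ANY_MODEL);
	return found;
}
```

With this ordering, a node that registers an MSI domain before its wired domain still resolves to the wired one, which is the behaviour the commit message says callers expect.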

dma-iommu: Fix compilation when !CONFIG_IOMMU_DMA · 8a22a3e1

Authored by Marc Zyngier on May 08, 2018

Inclusion of include/linux/dma-iommu.h when CONFIG_IOMMU_DMA is not selected
results in the following splat:

In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0:
./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’
 static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
                                                                     ^~~~~~~~~~
./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration
 static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
                                                                          ^~~~~~~~~
scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed

Fix it by including linux/types.h.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com

8a22a3e1

genirq/msi: Limit level-triggered MSI to platform devices · 6988e0e0

Authored by Marc Zyngier on May 08, 2018

Nobody would be insane enough to try and use level triggered
MSIs on PCI, but let's make sure it doesn't happen. Also,
let's mandate that the irqchip backing the platform MSI domain
is providing the IRQCHIP_SUPPORTS_LEVEL_MSI flag.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-3-marc.zyngier@arm.com

6988e0e0

genirq/msi: Allow level-triggered MSIs to be exposed by MSI providers · 0be8153c

Authored by Marc Zyngier on May 08, 2018

So far, MSIs have been used to signal edge-triggered interrupts, as
a write is a good model for an edge (you can't "unwrite" something).
On the other hand, routing zillions of wires in an SoC because you
need level interrupts is a bit extreme.

People have come up with a variety of schemes to support this, which
involve sending two messages: one to signal the interrupt, and one
to clear it. Since the kernel cannot represent this, we've ended up
with side-band mechanisms that are pretty awful.

Instead, let's acknowledge the requirement, and ensure that, under the
right circumstances, the irq_compose_msg and irq_write_msg can take
as a parameter an array of two messages instead of a pointer to a
single one. We also add some checking that the compose method only
clobbers the second message if the MSI domain has been created with
the MSI_FLAG_LEVEL_CAPABLE flag.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-2-marc.zyngier@arm.com

0be8153c
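
The two-message scheme described above (one write to signal the interrupt, one to clear it, with a check that only level-capable domains touch the second slot) might look roughly like this in a user-space model. The struct layout, the doorbell addresses and all `_model` names are assumptions made for illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout: a doorbell address plus a payload. */
struct msi_msg_model {
	uint64_t address;
	uint32_t data;
};

/* Hypothetical doorbell addresses for the "set" and "clear" writes. */
#define DOORBELL_SET_MODEL   0x1000u
#define DOORBELL_CLEAR_MODEL 0x1008u

/*
 * Compose the message(s) for one interrupt. msg[0] always signals the
 * interrupt; msg[1] is written only for level-capable domains and
 * carries the "clear" write.
 */
static void compose_msg_model(struct msi_msg_model msg[2], uint32_t hwirq,
			      bool level_capable)
{
	msg[0].address = DOORBELL_SET_MODEL;
	msg[0].data = hwirq;
	if (level_capable) {
		msg[1].address = DOORBELL_CLEAR_MODEL;
		msg[1].data = hwirq;
	}
}

/*
 * Zero both slots, compose, then report whether a domain that is not
 * level-capable clobbered the second message anyway.
 */
static bool compose_checked_model(struct msi_msg_model msg[2], uint32_t hwirq,
				  bool level_capable)
{
	memset(msg, 0, 2 * sizeof(msg[0]));
	compose_msg_model(msg, hwirq, level_capable);
	if (!level_capable && (msg[1].address != 0 || msg[1].data != 0))
		return false;	/* second message touched without the flag */
	return true;
}
```

Passing an array of two messages keeps the edge-triggered case unchanged (only the first slot is used) while giving level-capable providers a place to describe the clearing write.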

12 May, 2018 2 commits

rbtree: include rcu.h · 2075b16e

Authored by Sebastian Andrzej Siewior on May 11, 2018

Since commit c1adf200 ("Introduce rb_replace_node_rcu()")
rbtree_augmented.h uses RCU related data structures but does not include
the header file. It works as long as it gets somehow included before
that and fails otherwise.

Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2075b16e

mm, oom: fix concurrent munlock and oom reaper unmap, v3 · 27ae357f

Authored by David Rientjes on May 11, 2018

Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.

This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma.  Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.

This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup().  If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up dereferencing a NULL ptl or
causing a kernel oops.

Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP.  The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap().  It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.

This fixes CVE-2018-1000200.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 21292580 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: David Rientjes <rientjes@google.com>
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

27ae357f

11 May, 2018 3 commits

bonding: send learning packets for vlans on slave · 21706ee8

Authored by Debabrata Banerjee on May 09, 2018

There was a regression at some point from the intended functionality of
commit f60c3704 ("bonding: Fix alb mode to only use first level
vlans.")

Given the return value of vlan_get_encap_level(), we need to store the nest
level of the bond device, and then compare the vlan's encap level to
this. Without this, this check always fails and learning packets are
never sent.

In addition, this same commit caused a regression in the behavior of
balance_alb, which requires learning packets be sent for all interfaces
using the slave's mac in order to load balance properly. For vlans
that have not set a user mac, we can send after checking one bit.
Otherwise we need to send the set mac, albeit defeating rx load balancing
for that vlan.
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21706ee8

rxrpc: Trace UDP transmission failure · 6b47fe1d

Authored by David Howells on May 10, 2018

Add a tracepoint to log transmission failure from the UDP transport socket
being used by AF_RXRPC.
Signed-off-by: David Howells <dhowells@redhat.com>

6b47fe1d

rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages · 494337c9

Authored by David Howells on May 10, 2018

Add a tracepoint to log received ICMP/ICMP6 events and other error
messages.
Signed-off-by: David Howells <dhowells@redhat.com>

494337c9

10 May, 2018 1 commit

libceph: add osd_req_op_extent_osd_data_bvecs() · 0010f705

Authored by Ilya Dryomov on May 04, 2018

... and store num_bvecs for client code's convenience.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>

0010f705

08 May, 2018 1 commit

net: flow_dissector: fix typo 'can by' to 'can be' · 53bc017f

Authored by Wolfram Sang on May 06, 2018

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53bc017f

07 May, 2018 1 commit

mac80211: fix kernel-doc "bad line" warning · d1361b32

Authored by Randy Dunlap on Apr 26, 2018

Fix 88 instances of a kernel-doc warning:
  ../include/net/mac80211.h:2083: warning: bad line:  >
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

d1361b32

05 May, 2018 1 commit

net: phy: broadcom: add support for BCM89610 PHY · 23b83922

Authored by Bhadram Varka on May 02, 2018

It adds support for BCM89610 (Single-Port 10/100/1000BASE-T)
transceiver which is used in P3310 Tegra186 platform.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23b83922

04 May, 2018 2 commits

MAINTAINERS & files: Canonize the e-mails I use at files · 32590819

Authored by Mauro Carvalho Chehab on Apr 25, 2018

From now on, I'll start using my @kernel.org as my development e-mail.

As such, let's remove the entries that point to the old
mchehab@s-opensource.com at MAINTAINERS file.

For the files written with a copyright with mchehab@s-opensource,
let's keep Samsung on their names, using mchehab+samsung@kernel.org,
in order to keep pointing to my employer, which sponsors the work.

For the files written before I joined Samsung (on July 4, 2013),
let's just use mchehab@kernel.org.

For bug reports, we can simply point to just kernel.org, as
this will reach my mchehab+samsung inbox anyway.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Brian Warner <brian.warner@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

32590819

sched/core: Introduce set_special_state() · b5bf9a90

Authored by Peter Zijlstra on Apr 30, 2018

Gaurav reported a perceived problem with TASK_PARKED, which turned out
to be a broken wait-loop pattern in __kthread_parkme(), but the
reported issue can (and does) in fact happen for states that do not do
condition based sleeps.

When the 'current->state = TASK_RUNNING' store of a previous
(concurrent) try_to_wake_up() collides with the setting of a 'special'
sleep state, we can lose the sleep state.

Normal condition based wait-loops are immune to this problem, but
sleep states that are not condition based are subject to it.

There already is a fix for TASK_DEAD. Abstract that and also apply it
to TASK_STOPPED and TASK_TRACED, both of which are also without
condition based wait-loop.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

b5bf9a90

03 May, 2018 3 commits

bdi: wake up concurrent wb_shutdown() callers. · 8236b0ae

Authored by Tetsuo Handa on May 02, 2018

syzbot is reporting hung tasks at wait_on_bit(WB_shutting_down) in
wb_shutdown() [1]. This seems to be because commit 5318ce7d ("bdi:
Shutdown writeback on all cgwbs in cgwb_bdi_destroy()") forgot to call
wake_up_bit(WB_shutting_down) after clear_bit(WB_shutting_down).

Introduce a helper function clear_and_wake_up_bit() and use it, in order
to avoid similar errors in future.

[1] https://syzkaller.appspot.com/bug?id=b297474817af98d5796bc544e1bb806fc3da0e5e
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+c0cf869505e03bdf1a24@syzkaller.appspotmail.com>
Fixes: 5318ce7d ("bdi: Shutdown writeback on all cgwbs in cgwb_bdi_destroy()")
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8236b0ae

kthread, sched/wait: Fix kthread_parkme() completion issue · 85f1abe0

Authored by Peter Zijlstra on May 01, 2018

Even with the wait-loop fixed, there is a further issue with
kthread_parkme(). Upon hotplug, when we do takedown_cpu(),
smpboot_park_threads() can return before all those threads are in fact
blocked, due to the placement of the complete() in __kthread_parkme().

When that happens, sched_cpu_dying() -> migrate_tasks() can end up
migrating such a still runnable task onto another CPU.

Normally the task will have hit schedule() and gone to sleep by the
time we do kthread_unpark(), which will then do __kthread_bind() to
re-bind the task to the correct CPU.

However, when we lose the initial TASK_PARKED store to the concurrent
wakeup issue described previously, do the complete(), get migrated, it
is possible to either:

 - observe kthread_unpark()'s clearing of SHOULD_PARK and terminate
   the park and set TASK_RUNNING, or

 - __kthread_bind()'s wait_task_inactive() to observe the competing
   TASK_RUNNING store.

Either way the WARN() in __kthread_bind() will trigger and fail to
correctly set the CPU affinity.

Fix this by only issuing the complete() when the kthread has scheduled
out. This does away with all the icky 'still running' nonsense.

The alternative is to promote TASK_PARKED to a special state, this
guarantees wait_task_inactive() cannot observe a 'stale' TASK_RUNNING
and we'll end up doing the right thing, but this preserves the whole
icky business of potentially migrating the still runnable thing.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

85f1abe0

ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6" · 30ca22e4

Authored by Ido Schimmel on May 02, 2018

This reverts commit edd7ceb7 ("ipv6: Allow non-gateway ECMP for
IPv6").

Eric reported a division by zero in rt6_multipath_rebalance() which is
caused by above commit that considers identical local routes to be
siblings. The division by zero happens because a nexthop weight is not
set for local routes.

Revert the commit as it does not fix a bug and has side effects.

To reproduce:

# ip -6 address add 2001:db8::1/64 dev dummy0
# ip -6 address add 2001:db8::1/64 dev dummy1

Fixes: edd7ceb7 ("ipv6: Allow non-gateway ECMP for IPv6")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30ca22e4

02 May, 2018 5 commits

Revert "vhost: make msg padding explicit" · c818aa88

Authored by Michael S. Tsirkin on May 02, 2018

This reverts commit 93c0d549c4c5a7382ad70de6b86610b7aae57406.

Unfortunately the padding will break 32 bit userspace.
Ouch. Need to add some compat code, revert for now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c818aa88

net/tls: Don't recursively call push_record during tls_write_space callbacks · c212d2c7

Authored by Dave Watson on May 01, 2018

It is reported that in some cases, write_space may be called in
do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again:

[  660.468802]  ? do_tcp_sendpages+0x8d/0x580
[  660.468826]  ? tls_push_sg+0x74/0x130 [tls]
[  660.468852]  ? tls_push_record+0x24a/0x390 [tls]
[  660.468880]  ? tls_write_space+0x6a/0x80 [tls]
...

tls_push_sg already does a loop over all sending sg's, so ignore
any tls_write_space notifications until we are done sending.
We then have to call the previous write_space to wake up
poll() waiters after we are done with the send loop.
Reported-by: Andre Tomt <andre@tomt.net>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c212d2c7
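
The fix described above amounts to a re-entrancy guard: notifications that arrive while the send loop is running are ignored, and waiters are woken once after the loop finishes. Below is a user-space sketch of that pattern. The context struct, its fields and every `_model` function are hypothetical and do not reproduce the TLS code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context; the field names are illustrative only. */
struct tx_ctx_model {
	bool in_send;                /* set while the send loop is running */
	int  pending_chunks;         /* chunks still to be sent */
	int  depth;                  /* current nesting of push_model() */
	int  max_depth;              /* deepest nesting seen so far */
	int  prev_write_space_calls; /* wakeups passed to the previous callback */
};

static void write_space_model(struct tx_ctx_model *ctx);

/* Stand-in for the transport: sending a chunk may fire write_space. */
static void transport_send_model(struct tx_ctx_model *ctx)
{
	ctx->pending_chunks--;
	write_space_model(ctx);	/* notification arrives mid-send */
}

/* Send loop: already iterates over everything that is pending. */
static void push_model(struct tx_ctx_model *ctx)
{
	ctx->depth++;
	if (ctx->depth > ctx->max_depth)
		ctx->max_depth = ctx->depth;

	ctx->in_send = true;
	while (ctx->pending_chunks > 0)
		transport_send_model(ctx);
	ctx->in_send = false;

	/* Wake poll()-style waiters once, after the loop has finished. */
	ctx->prev_write_space_calls++;
	ctx->depth--;
}

/* Callback: ignore notifications that arrive while we are sending. */
static void write_space_model(struct tx_ctx_model *ctx)
{
	if (ctx->in_send)
		return;
	push_model(ctx);
}
```

Without the `in_send` check, each notification raised during a send would re-enter `push_model()`, which is the recursion shown in the backtrace.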

ipv6: Allow non-gateway ECMP for IPv6 · edd7ceb7

Authored by Thomas Winter on May 01, 2018

It is valid to have static routes where the nexthop
is an interface not an address such as tunnels.
For IPv4 it was possible to use ECMP on these routes
but not for IPv6.
Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz>
Cc: David Ahern <dsahern@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

edd7ceb7

vhost: make msg padding explicit · de08481a

Authored by Michael S. Tsirkin on Apr 27, 2018

There's a 32 bit hole just after type. It's best to
give it a name, this way the compiler is forced to initialize
it with the rest of the structure.
Reported-by: Kevin Easton <kevin@guarana.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de08481a
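
The "hole just after type" can be made visible with `offsetof`. The sketch below uses two hypothetical structs (not the vhost message definition) to compare an implicit hole with a named padding field. It assumes a 32-bit `type` field followed by a 64-bit payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message with a 32-bit field followed by a 64-bit payload. */
struct msg_implicit_model {
	uint32_t type;
	/* On ABIs that align uint64_t to 8 bytes, 4 unnamed padding bytes sit here. */
	uint64_t payload;
};

/* Same message with the hole given a name, so initializers cover it. */
struct msg_explicit_model {
	uint32_t type;
	uint32_t padding;
	uint64_t payload;
};

/* The explicit layout is fixed: the payload always starts at offset 8. */
_Static_assert(offsetof(struct msg_explicit_model, payload) == 8,
	       "payload offset is pinned by the named padding");

/* Report whether the two layouts agree on the ABI this is compiled for. */
static inline int layouts_match_model(void)
{
	return offsetof(struct msg_implicit_model, payload) ==
	       offsetof(struct msg_explicit_model, payload);
}
```

On ABIs that align 64-bit integers to only 4 bytes inside structs (32-bit x86 is one such case), the implicit struct has no hole and the two layouts differ. That is consistent with the revert above, which notes that the explicit padding breaks 32 bit userspace.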

sunrpc: Fix latency trace point crashes · 98eb6cf2

Authored by Chuck Lever on May 01, 2018

If the rpc_task survived longer than the transport, task->tk_xprt
points to freed memory by the time rpc_count_iostats_metrics runs.
Replace the references to task->tk_xprt with references to the
task's tk_client.

Reported-by: syzbot+27db1f90e2b972a5f2d3@syzkaller.appspotmail.com
Fixes: 40bf7eb3 ('sunrpc: Add static trace point to report ...')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

98eb6cf2

29 Apr, 2018 1 commit

<linux/stringhash.h>: fix end_name_hash() for 64bit long · 19b9ad67

Authored by Amir Goldstein on Feb 05, 2018

The comment claims that this helper will try not to lose bits, but for
64bit long it loses the high bits before hashing 64bit long into 32bit
int.  Use the helper hash_long() to do the right thing for 64bit long.
For 32bit long, there is no change.

All the callers of end_name_hash() either assign the result to
qstr->hash, which is u32 or return the result as an int value (e.g.
full_name_hash()).  Change the helper return type to int to conform to
its users.

[ It took me a while to apply this, because my initial reaction to it
  was - incorrectly - that it could make for slower code.

  After having looked more at it, I take back all my complaints about
  the patch, Amir was right and I was mis-reading things or just being
  stupid.

  I also don't worry too much about the possible performance impact of
  this on 64-bit, since most architectures that actually care about
  performance end up not using this very much (the dcache code is the
  most performance-critical, but the word-at-a-time case uses its own
  hashing anyway).

  So this ends up being mostly used for filesystems that do their own
  degraded hashing (usually because they want a case-insensitive
  comparison function).

  A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
  and then this potentially makes things more expensive on 64-bit
  architectures with slow or lacking multipliers even for the normal
  case.

  That said, realistically the only such architecture I can think of is
  PA-RISC. Nobody really cares about performance on that, it's more of a
  "look ma, I've got warts^W an odd machine" platform.

  So the patch is fine, and all my initial worries were just misplaced
  from not looking at this properly.   - Linus ]
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

19b9ad67
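
The difference between truncating a 64-bit hash and folding it can be shown with two small helpers. This sketch assumes that the fold is a multiplicative hash using a 64-bit golden-ratio constant. The helper names are hypothetical, and the code is a model of the idea rather than the kernel's hash_long().

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit golden-ratio multiplier, assumed here for the multiplicative fold. */
#define GOLDEN_RATIO_64_MODEL 0x61C8864680B583EBull

/* Truncation: keep only the low 32 bits, so the high bits are lost. */
static inline uint32_t fold_truncate_model(uint64_t hash)
{
	return (uint32_t)hash;
}

/* Multiplicative fold: the high input bits influence the 32-bit result. */
static inline uint32_t fold_multiply_model(uint64_t hash)
{
	return (uint32_t)((hash * GOLDEN_RATIO_64_MODEL) >> 32);
}
```

Two inputs that differ only above bit 31, such as `1` and `1 + (1ull << 32)`, collide under truncation but produce different values under the multiplicative fold.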

28 Apr, 2018 1 commit

x86/headers/UAPI: Move DISABLE_EXITS KVM capability bits to the UAPI · 5e62493f

Authored by KarimAllah Ahmed on Apr 17, 2018

Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of
capabilities.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

5e62493f

27 Apr, 2018 5 commits

KVM: arm/arm64: vgic: Fix source vcpu issues for GICv2 SGI · 53692908

Authored by Marc Zyngier on Apr 18, 2018

Now that we make sure we don't inject multiple instances of the
same GICv2 SGI at the same time, we've made another bug more
obvious:

If we exit with an active SGI, we completely lose track of which
vcpu it came from. On the next entry, we restore it with 0 as a
source, and if that wasn't the right one, too bad. While this
doesn't seem to trouble GIC-400, the architectural model gets
offended and doesn't deactivate the interrupt on EOI.

Another connected issue is that we will happily make pending
an interrupt from another vcpu, overriding the above zero with
something that is just as inconsistent. Don't do that.

The final issue is that we signal a maintenance interrupt when
no pending interrupts are present in the LR. Assuming we've fixed
the two issues above, we end up in a situation where we keep
exiting as soon as we've reached the active state, and are not
able to inject the following pending interrupt.

The fix comes in 3 parts:
- GICv2 SGIs have their source vcpu saved if they are active on
  exit, and restored on entry
- Multi-SGIs cannot go via the Pending+Active state, as this would
  corrupt the source field
- Multi-SGIs are converted to using MI on EOI instead of NPIE

Fixes: 16ca6a60 ("KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

53692908

usb: gadget: composite Allow for larger configuration descriptors · ed769520

Authored by Joel Pepper on Apr 26, 2018

The composite framework allows us to create gadgets composed from many
different functions, which need to fit into a single configuration
descriptor.

Some functions (like uvc) can produce configuration descriptors upwards
of 2500 bytes on their own.

This patch increases the limit from 1024 bytes to 4096.
Signed-off-by: Joel Pepper <joel.pepper@rwth-aachen.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>

ed769520

genirq/irq_sim: Use the SPDX license identifier in the header · b5c5f395

Authored by Bartosz Golaszewski on Apr 26, 2018

Use C-style comment for the identifier as per
Documentation/process/license-rules.rst and remove the license boilerplate.
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180426200747.8344-2-brgl@bgdev.pl

b5c5f395

net/mlx5: Fix mlx5_get_vector_affinity function · 6082d9c9

Authored by Israel Rukshin on Apr 12, 2018

Adding the vector offset when calling mlx5_vector2eqn() is wrong.
This is because mlx5_vector2eqn() checks whether the EQ index is equal to
the vector number, and because the internal completion vectors that mlx5
allocates don't get an EQ index.

The second problem here is that using effective_affinity_mask gives the same
CPU for different vectors.
This leads to unmapped queues when calling it from blk_mq_rdma_map_queues().
This doesn't happen when using affinity_hint mask.

Fixes: 2572cf57 ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0")
Fixes: 05e0cc84 ("net/mlx5: Fix get vector affinity helper function")
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

6082d9c9

tracing: initcall: Ordered comparison of function pointers · 0566e40c

Authored by Rishabh Bhatnagar on Apr 25, 2018

Using initcall_t in the __field macro generates the following warning
with clang version 6.0:

include/trace/events/initcall.h:34:3: warning: ordered comparison of
function pointers ('initcall_t' (aka 'int (*)(void)') and 'initcall_t')

The __field macro expands to the __field_ext macro, which does an
is_signed_type check on the type argument. Since initcall_t is defined as a function
pointer, using it as the type in the __field macro, leads to an ordered
comparison of function pointer warning, inside the check. Using
__field_struct macro avoids the issue.

Link: http://lkml.kernel.org/r/1524699755-29388-1-git-send-email-rishabhb@codeaurora.org
Signed-off-by: Rishabh Bhatnagar <rishabhb@codeaurora.org>
[ Added comment to why we are using field_struct() ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

0566e40c
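
A signedness check of the kind described above can be written as a comparison between `(type)-1` and `(type)1`. The macro below is a sketch under that assumption, with hypothetical names. It shows why integer types are fine while a function-pointer type would turn the same expansion into an ordered comparison of function pointers.

```c
#include <assert.h>

/* Signedness test: a signed type yields -1 < 1, an unsigned type does not. */
#define IS_SIGNED_TYPE_MODEL(type) (((type)(-1)) < (type)1)

/* Hypothetical function-pointer type shaped like the one in the warning. */
typedef int (*initcall_model_t)(void);

/*
 * IS_SIGNED_TYPE_MODEL(initcall_model_t) would expand to
 * ((initcall_model_t)(-1)) < (initcall_model_t)1, an ordered comparison
 * between two function pointers. It is deliberately not used here.
 */
```

For integer types the macro evaluates to 1 for signed types and 0 for unsigned ones, so a field macro that skips the check avoids the warning without changing anything for integer fields.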

26 Apr, 2018 4 commits

blk-mq: fix sysfs inflight counter · bf0ddaba

Authored by Omar Sandoval on Apr 26, 2018

When the blk-mq inflight implementation was added, /proc/diskstats was
converted to use it, but /sys/block/$dev/inflight was not. Fix it by
adding another helper to count in-flight requests by data direction.

Fixes: f299b7c7 ("blk-mq: provide internal in-flight variant")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bf0ddaba

Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME · a3ed0e43

Authored by Thomas Gleixner on Apr 25, 2018

Revert commits

92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks")
127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior")
7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior")
d6c7270e ("timekeeping: Remove boot time specific code")
f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior")
d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock")
72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock")

As stated in the pull request for the unification of CLOCK_MONOTONIC and
CLOCK_BOOTTIME, it was clear that we might have to revert the change.

As reported by several folks systemd and other applications rely on the
documented behaviour of CLOCK_MONOTONIC on Linux and break with the above
changes. After resume daemons time out and other timeout related issues are
observed. Rafael compiled this list:

* systemd kills daemons on resume, after >WatchdogSec seconds
  of suspending (Genki Sky).  [Verified that that's because systemd uses
  CLOCK_MONOTONIC and expects it to not include the suspend time.]

* systemd-journald misbehaves after resume:
  systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal
corrupted or uncleanly shut down, renaming and replacing.
  (Mike Galbraith).

* NetworkManager reports "networking disabled" and networking is broken
  after resume 50% of the time (Pavel).  [May be because of systemd.]

* MATE desktop dims the display and starts the screensaver right after
  system resume (Pavel).

* Full system hang during resume (me).  [May be due to systemd or NM or both.]

That happens on debian and open suse systems.
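
The behaviour these applications depend on can be observed from userspace. The sketch below assumes a Linux system where both clocks are available; the function names are hypothetical. It reads CLOCK_MONOTONIC and CLOCK_BOOTTIME and reports their difference, which approximates the time spent suspended since boot when CLOCK_MONOTONIC does not include suspend time.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* CLOCK_BOOTTIME is Linux-specific */
#endif
#include <time.h>

/* Read one clock as seconds; returns 0 on success, -1 on failure. */
int demo_read_clock(clockid_t id, double *seconds)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    *seconds = (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
    return 0;
}

/*
 * Store the gap between CLOCK_BOOTTIME and CLOCK_MONOTONIC in *suspended.
 * The two clocks are read one after the other, so the result also
 * contains a tiny amount of time spent between the two calls.
 */
int demo_suspended_seconds(double *suspended)
{
    double mono, boot;

    if (demo_read_clock(CLOCK_MONOTONIC, &mono) != 0 ||
        demo_read_clock(CLOCK_BOOTTIME, &boot) != 0)
        return -1;
    *suspended = boot - mono;
    return 0;
}
```

A watchdog that measures elapsed time with CLOCK_MONOTONIC expects a suspend not to count toward its timeout. If the clock starts including suspend time, the timeout appears to expire as soon as the system resumes, which matches the symptoms listed above.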

It's sad that these problems were caught neither in -next nor by those
folks who expressed interest in this change.
Reported-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Reported-by: Genki Sky <sky@genki.is>,
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kevin Easton <kevin@guarana.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salyzyn <salyzyn@android.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>

a3ed0e43

remoteproc: fix crashed parameter logic on stop call · fcd58037

Committed by Arnaud Pouliquen on Apr 10, 2018

Fix rproc_add_subdev parameter name and inverse the crashed logic.

Fixes: 880f5b38 ("remoteproc: Pass type of shutdown to subdev remove")
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>

fcd58037

virtio: add ability to iterate over vqs · 24a7e4d2

Committed by Michael S. Tsirkin on Apr 20, 2018

For cleanup it's helpful to be able to simply scan all vqs and discard
all data. Add an iterator to do that.
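
The "scan all vqs and discard all data" pattern can be sketched in userspace as follows. The types and the `demo_for_each_vq` macro are hypothetical and use a plain singly linked list; they only model the shape of such an iterator and are not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical queue: a link to the next queue and a count of pending buffers. */
struct demo_vq {
    struct demo_vq *next;
    int pending;
};

/* Hypothetical device that owns a list of queues. */
struct demo_device {
    struct demo_vq *vqs;
};

/* Visit every queue of a device in list order. */
#define demo_for_each_vq(dev, vq) \
    for ((vq) = (dev)->vqs; (vq) != NULL; (vq) = (vq)->next)

/* Drop all pending data on every queue; returns how many buffers were dropped. */
int demo_discard_all(struct demo_device *dev)
{
    struct demo_vq *vq;
    int discarded = 0;

    demo_for_each_vq(dev, vq) {
        discarded += vq->pending;
        vq->pending = 0;
    }
    return discarded;
}
```

With an iterator like this, cleanup code does not need to know how many queues a device has or keep its own array of them.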

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

24a7e4d2

25 Apr, 2018 2 commits

block: mq: Add some minor doc for core structs · fe644072

Committed by Linus Walleij on Apr 20, 2018

As it came up in discussion on the mailing list that the semantic
meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely
obvious to everyone, let's add some minimal kerneldoc for a
starter.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fe644072

ALSA: control: Hardening for potential Spectre v1 · 088e861e

Committed by Takashi Iwai on Apr 24, 2018

As recently Smatch suggested, a few places in ALSA control core codes
may expand the array directly from the user-space value with
speculation:

sound/core/control.c:1003 snd_ctl_elem_lock() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:1031 snd_ctl_elem_unlock() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:844 snd_ctl_elem_info() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:891 snd_ctl_elem_read() warn: potential spectre issue 'kctl->vd'
sound/core/control.c:939 snd_ctl_elem_write() warn: potential spectre issue 'kctl->vd'

Although all these seem doing only the first load without further
reference, we may want to stay in a safer side, so hardening with
array_index_nospec() would still make sense.

In this patch, we put array_index_nospec() to the common
snd_ctl_get_ioff*() helpers instead of each caller. These helpers are
also referred from some drivers, too, and basically all usages are to
calculate the array index from the user-space value, hence it's better
to cover there.
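
The arithmetic behind this kind of index clamping can be sketched in userspace. The functions below are hypothetical illustrations of a branch-free mask, assuming a compiler that shifts negative signed values arithmetically (as GCC and Clang do). They are not the kernel's array_index_nospec() and do not by themselves mitigate speculation, since a userspace compiler is free to turn the arithmetic back into a branch.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return all ones when index < size and 0 otherwise, without branching.
 * If index >= size, either index or (size - 1 - index) has its top bit
 * set, so the complement is non-negative and the shift yields 0.
 * Assumes size is at most LONG_MAX.
 */
unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an index: unchanged when in bounds, forced to 0 when out of bounds. */
unsigned long demo_index_clamp(unsigned long index, unsigned long size)
{
    return index & demo_index_mask(index, size);
}
```

Placing the clamp inside a shared offset helper, as the message describes for snd_ctl_get_ioff*(), means every caller that derives an array index from a user-space value is covered without being changed individually.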

BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

088e861e

openanolis / cloud-kernel synced successfully more than 1 year ago