提交 · e64e4018d572710c44f42c923d4ac059f0a23320 · openeuler / Kernel

02 6月, 2018 1 次提交

bpf: fix uapi hole for 32 bit compat applications · 36f9814a

由 Daniel Borkmann 提交于 6月 02, 2018

In 64 bit, we have a 4 byte hole between ifindex and netns_dev in the
case of struct bpf_map_info but also struct bpf_prog_info. In net-next
commit b85fab0e ("bpf: Add gpl_compatible flag to struct bpf_prog_info")
added a bitfield into it to expose some flags related to programs. Thus,
add an unnamed __u32 bitfield for both so that alignment keeps the same
in both 32 and 64 bit cases, and can be naturally extended from there
as in b85fab0e.

Before:

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */
	__u64                      netns_dev;            /*    44     8 */
	__u64                      netns_ino;            /*    52     8 */

	/* size: 64, cachelines: 1, members: 10 */
	/* padding: 4 */
  };

After (same as on 64 bit):

  # file test.o
  test.o: ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped
  # pahole test.o
  struct bpf_map_info {
	__u32                      type;                 /*     0     4 */
	__u32                      id;                   /*     4     4 */
	__u32                      key_size;             /*     8     4 */
	__u32                      value_size;           /*    12     4 */
	__u32                      max_entries;          /*    16     4 */
	__u32                      map_flags;            /*    20     4 */
	char                       name[16];             /*    24    16 */
	__u32                      ifindex;              /*    40     4 */

	/* XXX 4 bytes hole, try to pack */

	__u64                      netns_dev;            /*    48     8 */
	__u64                      netns_ino;            /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */

	/* size: 64, cachelines: 1, members: 10 */
	/* sum members: 60, holes: 1, sum holes: 4 */
  };
Reported-by: NDmitry V. Levin <ldv@altlinux.org>
Reported-by: NEugene Syromiatnikov <esyr@redhat.com>
Fixes: 52775b33 ("bpf: offload: report device information about offloaded maps")
Fixes: 675fc275 ("bpf: offload: report device information for offloaded programs")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

36f9814a

31 5月, 2018 1 次提交

drm/bridge/synopsys: dw-hdmi: fix dw_hdmi_setup_rx_sense · c32048d9

由 Neil Armstrong 提交于 5月 30, 2018

The dw_hdmi_setup_rx_sense exported function should not use struct device
to recover the dw-hdmi context using drvdata, but take struct dw_hdmi
directly like other exported functions.

This caused a regression using Meson DRM on S905X since v4.17-rc1 :

Internal error: Oops: 96000007 [#1] PREEMPT SMP
[...]
CPU: 0 PID: 124 Comm: irq/32-dw_hdmi_ Not tainted 4.17.0-rc7 #2
Hardware name: Libre Technology CC (DT)
[...]
pc : osq_lock+0x54/0x188
lr : __mutex_lock.isra.0+0x74/0x530
[...]
Process irq/32-dw_hdmi_ (pid: 124, stack limit = 0x00000000adf418cb)
Call trace:
  osq_lock+0x54/0x188
  __mutex_lock_slowpath+0x10/0x18
  mutex_lock+0x30/0x38
  __dw_hdmi_setup_rx_sense+0x28/0x98
  dw_hdmi_setup_rx_sense+0x10/0x18
  dw_hdmi_top_thread_irq+0x2c/0x50
  irq_thread_fn+0x28/0x68
  irq_thread+0x10c/0x1a0
  kthread+0x128/0x130
  ret_from_fork+0x10/0x18
 Code: 34000964 d00050a2 51000484 9135c042 (f864d844)
 ---[ end trace 945641e1fbbc07da ]---
 note: irq/32-dw_hdmi_[124] exited with preempt_count 1
 genirq: exiting task "irq/32-dw_hdmi_" (124) is an active IRQ thread (irq 32)

Fixes: eea034af ("drm/bridge/synopsys: dw-hdmi: don't clobber drvdata")
Signed-off-by: NNeil Armstrong <narmstrong@baylibre.com>
Tested-by: NKoen Kooi <koen@dominion.thruhere.net>
Signed-off-by: NSean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1527673438-20643-1-git-send-email-narmstrong@baylibre.com

c32048d9

26 5月, 2018 2 次提交

mm/memory_hotplug: fix leftover use of struct page during hotplug · a2155861

由 Jonathan Cameron 提交于 5月 25, 2018

The case of a new numa node got missed in avoiding using the node info
from page_struct during hotplug.  In this path we have a call to
register_mem_sect_under_node (which allows us to specify it is hotplug
so don't change the node), via link_mem_sections which unfortunately
does not.

Fix is to pass check_nid through link_mem_sections as well and disable
it in the new numa node path.

Note the bug only 'sometimes' manifests depending on what happens to be
in the struct page structures - there are lots of them and it only needs
to match one of them.

The result of the bug is that (with a new memory only node) we never
successfully call register_mem_sect_under_node so don't get the memory
associated with the node in sysfs and meminfo for the node doesn't
report it.

It came up whilst testing some arm64 hotplug patches, but appears to be
universal.  Whilst I'm triggering it by removing then reinserting memory
to a node with no other elements (thus making the node disappear then
appear again), it appears it would happen on hotplugging memory where
there was none before and it doesn't seem to be related the arm64
patches.

These patches call __add_pages (where most of the issue was fixed by
Pavel's patch).  If there is a node at the time of the __add_pages call
then all is well as it calls register_mem_sect_under_node from there
with check_nid set to false.  Without a node that function returns
having not done the sysfs related stuff as there is no node to use.
This is expected but it is the resulting path that fails...

Exact path to the problem is as follows:

 mm/memory_hotplug.c: add_memory_resource()

   The node is not online so we enter the 'if (new_node)' twice, on the
   second such block there is a call to link_mem_sections which calls
   into

  drivers/node.c: link_mem_sections() which calls

  drivers/node.c: register_mem_sect_under_node() which calls
     get_nid_for_pfn and keeps trying until the output of that matches
     the expected node (passed all the way down from
     add_memory_resource)

It is effectively the same fix as the one referred to in the fixes tag
just in the code path for a new node where the comments point out we
have to rerun the link creation because it will have failed in
register_new_memory (as there was no node at the time).  (actually that
comment is wrong now as we don't have register_new_memory any more it
got renamed to hotplug_memory_register in Pavel's patch).

Link: http://lkml.kernel.org/r/20180504085311.1240-1-Jonathan.Cameron@huawei.com
Fixes: fc44f7f9 ("mm/memory_hotplug: don't read nid from struct page during hotplug")
Signed-off-by: NJonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: NPavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a2155861

mm: do not warn on offline nodes unless the specific node is explicitly requested · 8addc2d0

由 Michal Hocko 提交于 5月 25, 2018

Oscar has noticed that we splat

   WARNING: CPU: 0 PID: 64 at ./include/linux/gfp.h:467 vmemmap_alloc_block+0x4e/0xc9
   [...]
   CPU: 0 PID: 64 Comm: kworker/u4:1 Tainted: G        W   E     4.17.0-rc5-next-20180517-1-default+ #66
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
   Workqueue: kacpi_hotplug acpi_hotplug_work_fn
   Call Trace:
    vmemmap_populate+0xf2/0x2ae
    sparse_mem_map_populate+0x28/0x35
    sparse_add_one_section+0x4c/0x187
    __add_pages+0xe7/0x1a0
    add_pages+0x16/0x70
    add_memory_resource+0xa3/0x1d0
    add_memory+0xe4/0x110
    acpi_memory_device_add+0x134/0x2e0
    acpi_bus_attach+0xd9/0x190
    acpi_bus_scan+0x37/0x70
    acpi_device_hotplug+0x389/0x4e0
    acpi_hotplug_work_fn+0x1a/0x30
    process_one_work+0x146/0x340
    worker_thread+0x47/0x3e0
    kthread+0xf5/0x130
    ret_from_fork+0x35/0x40

when adding memory to a node that is currently offline.

The VM_WARN_ON is just too loud without a good reason.  In this
particular case we are doing

	alloc_pages_node(node, GFP_KERNEL|__GFP_RETRY_MAYFAIL|__GFP_NOWARN, order)

so we do not insist on allocating from the given node (it is more a
hint) so we can fall back to any other populated node and moreover we
explicitly ask to not warn for the allocation failure.

Soften the warning only to cases when somebody asks for the given node
explicitly by __GFP_THISNODE.

Link: http://lkml.kernel.org/r/20180523125555.30039-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
Reported-by: NOscar Salvador <osalvador@techadventures.net>
Tested-by: NOscar Salvador <osalvador@techadventures.net>
Reviewed-by: NPavel Tatashin <pasha.tatashin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8addc2d0

25 5月, 2018 3 次提交

sched, tracing: Fix trace_sched_pi_setprio() for deboosting · 4ff648de

由 Sebastian Andrzej Siewior 提交于 5月 24, 2018

Since the following commit:

  b91473ff ("sched,tracing: Update trace_sched_pi_setprio()")

the sched_pi_setprio trace point shows the "newprio" during a deboost:

  |futex sched_pi_setprio: comm=futex_requeue_p pid"34 oldprio newprio=3D98
  |futex sched_switch: prev_comm=futex_requeue_p prev_pid"34 prev_prio=120

This patch open codes __rt_effective_prio() in the tracepoint as the
'newprio' to get the old behaviour back / the correct priority:

  |futex sched_pi_setprio: comm=futex_requeue_p pid"20 oldprio newprio=3D120
  |futex sched_switch: prev_comm=futex_requeue_p prev_pid"20 prev_prio=120

Peter suggested to open code the new priority so people using tracehook
could get the deadline data out.
Reported-by: NMansky Christian <man@keba.com>
Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: b91473ff ("sched,tracing: Update trace_sched_pi_setprio()")
Link: http://lkml.kernel.org/r/20180524132647.gg6ziuogczdmjjzu@linutronix.deSigned-off-by: NIngo Molnar <mingo@kernel.org>

4ff648de

ppp: remove the PPPIOCDETACH ioctl · af8d3c7c

由 Eric Biggers 提交于 5月 23, 2018

The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
before f_count has reached 0, which is fundamentally a bad idea.  It
does check 'f_count < 2', which excludes concurrent operations on the
file since they would only be possible with a shared fd table, in which
case each fdget() would take a file reference.  However, it fails to
account for the fact that even with 'f_count == 1' the file can still be
linked into epoll instances.  As reported by syzbot, this can trivially
be used to cause a use-after-free.

Yet, the only known user of PPPIOCDETACH is pppd versions older than
ppp-2.4.2, which was released almost 15 years ago (November 2003).
Also, PPPIOCDETACH apparently stopped working reliably at around the
same time, when the f_count check was added to the kernel, e.g. see
https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
check makes PPPIOCDETACH only work in single-threaded applications; it
always fails if called from a multithreaded application.

All pppd versions released in the last 15 years just close() the file
descriptor instead.

Therefore, instead of hacking around this bug by exporting epoll
internals to modules, and probably missing other related bugs, just
remove the PPPIOCDETACH ioctl and see if anyone actually notices.  Leave
a stub in place that prints a one-time warning and returns EINVAL.

Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: NEric Biggers <ebiggers@google.com>
Acked-by: NPaul Mackerras <paulus@ozlabs.org>
Reviewed-by: NGuillaume Nault <g.nault@alphalink.fr>
Tested-by: NGuillaume Nault <g.nault@alphalink.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

af8d3c7c

Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" · d883c6cf

由 Joonsoo Kim 提交于 5月 23, 2018

This reverts the following commits that change CMA design in MM.

 3d2054ad ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")

 1d47a3ec ("mm/cma: remove ALLOC_CMA")

 bad8c6c0 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

Ville reported a following error on i386.

  Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
  microcode: microcode updated early to revision 0x4, date = 2013-06-28
  Initializing CPU#0
  Initializing HighMem for node 0 (000377fe:00118000)
  Initializing Movable for node 0 (00000001:00118000)
  BUG: Bad page state in process swapper  pfn:377fe
  page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
  flags: 0x80000000()
  raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
  page dumped because: nonzero mapcount
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
  Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
  Call Trace:
   dump_stack+0x60/0x96
   bad_page+0x9a/0x100
   free_pages_check_bad+0x3f/0x60
   free_pcppages_bulk+0x29d/0x5b0
   free_unref_page_commit+0x84/0xb0
   free_unref_page+0x3e/0x70
   __free_pages+0x1d/0x20
   free_highmem_page+0x19/0x40
   add_highpages_with_active_regions+0xab/0xeb
   set_highmem_pages_init+0x66/0x73
   mem_init+0x1b/0x1d7
   start_kernel+0x17a/0x363
   i386_start_kernel+0x95/0x99
   startup_32_smp+0x164/0x168

The reason for this error is that the span of MOVABLE_ZONE is extended
to whole node span for future CMA initialization, and, normal memory is
wrongly freed here.  I submitted the fix and it seems to work, but,
another problem happened.

It's so late time to fix the later problem so I decide to reverting the
series.
Reported-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: NLaura Abbott <labbott@redhat.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d883c6cf

24 5月, 2018 2 次提交

bpf: properly enforce index mask to prevent out-of-bounds speculation · c93552c4

由 Daniel Borkmann 提交于 5月 24, 2018

While reviewing the verifier code, I recently noticed that the
following two program variants in relation to tail calls can be
loaded.

Variant 1:

  # bpftool p d x i 15
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:5]
    3: (05) goto pc+2
    4: (18) r2 = map[id:6]
    6: (b7) r3 = 7
    7: (35) if r3 >= 0xa0 goto pc+2
    8: (54) (u32) r3 &= (u32) 255
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 5
    5: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B
  # bpftool m s i 6
    6: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B

Variant 2:

  # bpftool p d x i 20
    0: (15) if r1 == 0x0 goto pc+3
    1: (18) r2 = map[id:8]
    3: (05) goto pc+2
    4: (18) r2 = map[id:7]
    6: (b7) r3 = 7
    7: (35) if r3 >= 0x4 goto pc+2
    8: (54) (u32) r3 &= (u32) 3
    9: (85) call bpf_tail_call#12
   10: (b7) r0 = 1
   11: (95) exit

  # bpftool m s i 8
    8: prog_array  flags 0x0
        key 4B  value 4B  max_entries 160  memlock 4096B
  # bpftool m s i 7
    7: prog_array  flags 0x0
        key 4B  value 4B  max_entries 4  memlock 4096B

In both cases the index masking inserted by the verifier in order
to control out of bounds speculation from a CPU via b2157399
("bpf: prevent out-of-bounds speculation") seems to be incorrect
in what it is enforcing. In the 1st variant, the mask is applied
from the map with the significantly larger number of entries where
we would allow to a certain degree out of bounds speculation for
the smaller map, and in the 2nd variant where the mask is applied
from the map with the smaller number of entries, we get buggy
behavior since we truncate the index of the larger map.

The original intent from commit b2157399 is to reject such
occasions where two or more different tail call maps are used
in the same tail call helper invocation. However, the check on
the BPF_MAP_PTR_POISON is never hit since we never poisoned the
saved pointer in the first place! We do this explicitly for map
lookups but in case of tail calls we basically used the tail
call map in insn_aux_data that was processed in the most recent
path which the verifier walked. Thus any prior path that stored
a pointer in insn_aux_data at the helper location was always
overridden.

Fix it by moving the map pointer poison logic into a small helper
that covers both BPF helpers with the same logic. After that in
fixup_bpf_calls() the poison check is then hit for tail calls
and the program rejected. Latter only happens in unprivileged
case since this is the *only* occasion where a rewrite needs to
happen, and where such rewrite is specific to the map (max_entries,
index_mask). In the privileged case the rewrite is generic for
the insn->imm / insn->code update so multiple maps from different
paths can be handled just fine since all the remaining logic
happens in the instruction processing itself. This is similar
to the case of map lookups: in case there is a collision of
maps in fixup_bpf_calls() we must skip the inlined rewrite since
this will turn the generic instruction sequence into a non-
generic one. Thus the patch_call_imm will simply update the
insn->imm location where the bpf_map_lookup_elem() will later
take care of the dispatch. Given we need this 'poison' state
as a check, the information of whether a map is an unpriv_array
gets lost, so enforcing it prior to that needs an additional
state. In general this check is needed since there are some
complex and tail call intensive BPF programs out there where
LLVM tends to generate such code occasionally. We therefore
convert the map_ptr rather into map_state to store all this
w/o extra memory overhead, and the bit whether one of the maps
involved in the collision was from an unpriv_array thus needs
to be retained as well there.

Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>

c93552c4

IB/uverbs: Fix uverbs_attr_get_obj · f4602cbb

由 Jason Gunthorpe 提交于 5月 22, 2018

The err pointer comes from uverbs_attr_get, not from the uobject member,
which does not store an ERR_PTR.

Fixes: be934cca ("IB/uverbs: Add device memory registration ioctl support")
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
Reviewed-by: NLeon Romanovsky <leonro@mellanox.com>

f4602cbb

23 5月, 2018 1 次提交

sctp: fix the issue that flags are ignored when using kernel_connect · 644fbdea

由 Xin Long 提交于 5月 20, 2018

Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags
param can't be passed into its proto .connect where this flags is really
needed.

sctp works around it by getting flags from socket file in __sctp_connect.
It works for connecting from userspace, as inherently the user sock has
socket file and it passes f_flags as the flags param into the proto_ops
.connect.

However, the sock created by sock_create_kern doesn't have a socket file,
and it passes the flags (like O_NONBLOCK) by using the flags param in
kernel_connect, which calls proto_ops .connect later.

So to fix it, this patch defines a new proto_ops .connect for sctp,
sctp_inet_connect, which calls __sctp_connect() directly with this
flags param. After this, the sctp's proto .connect can be removed.

Note that sctp_inet_connect doesn't need to do some checks that are not
needed for sctp, which makes thing better than with inet_dgram_connect.
Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NNeil Horman <nhorman@tuxdriver.com>
Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: NMichal Kubecek <mkubecek@suse.cz>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

644fbdea

20 5月, 2018 1 次提交

bpf: Prevent memory disambiguation attack · af86ca4e

由 Alexei Starovoitov 提交于 5月 15, 2018

Detect code patterns where malicious 'speculative store bypass' can be used
and sanitize such patterns.

 39: (bf) r3 = r10
 40: (07) r3 += -216
 41: (79) r8 = *(u64 *)(r7 +0)   // slow read
 42: (7a) *(u64 *)(r10 -72) = 0  // verifier inserts this instruction
 43: (7b) *(u64 *)(r8 +0) = r3   // this store becomes slow due to r8
 44: (79) r1 = *(u64 *)(r6 +0)   // cpu speculatively executes this load
 45: (71) r2 = *(u8 *)(r1 +0)    // speculatively arbitrary 'load byte'
                                 // is now sanitized

Above code after x86 JIT becomes:
 e5: mov    %rbp,%rdx
 e8: add    $0xffffffffffffff28,%rdx
 ef: mov    0x0(%r13),%r14
 f3: movq   $0x0,-0x48(%rbp)
 fb: mov    %rdx,0x0(%r14)
 ff: mov    0x0(%rbx),%rdi
103: movzbq 0x0(%rdi),%rsi
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

af86ca4e

19 5月, 2018 1 次提交

include/linux/mm.h: add new inline function vmf_error() · d97baf94

由 Souptick Joarder 提交于 5月 18, 2018

Many places in drivers/ file systems, error was handled in a common way
like below:

ret = (ret == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS;

vmf_error() will replace this and return vm_fault_t type err.

A lot of drivers and filesystems currently have a rather complex mapping
of errno-to-VM_FAULT code. We have been able to eliminate a lot of it
by just returning VM_FAULT codes directly from functions which are
called exclusively from the fault handling path.

Some functions can be called both from the fault handler and other
context which are expecting an errno, so they have to continue to return
an errno. Some users still need to choose different behaviour for
different errnos, but vmf_error() captures the essential error
translation that's common to all users, and those that need to handle
additional errors can handle them first.

Link: http://lkml.kernel.org/r/20180510174826.GA14268@jordon-HP-15-Notebook-PCSigned-off-by: NSouptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: NMatthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d97baf94

18 5月, 2018 3 次提交

cfg80211: further limit wiphy names to 64 bytes · 81459649

由 Eric Biggers 提交于 5月 14, 2018

wiphy names were recently limited to 128 bytes by commit a7cfebcb
("cfg80211: limit wiphy names to 128 bytes").  As it turns out though,
this isn't sufficient because dev_vprintk_emit() needs the syslog header
string "SUBSYSTEM=ieee80211\0DEVICE=+ieee80211:$devname" to fit into 128
bytes.  This triggered the "device/subsystem name too long" WARN when
the device name was >= 90 bytes.  As before, this was reproduced by
syzbot by sending an HWSIM_CMD_NEW_RADIO command to the MAC80211_HWSIM
generic netlink family.

Fix it by further limiting wiphy names to 64 bytes.

Reported-by: syzbot+e64565577af34b3768dc@syzkaller.appspotmail.com
Fixes: a7cfebcb ("cfg80211: limit wiphy names to 128 bytes")
Signed-off-by: NEric Biggers <ebiggers@google.com>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

81459649

tls: don't use stack memory in a scatterlist · 8ab6ffba

由 Matt Mullins 提交于 5月 16, 2018

scatterlist code expects virt_to_page() to work, which fails with
CONFIG_VMAP_STACK=y.

Fixes: c46234eb ("tls: RX path for ktls")
Signed-off-by: NMatt Mullins <mmullins@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8ab6ffba

proc: do not access cmdline nor environ from file-backed areas · 7f7ccc2c

由 Willy Tarreau 提交于 5月 11, 2018

proc_pid_cmdline_read() and environ_read() directly access the target
process' VM to retrieve the command line and environment. If this
process remaps these areas onto a file via mmap(), the requesting
process may experience various issues such as extra delays if the
underlying device is slow to respond.

Let's simply refuse to access file-backed areas in these functions.
For this we add a new FOLL_ANON gup flag that is passed to all calls
to access_remote_vm(). The code already takes care of such failures
(including unmapped areas). Accesses via /proc/pid/mem were not
changed though.

This was assigned CVE-2018-1120.

Note for stable backports: the patch may apply to kernels prior to 4.11
but silently miss one location; it must be checked that no call to
access_remote_vm() keeps zero as the last argument.
Reported-by: NQualys Security Advisory <qsa@qualys.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: NWilly Tarreau <w@1wt.eu>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7f7ccc2c

17 5月, 2018 1 次提交

net/mlx5: Fix build break when CONFIG_SMP=n · e3ca3488

由 Saeed Mahameed 提交于 5月 14, 2018

Avoid using the kernel's irq_descriptor and return IRQ vector affinity
directly from the driver.

This fixes the following build break when CONFIG_SMP=n

include/linux/mlx5/driver.h: In function ‘mlx5_get_vector_affinity_hint’:
include/linux/mlx5/driver.h:1299:13: error:
        ‘struct irq_desc’ has no member named ‘affinity_hint’

Fixes: 6082d9c9 ("net/mlx5: Fix mlx5_get_vector_affinity function")
Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
CC: Randy Dunlap <rdunlap@infradead.org>
CC: Guenter Roeck <linux@roeck-us.net>
CC: Thomas Gleixner <tglx@linutronix.de>
Tested-by: NIsrael Rukshin <israelr@mellanox.com>
Reported-by: Nkbuild test robot <lkp@intel.com>
Reported-by: NRandy Dunlap <rdunlap@infradead.org>
Tested-by: NRandy Dunlap <rdunlap@infradead.org>
Acked-by: NThomas Gleixner <tglx@linutronix.de>
Tested-by: NGuenter Roeck <linux@roeck-us.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e3ca3488

16 5月, 2018 2 次提交

locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN · 5a817641

由 Waiman Long 提交于 5月 15, 2018

The filesystem freezing code needs to transfer ownership of a rwsem
embedded in a percpu-rwsem from the task that does the freezing to
another one that does the thawing by calling percpu_rwsem_release()
after freezing and percpu_rwsem_acquire() before thawing.

However, the new rwsem debug code runs afoul with this scheme by warning
that the task that releases the rwsem isn't the one that acquires it,
as reported by Amir Goldstein:

  DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
  WARNING: CPU: 1 PID: 1401 at /home/amir/build/src/linux/kernel/locking/rwsem.c:133 up_write+0x59/0x79

  Call Trace:
   percpu_up_write+0x1f/0x28
   thaw_super_locked+0xdf/0x120
   do_vfs_ioctl+0x270/0x5f1
   ksys_ioctl+0x52/0x71
   __x64_sys_ioctl+0x16/0x19
   do_syscall_64+0x5d/0x167
   entry_SYSCALL_64_after_hwframe+0x49/0xbe

To work properly with the rwsem debug code, we need to annotate that the
rwsem ownership is unknown during the tranfer period until a brave soul
comes forward to acquire the ownership. During that period, optimistic
spinning will be disabled.
Reported-by: NAmir Goldstein <amir73il@gmail.com>
Tested-by: NAmir Goldstein <amir73il@gmail.com>
Signed-off-by: NWaiman Long <longman@redhat.com>
Acked-by: NPeter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Theodore Y. Ts'o <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-fsdevel@vger.kernel.org
Link: http://lkml.kernel.org/r/1526420991-21213-3-git-send-email-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

5a817641

IB/umem: Use the correct mm during ib_umem_release · 8e907ed4

由 Lidong Chen 提交于 5月 08, 2018

User-space may invoke ibv_reg_mr and ibv_dereg_mr in different threads.

If ibv_dereg_mr is called after the thread which invoked ibv_reg_mr has
exited, get_pid_task will return NULL and ib_umem_release will not
decrease mm->pinned_vm.

Instead of using threads to locate the mm, use the overall tgid from the
ib_ucontext struct instead. This matches the behavior of ODP and
disassociate in handling the mm of the process that called ibv_reg_mr.

Cc: <stable@vger.kernel.org>
Fixes: 87773dd5 ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Signed-off-by: NLidong Chen <lidongchen@tencent.com>
Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>

8e907ed4

15 5月, 2018 2 次提交

mtd: rawnand: Fix return type of __DIVIDE() when called with 32-bit · 9f825e74

由 Geert Uytterhoeven 提交于 5月 14, 2018

The __DIVIDE() macro checks whether it is called with a 32-bit or 64-bit
dividend, to select the appropriate divide-and-round-up routine.
As the check uses the ternary operator, the result will always be
promoted to a type that can hold both results, i.e. unsigned long long.

When using this result in a division on a 32-bit system, this may lead
to link errors like:

    ERROR: "__udivdi3" [drivers/mtd/nand/raw/nand.ko] undefined!

Fix this by casting the result of the division to the type of the
dividend.

Fixes: 8878b126 ("mtd: nand: add ->exec_op() implementation")
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>

9f825e74

tracing/x86/xen: Remove zero data size trace events trace_xen_mmu_flush_tlb{_all} · 45dd9b06

由 Steven Rostedt (VMware) 提交于 5月 09, 2018

Doing an audit of trace events, I discovered two trace events in the xen
subsystem that use a hack to create zero data size trace events. This is not
what trace events are for. Trace events add memory footprint overhead, and
if all you need to do is see if a function is hit or not, simply make that
function noinline and use function tracer filtering.

Worse yet, the hack used was:

 __array(char, x, 0)

Which creates a static string of zero in length. There's assumptions about
such constructs in ftrace that this is a dynamic string that is nul
terminated. This is not the case with these tracepoints and can cause
problems in various parts of ftrace.

Nuke the trace events!

Link: http://lkml.kernel.org/r/20180509144605.5a220327@gandalf.local.home

Cc: stable@vger.kernel.org
Fixes: 95a7d768 ("xen/mmu: Use Xen specific TLB flush instead of the generic one.")
Reviewed-by: NJuergen Gross <jgross@suse.com>
Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>

45dd9b06

14 5月, 2018 3 次提交

afs: Add a tracepoint to record callbacks from unlisted servers · 3709a399

由 David Howells 提交于 5月 11, 2018

Add a tracepoint to record callbacks from servers for which we don't have a
record.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

3709a399

mtd: Fix comparison in map_word_andequal() · ea739a28

由 Ben Hutchings 提交于 5月 10, 2018

Commit 9e343e87 ("mtd: cfi: convert inline functions to macros")
changed map_word_andequal() into a macro, but also changed the right
hand side of the comparison from val3 to val2.  Change it back to use
val3 on the right hand side.

Thankfully this did not cause a regression because all callers
currently pass the same argument for val2 and val3.

Fixes: 9e343e87 ("mtd: cfi: convert inline functions to macros")
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>

ea739a28

efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode · 0b3225ab

由 Ard Biesheuvel 提交于 5月 04, 2018

Mixed mode allows a kernel built for x86_64 to interact with 32-bit
EFI firmware, but requires us to define all struct definitions carefully
when it comes to pointer sizes.

'struct efi_pci_io_protocol_32' currently uses a 'void *' for the
'romimage' field, which will be interpreted as a 64-bit field
on such kernels, potentially resulting in bogus memory references
and subsequent crashes.
Tested-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NArd Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180504060003.19618-13-ard.biesheuvel@linaro.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>

0b3225ab

12 5月, 2018 3 次提交

rbtree: include rcu.h · 2075b16e

由 Sebastian Andrzej Siewior 提交于 5月 11, 2018

Since commit c1adf200 ("Introduce rb_replace_node_rcu()")
rbtree_augmented.h uses RCU related data structures but does not include
the header file. It works as long as it gets somehow included before
that and fails otherwise.

Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.deSigned-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2075b16e

mm, oom: fix concurrent munlock and oom reaper unmap, v3 · 27ae357f

由 David Rientjes 提交于 5月 11, 2018

Since exit_mmap() is done without the protection of mm->mmap_sem, it is
possible for the oom reaper to concurrently operate on an mm until
MMF_OOM_SKIP is set.

This allows munlock_vma_pages_all() to concurrently run while the oom
reaper is operating on a vma.  Since munlock_vma_pages_range() depends
on clearing VM_LOCKED from vm_flags before actually doing the munlock to
determine if any other vmas are locking the same memory, the check for
VM_LOCKED in the oom reaper is racy.

This is especially noticeable on architectures such as powerpc where
clearing a huge pmd requires serialize_against_pte_lookup().  If the pmd
is zapped by the oom reaper during follow_page_mask() after the check
for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a
kernel oops.

Fix this by manually freeing all possible memory from the mm before
doing the munlock and then setting MMF_OOM_SKIP.  The oom reaper can not
run on the mm anymore so the munlock is safe to do in exit_mmap().  It
also matches the logic that the oom reaper currently uses for
determining when to set MMF_OOM_SKIP itself, so there's no new risk of
excessive oom killing.

This issue fixes CVE-2018-1000200.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com
Fixes: 21292580 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: NDavid Rientjes <rientjes@google.com>
Suggested-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: NMichal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

27ae357f

do d_instantiate/unlock_new_inode combinations safely · 1e2e547a

由 Al Viro 提交于 5月 04, 2018

For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
	lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex.  Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.

	Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode().  All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.

Cc: stable@kernel.org	# 2.6.29 and later
Tested-by: NMike Marshall <hubcap@omnibond.com>
Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

1e2e547a

11 5月, 2018 4 次提交

bonding: send learning packets for vlans on slave · 21706ee8

由 Debabrata Banerjee 提交于 5月 09, 2018

There was a regression at some point from the intended functionality of
commit f60c3704 ("bonding: Fix alb mode to only use first level
vlans.")

Given the return value vlan_get_encap_level() we need to store the nest
level of the bond device, and then compare the vlan's encap level to
this. Without this, this check always fails and learning packets are
never sent.

In addition, this same commit caused a regression in the behavior of
balance_alb, which requires learning packets be sent for all interfaces
using the slave's mac in order to load balance properly. For vlan's
that have not set a user mac, we can send after checking one bit.
Otherwise we need send the set mac, albeit defeating rx load balancing
for that vlan.
Signed-off-by: NDebabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

21706ee8

KVM: Extend MAX_IRQ_ROUTES to 4096 for all archs · ddc9cfb7

由 Wanpeng Li 提交于 4月 26, 2018

Our virtual machines make use of device assignment by configuring
12 NVMe disks for high I/O performance. Each NVMe device has 129
MSI-X Table entries:
Capabilities: [50] MSI-X: Enable+ Count=129 Masked-Vector table: BAR=0 offset=00002000
The windows virtual machines fail to boot since they will map the number of
MSI-table entries that the NVMe hardware reported to the bus to msi routing
table, this will exceed the 1024. This patch extends MAX_IRQ_ROUTES to 4096
for all archs, in the future this might be extended again if needed.
Reviewed-by: NCornelia Huck <cohuck@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmÃ¡Å™ <rkrcmar@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: NWanpeng Li <wanpengli@tencent.com>
Signed-off-by: NTonny Lu <tonnylu@tencent.com>
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>

ddc9cfb7

rxrpc: Trace UDP transmission failure · 6b47fe1d

由 David Howells 提交于 5月 10, 2018

Add a tracepoint to log transmission failure from the UDP transport socket
being used by AF_RXRPC.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

6b47fe1d

rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages · 494337c9

由 David Howells 提交于 5月 10, 2018

Add a tracepoint to log received ICMP/ICMP6 events and other error
messages.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

494337c9

10 5月, 2018 1 次提交

libceph: add osd_req_op_extent_osd_data_bvecs() · 0010f705

由 Ilya Dryomov 提交于 5月 04, 2018

... and store num_bvecs for client code's convenience.
Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
Reviewed-by: NJeff Layton <jlayton@redhat.com>
Reviewed-by: N"Yan, Zheng" <zyan@redhat.com>

0010f705

09 5月, 2018 1 次提交

netfilter: nf_tables: bogus EBUSY in chain deletions · bb7b40ae

由 Pablo Neira Ayuso 提交于 5月 08, 2018

When removing a rule that jumps to chain and such chain in the same
batch, this bogusly hits EBUSY. Add activate and deactivate operations
to expression that can be called from the preparation and the
commit/abort phases.
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

bb7b40ae

08 5月, 2018 1 次提交

net: flow_dissector: fix typo 'can by' to 'can be' · 53bc017f

由 Wolfram Sang 提交于 5月 06, 2018

Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

53bc017f

07 5月, 2018 1 次提交

mac80211: fix kernel-doc "bad line" warning · d1361b32

由 Randy Dunlap 提交于 4月 26, 2018

Fix 88 instances of a kernel-doc warning:
  ../include/net/mac80211.h:2083: warning: bad line:  >
Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
Cc: linux-wireless@vger.kernel.org
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: NJohannes Berg <johannes.berg@intel.com>

d1361b32

05 5月, 2018 4 次提交

seccomp: Move speculation migitation control to arch code · 8bf37d8c

由 Thomas Gleixner 提交于 5月 04, 2018

The migitation control is simpler to implement in architecture code as it
avoids the extra function call to check the mode. Aside of that having an
explicit seccomp enabled mode in the architecture mitigations would require
even more workarounds.

Move it into architecture code and provide a weak function in the seccomp
code. Remove the 'which' argument as this allows the architecture to decide
which mitigations are relevant for seccomp.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

8bf37d8c

seccomp: Add filter flag to opt-out of SSB mitigation · 00a02d0c

由 Kees Cook 提交于 5月 03, 2018

If a seccomp user is not interested in Speculative Store Bypass mitigation
by default, it can set the new SECCOMP_FILTER_FLAG_SPEC_ALLOW flag when
adding filters.
Signed-off-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

00a02d0c

prctl: Add force disable speculation · 356e4bff

由 Thomas Gleixner 提交于 5月 03, 2018

For certain use cases it is desired to enforce mitigations so they cannot
be undone afterwards. That's important for loader stubs which want to
prevent a child from disabling the mitigation again. Will also be used for
seccomp(). The extra state preserving of the prctl state for SSB is a
preparatory step for EBPF dymanic speculation control.
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

356e4bff

net: phy: broadcom: add support for BCM89610 PHY · 23b83922

由 Bhadram Varka 提交于 5月 02, 2018

It adds support for BCM89610 (Single-Port 10/100/1000BASE-T)
transceiver which is used in P3310 Tegra186 platform.
Signed-off-by: NBhadram Varka <vbhadram@nvidia.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

23b83922

04 5月, 2018 2 次提交

MAINTAINERS & files: Canonize the e-mails I use at files · 32590819

由 Mauro Carvalho Chehab 提交于 4月 25, 2018

From now on, I'll start using my @kernel.org as my development e-mail.

As such, let's remove the entries that point to the old
mchehab@s-opensource.com at MAINTAINERS file.

For the files written with a copyright with mchehab@s-opensource,
let's keep Samsung on their names, using mchehab+samsung@kernel.org,
in order to keep pointing to my employer, with sponsors the work.

For the files written before I join Samsung (on July, 4 2013),
let's just use mchehab@kernel.org.

For bug reports, we can simply point to just kernel.org, as
this will reach my mchehab+samsung inbox anyway.
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: NBrian Warner <brian.warner@samsung.com>
Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>

32590819

sched/core: Introduce set_special_state() · b5bf9a90

由 Peter Zijlstra 提交于 4月 30, 2018

Gaurav reported a perceived problem with TASK_PARKED, which turned out
to be a broken wait-loop pattern in __kthread_parkme(), but the
reported issue can (and does) in fact happen for states that do not do
condition based sleeps.

When the 'current->state = TASK_RUNNING' store of a previous
(concurrent) try_to_wake_up() collides with the setting of a 'special'
sleep state, we can loose the sleep state.

Normal condition based wait-loops are immune to this problem, but for
sleep states that are not condition based are subject to this problem.

There already is a fix for TASK_DEAD. Abstract that and also apply it
to TASK_STOPPED and TASK_TRACED, both of which are also without
condition based wait-loop.
Reported-by: NGaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: NIngo Molnar <mingo@kernel.org>

b5bf9a90

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功