提交 · 603607de0c574e693e7cfda1aba64bf4439875b7 · openeuler / Kernel

13 2月, 2016 14 次提交

由 David S. Miller 提交于 2月 13, 2016

Gregory CLEMENT says:

====================
mvneta fixes for SMP

Following this bug report:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173 and the
suggestions from Russell King, I reviewed all the code involving
multi-CPU. It ended with this series of patches which should improve
the stability of the driver.

During my test I found another bug which is fixed by new patch (the
second one of this new version of the series)

The two first patches fix real bugs, the others fix potential issues
in the driver.

Changelog:

v1 -> v2
Fix spinlock comment. Pointed by David Miller

v2 -> v3
 - Fix typos and mistake in the comments. Pointed by Sergei Shtylyov
 - Add a new patch fixing the CPU choice in mvneta_percpu_elect
 - Use lock in last patch to prevent remaining race condition. Pointed
   by Jisheng
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

603607de

net: mvneta: Fix race condition during stopping · 120cfa50

由 Gregory CLEMENT 提交于 2月 04, 2016

When stopping the port, the CPU notifier are still there whereas the
mvneta_stop_dev function calls mvneta_percpu_disable() on each CPUs.
It was possible to have a new CPU coming at this point which could be
racy.

This patch adds a flag preventing executing the code notifier for a new
CPU when the port is stopping. It also uses the spinlock introduces
previously. To avoid the deadlock, the lock has been moved outside the
mvneta_percpu_elect function.
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

120cfa50

net: mvneta: The mvneta_percpu_elect function should be atomic · 5888511e

由 Gregory CLEMENT 提交于 2月 04, 2016

Electing a CPU must be done in an atomic way: it should be done after or
before the removal/insertion of a CPU and this function is not reentrant.

During the loop of mvneta_percpu_elect we associates the queues to the
CPUs, if there is a topology change during this loop, then the mapping
between the CPUs and the queues could be wrong. During this loop the
interrupt mask is also updating for each CPUs, It should not be changed
in the same time by other part of the driver.

This patch adds spinlock to create the needed critical sections.
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5888511e

net: mvneta: Modify the queue related fields from each cpu · db488c10

由 Gregory CLEMENT 提交于 2月 04, 2016

In the MVNETA_INTR_* registers, the queues related fields are per cpu,
according to the datasheet (comment in [] are added by me):
"In a multi-CPU system, bits of RX[or TX] queues for which the access by
the reading[or writing] CPU is disabled are read as 0, and cannot be
cleared[or written]."

That means that each time we want to manipulate these bits we had to do
it on each cpu and not only on the current cpu.
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

db488c10

net: mvneta: Remove unused code · cde4c0fe

由 Gregory CLEMENT 提交于 2月 04, 2016

Since the commit 2dcf75e2 ("net: mvneta: Associate RX queues with
each CPU") all the percpu irq are used and disabled at initialization, so
there is no point to disable them first.
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cde4c0fe

net: mvneta: Use on_each_cpu when possible · 6b125d63

由 Gregory CLEMENT 提交于 2月 04, 2016

Instead of using a for_each_* loop in which we just call the
smp_call_function_single macro, it is more simple to directly use the
on_each_cpu macro. Moreover, this macro ensures that the calls will be
done all at once.
Suggested-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6b125d63

net: mvneta: Fix the CPU choice in mvneta_percpu_elect · cad5d847

由 Gregory CLEMENT 提交于 2月 04, 2016

When passing to the management of multiple RX queue, the
mvneta_percpu_elect function was broken. The use of the modulo can lead
to elect the wrong cpu. For example with rxq_def=2, if the CPU 2 goes
offline and then online, we ended with the third RX queue activated in
the same time on CPU 0 and CPU2, which lead to a kernel crash.

With this fix, we don't try to get "the closer" CPU if the default CPU is
gone, now we just use CPU 0 which always be there. Thanks to this, the
code becomes more readable, easier to maintain and more predicable.

Cc: stable@vger.kernel.org
Fixes: 2dcf75e2 ("net: mvneta: Associate RX queues with each CPU")
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cad5d847

net: mvneta: Fix for_each_present_cpu usage · 129219e4

由 Gregory CLEMENT 提交于 2月 04, 2016

This patch convert the for_each_present in on_each_cpu, instead of
applying on the present cpus it will be applied only on the online cpus.
This fix a bug reported on
http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173.

Using the macro on_each_cpu (instead of a for_each_* loop) also ensures
that all the calls will be done all at once.

Fixes: f8642885 ("net: mvneta: Statically assign queues to CPUs")
Reported-by: NStefan Roese <stefan.roese@gmail.com>
Suggested-by: NJisheng Zhang <jszhang@marvell.com>
Suggested-by: NRussell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: NGregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

129219e4

vsock: Fix blocking ops call in prepare_to_wait · 59888180

由 Laura Abbott 提交于 2月 04, 2016

We receoved a bug report from someone using vmware:

WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389
__might_sleep+0x7d/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff810fa68d>] prepare_to_wait+0x2d/0x90
Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi
snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi
snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm
snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc
parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs
xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx
drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic
mptbase pata_acpi
CPU: 3 PID: 660 Comm: vmtoolsd Not tainted
4.2.0-0.rc1.git3.1.fc23.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 05/20/2014
 0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5
 0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446
 ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000
Call Trace:
 [<ffffffff818641f5>] dump_stack+0x4c/0x65
 [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0
 [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70
 [<ffffffff8112551d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
 [<ffffffff810da2bd>] __might_sleep+0x7d/0x90
 [<ffffffff812163b3>] __might_fault+0x43/0xa0
 [<ffffffff81430477>] copy_from_iter+0x87/0x2a0
 [<ffffffffa039460a>] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci]
 [<ffffffffa0394740>] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci]
 [<ffffffffa0394757>] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci]
 [<ffffffffa0394d50>] qp_enqueue_locked+0xa0/0x140 [vmw_vmci]
 [<ffffffffa039593f>] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci]
 [<ffffffffa04847bb>] vmci_transport_stream_enqueue+0x1b/0x20
[vmw_vsock_vmci_transport]
 [<ffffffffa047ae05>] vsock_stream_sendmsg+0x2c5/0x320 [vsock]
 [<ffffffff810fabd0>] ? wake_atomic_t_function+0x70/0x70
 [<ffffffff81702af8>] sock_sendmsg+0x38/0x50
 [<ffffffff81702ff4>] SYSC_sendto+0x104/0x190
 [<ffffffff8126e25a>] ? vfs_read+0x8a/0x140
 [<ffffffff817042ee>] SyS_sendto+0xe/0x10
 [<ffffffff8186d9ae>] entry_SYSCALL_64_fastpath+0x12/0x76

transport->stream_enqueue may call copy_to_user so it should
not be called inside a prepare_to_wait. Narrow the scope of
the prepare_to_wait to avoid the bad call. This also applies
to vsock_stream_recvmsg as well.
Reported-by: NVinson Lee <vlee@freedesktop.org>
Tested-by: NVinson Lee <vlee@freedesktop.org>
Signed-off-by: NLaura Abbott <labbott@fedoraproject.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

59888180

r8169:fix system hange problem. · a2cb7ec0

由 Chun-Hao Lin 提交于 2月 05, 2016

There are typos in setting RTL8168H hardware parameters. If system install
another version driver that may cuase system hang.
Signed-off-by: NChunhao Lin <hau@realtek.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a2cb7ec0

ipv4: fix memory leaks in ip_cmsg_send() callers · 91948309

由 Eric Dumazet 提交于 2月 04, 2016

Dmitry reported memory leaks of IP options allocated in
ip_cmsg_send() when/if this function returns an error.

Callers are responsible for the freeing.

Many thanks to Dmitry for the report and diagnostic.
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

91948309

net: mvpp2: Return correct error codes · c2bb7bc5

由 Amitoj Kaur Chawla 提交于 2月 04, 2016

The return value of kzalloc on failure of allocation of memory should
be -ENOMEM and not -1.

Found using Coccinelle. A simplified version of the semantic patch
used is:

//<smpl>
@@
expression *e;
position p,q;
@@

e@q = kzalloc(...);
if@p (e == NULL) {
...
return
- -1
+ -ENOMEM
;
}
//</smpl>

This function may also return -1 after calling mpp2_prs_tcam_port_map_get.
So that the function consistently returns meaningful error values on
failure, the -1 is changed to -EINVAL.
Signed-off-by: NAmitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2bb7bc5

net: cavium: liquidio: Return correct error code · 08a965ec

由 Amitoj Kaur Chawla 提交于 2月 04, 2016

The return value of vmalloc on failure of allocation of memory should
be -ENOMEM and not -1.

Found using Coccinelle. A simplified version of the semantic patch
used is:

//<smpl>
@@
expression *e;
identifier l1;
position p,q;
@@

e@q = vmalloc(...);
if@p (e == NULL) {
...
goto l1;
}
l1:
...
return -1
+ -ENOMEM
;
//</smpl

The single call site of the containing function checks whether the
returned value is -1, so this check is changed as well. The single call
site of this call site, however, only checks whether the value is not 0,
so no further change was required.
Signed-off-by: NAmitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

08a965ec

bonding: Fix ARP monitor validation · 21a75f09

由 Jay Vosburgh 提交于 2月 02, 2016

The current logic in bond_arp_rcv will accept an incoming ARP for
validation if (a) the receiving slave is either "active" (which includes
the currently active slave, or the current ARP slave) or, (b) there is a
currently active slave, and it has received an ARP since it became active.
For case (b), the receiving slave isn't the currently active slave, and is
receiving the original broadcast ARP request, not an ARP reply from the
target.

This logic can fail if there is no currently active slave. In
this situation, the ARP probe logic cycles through all slaves, assigning
each in turn as the "current_arp_slave" for one arp_interval, then setting
that one as "active," and sending an ARP probe from that slave. The
current logic expects the ARP reply to arrive on the sending
current_arp_slave, however, due to switch FDB updating delays, the reply
may be directed to another slave.

This can arise if the bonding slaves and switch are working, but
the ARP target is not responding. When the ARP target recovers, a
condition may result wherein the ARP target host replies faster than the
switch can update its forwarding table, causing each ARP reply to be sent
to the previous current_arp_slave. This will never pass the logic in
bond_arp_rcv, as neither of the above conditions (a) or (b) are met.

Some experimentation on a LAN shows ARP reply round trips in the
200 usec range, but my available switches never update their FDB in less
than 4000 usec.

This patch changes the logic in bond_arp_rcv to additionally
accept an ARP reply for validation on any slave if there is a current ARP
slave and it sent an ARP probe during the previous arp_interval.

Fixes: aeea64ac ("bonding: don't trust arp requests unless active slave really works")
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

21a75f09

12 2月, 2016 3 次提交

Merge tag 'gpio-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · c05235d5

由 Linus Torvalds 提交于 2月 11, 2016

Pull GPIO fixes from Linus Walleij:
 - Probe errorpath fix for the Altera
 - irqchip ofnode pointer added to the DaVinci driver
 - controller instance number correction for DaVinci

* tag 'gpio-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: davinci: Fix the number of controllers allocated
  gpio: davinci: Add the missing of-node pointer
  gpio: gpio-altera: Remove gpiochip on probe failure.

c05235d5

Merge tag 'platform-drivers-x86-v4.5-3' of... · da2f912a

由 Linus Torvalds 提交于 2月 11, 2016

Merge tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
 "Just two small fixes for the 4.5-rc cycle:

  intel_scu_ipcutil:
   - underflow in scu_reg_access()

  intel-hid:
   - fix incorrect entries in intel_hid_keymap"

* tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  intel_scu_ipcutil: underflow in scu_reg_access()
  intel-hid: fix incorrect entries in intel_hid_keymap

da2f912a

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 5de6ac75

由 Linus Torvalds 提交于 2月 11, 2016

Pull networking fixes from David Miller:

 1) Fix BPF handling of branch offset adjustmnets on backjumps, from
    Daniel Borkmann.

 2) Make sure selinux knows about SOCK_DESTROY netlink messages, from
    Lorenzo Colitti.

 3) Fix openvswitch tunnel mtu regression, from David Wragg.

 4) Fix ICMP handling of TCP sockets in syn_recv state, from Eric
    Dumazet.

 5) Fix SCTP user hmacid byte ordering bug, from Xin Long.

 6) Fix recursive locking in ipv6 addrconf, from Subash Abhinov
    Kasiviswanathan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  bpf: fix branch offset adjustment on backjumps after patching ctx expansion
  vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices
  geneve: Relax MTU constraints
  vxlan: Relax MTU constraints
  flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
  of: of_mdio: Add marvell, 88e1145 to whitelist of PHY compatibilities.
  selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables
  sctp: translate network order to host order when users get a hmacid
  enic: increment devcmd2 result ring in case of timeout
  tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
  net:Add sysctl_max_skb_frags
  tcp: do not drop syn_recv on all icmp reports
  ipv6: fix a lockdep splat
  unix: correctly track in-flight fds in sending process user_struct
  update be2net maintainers' email addresses
  dwc_eth_qos: Reset hardware before PHY start
  ipv6: addrconf: Fix recursive spin lock call

5de6ac75

11 2月, 2016 8 次提交

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 721675fc

由 Linus Torvalds 提交于 2月 10, 2016

Pull rdma fixes from Doug Ledford:
 "A few more minor fixes for rc3:

   - One fix to ipoib
   - One fix to core sysfs code
   - Four patches that resolve an oops found in testing of ocrdma and a
     couple other ocrdma issues"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/ocrdma: Fixing ocrdma debugfs directory remove
  RDMA/ocrdma: Fix pkey_index returned by driver in rq work completion
  RDMA/ocrdma: populate max_sge_rd in device attributes
  RDMA/ocrdma: Initialize stats resources in the driver before ib device registration.
  IB/sysfs: remove unused va_list args
  IB/IPoIB: Do not set skb truesize since using one linearskb

721675fc

bpf: fix branch offset adjustment on backjumps after patching ctx expansion · a1b14d27

由 Daniel Borkmann 提交于 2月 10, 2016

When ctx access is used, the kernel often needs to expand/rewrite
instructions, so after that patching, branch offsets have to be
adjusted for both forward and backward jumps in the new eBPF program,
but for backward jumps it fails to account the delta. Meaning, for
example, if the expansion happens exactly on the insn that sits at
the jump target, it doesn't fix up the back jump offset.

Analysis on what the check in adjust_branches() is currently doing:

  /* adjust offset of jmps if necessary */
  if (i < pos && i + insn->off + 1 > pos)
    insn->off += delta;
  else if (i > pos && i + insn->off + 1 < pos)
    insn->off -= delta;

First condition (forward jumps):

  Before:                         After:

  insns[0]                        insns[0]
  insns[1] <--- i/insn            insns[1] <--- i/insn
  insns[2] <--- pos               insns[P] <--- pos
  insns[3]                        insns[P]  `------| delta
  insns[4] <--- target_X          insns[P]   `-----|
  insns[5]                        insns[3]
                                  insns[4] <--- target_X
                                  insns[5]

First case is if we cross pos-boundary and the jump instruction was
before pos. This is handeled correctly. I.e. if i == pos, then this
would mean our jump that we currently check was the patchlet itself
that we just injected. Since such patchlets are self-contained and
have no awareness of any insns before or after the patched one, the
delta is correctly not adjusted. Also, for the second condition in
case of i + insn->off + 1 == pos, means we jump to that newly patched
instruction, so no offset adjustment are needed. That part is correct.

Second condition (backward jumps):

  Before:                         After:

  insns[0]                        insns[0]
  insns[1] <--- target_X          insns[1] <--- target_X
  insns[2] <--- pos <-- target_Y  insns[P] <--- pos <-- target_Y
  insns[3]                        insns[P]  `------| delta
  insns[4] <--- i/insn            insns[P]   `-----|
  insns[5]                        insns[3]
                                  insns[4] <--- i/insn
                                  insns[5]

Second interesting case is where we cross pos-boundary and the jump
instruction was after pos. Backward jump with i == pos would be
impossible and pose a bug somewhere in the patchlet, so the first
condition checking i > pos is okay only by itself. However, i +
insn->off + 1 < pos does not always work as intended to trigger the
adjustment. It works when jump targets would be far off where the
delta wouldn't matter. But, for example, where the fixed insn->off
before pointed to pos (target_Y), it now points to pos + delta, so
that additional room needs to be taken into account for the check.
This means that i) both tests here need to be adjusted into pos + delta,
and ii) for the second condition, the test needs to be <= as pos
itself can be a target in the backjump, too.

Fixes: 9bac3d6d ("bpf: allow extended BPF programs access skb fields")
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a1b14d27

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 74c7b2af

由 Linus Torvalds 提交于 2月 10, 2016

Pull input updates from Dmitry Torokhov:
 "Just small driver fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: colibri-vf50-ts - add missing #include <linux/of.h>
  Input: adp5589 - fix row 5 handling for adp5589
  Input: edt-ft5x06 - fix setting gain, offset, and threshold via device tree
  Input: vmmouse - fix absolute device registration
  Input: serio - drop warnings in case of EPROBE_DEFER from serio_find_driver()
  Input: cap11xx - add missing of_node_put
  Input: sirfsoc-onkey - allow modular build
  Input: xpad - remove unused function

74c7b2af

Merge branch 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 4e541699

由 Linus Torvalds 提交于 2月 10, 2016

Pull libata fixes from Tejun Heo:

 - PORTS_IMPL workaround for very early ahci controllers is misbehaving
   on new systems.  Disabled on recent ahci versions.

 - Old-style PIO state machine had a horrible locking problem.  Don't
   know how we've been getting away this far.  Fixed.

 - Other device specific updates.

* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ahci: Intel DNV device IDs SATA
  libata: fix sff host state machine locking while polling
  libata-sff: use WARN instead of BUG on illegal host state machine state
  libata: disable forced PORTS_IMPL for >= AHCI 1.3
  libata: blacklist a Viking flash model for MWDMA corruption
  drivers: ata: wake port before DMA stop for ALPM

4e541699

Merge branch 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · fb0dc5f1

由 Linus Torvalds 提交于 2月 10, 2016

Pull cgroup fixes from Tejun Heo:

 - The destruction path of cgroup objects are asynchronous and
   multi-staged and some of them ended up destroying parents before
   children leading to failures in cpu and memory controllers.  Ensure
   that parents are always destroyed after children.

 - cpuset mm node migration was performed synchronously while holding
   threadgroup and cgroup mutexes and the recent threadgroup locking
   update resulted in a possible deadlock.  The migration is best effort
   and shouldn't have been performed under those locks to begin with.
   Made asynchronous.

 - Minor documentation fix.

* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  Documentation: cgroup: Fix 'cgroup-legacy' -> 'cgroup-v1'
  cgroup: make sure a parent css isn't freed before its children
  cgroup: make sure a parent css isn't offlined before its children
  cpuset: make mm migration asynchronous

fb0dc5f1

Merge branch 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 9aece75c

由 Linus Torvalds 提交于 2月 10, 2016

Pull workqueue fixes from Tejun Heo:
 "Workqueue fixes for v4.5-rc3.

   - Remove a spurious triggering of flush dependency warning.

   - Officially break local execution guarantee of unbound work items
     and add a debug feature to flush out usages which depend on it.

   - Work around CPU -> NODE mapping becoming invalid on CPU offline.

  The branch is young but pushing out early as stable kernels are being
  affected"

* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
  workqueue: implement "workqueue.debug_force_rr_cpu" debug feature
  workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs
  Revert "workqueue: make sure delayed work run in local cpu"
  workqueue: skip flush dependency checks for legacy workqueues

9aece75c

workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup · d6e022f1

由 Tejun Heo 提交于 2月 03, 2016

When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node.  However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.

This has always been broken but hasn't triggered often enough before
874bbfe6 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue.  This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU.  The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.

While 874bbfe6 has been reverted for a different reason making the
bug less visible again, it can still happen.  Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround.  The long term solution is keeping CPU
-> NODE mapping stable across CPU off/online cycles which is being
worked on.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NMike Galbraith <umgwanakikbuti@gmail.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com

d6e022f1

ahci: Intel DNV device IDs SATA · 342decff

由 Alexandra Yates 提交于 2月 05, 2016

Adding Intel codename DNV platform device IDs for SATA.
Signed-off-by: NAlexandra Yates <alexandra.yates@linux.intel.com>
Signed-off-by: NTejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org

342decff

10 2月, 2016 15 次提交

Merge branch 'ovs-tunnel-mtu' · 1902750b

由 David S. Miller 提交于 2月 10, 2016

David Wragg says:

====================
Set a large MTU on ovs-created tunnel devices

Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
transmit vxlan packets of any size, constrained only by the ability to
send out the resulting packets.  4.3 introduced netdevs corresponding
to tunnel vports.  These netdevs have an MTU, which limits the size of
a packet that can be successfully encapsulated.  The default MTU
values are low (1500 or less), which is awkwardly small in the context
of physical networks supporting jumbo frames, and leads to a
conspicuous change in behaviour for userspace.

This patch series sets the MTU on openvswitch-created netdevs to be
the relevant maximum (i.e. the maximum IP packet size minus any
relevant overhead), effectively restoring the behaviour prior to 4.3.

Where relevant, the limits on MTU values that can be directly set on
the netdevs are also relaxed.

Changes in v2:
* Extend to all openvswitch tunnel types, i.e. gre and geneve as well
* Use IP_MAX_MTU

Changes in v3:
* Fix block comment style
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1902750b

vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices · 7e059158

由 David Wragg 提交于 2月 10, 2016

Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
transmit vxlan packets of any size, constrained only by the ability to
send out the resulting packets. 4.3 introduced netdevs corresponding
to tunnel vports. These netdevs have an MTU, which limits the size of
a packet that can be successfully encapsulated. The default MTU
values are low (1500 or less), which is awkwardly small in the context
of physical networks supporting jumbo frames, and leads to a
conspicuous change in behaviour for userspace.

Instead, set the MTU on openvswitch-created netdevs to be the relevant
maximum (i.e. the maximum IP packet size minus any relevant overhead),
effectively restoring the behaviour prior to 4.3.
Signed-off-by: NDavid Wragg <david@weave.works>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e059158

geneve: Relax MTU constraints · 55e5bfb5

由 David Wragg 提交于 2月 10, 2016

Allow the MTU of geneve devices to be set to large values, in order to
exploit underlying networks with larger frame sizes.

GENEVE does not have a fixed encapsulation overhead (an openvswitch
rule can add variable length options), so there is no relevant maximum
MTU to enforce.  A maximum of IP_MAX_MTU is used instead.
Encapsulated packets that are too big for the underlying network will
get dropped on the floor.
Signed-off-by: NDavid Wragg <david@weave.works>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55e5bfb5

vxlan: Relax MTU constraints · 72564b59

由 David Wragg 提交于 2月 10, 2016

Allow the MTU of vxlan devices without an underlying device to be set
to larger values (up to a maximum based on IP packet limits and vxlan
overhead).

Previously, their MTUs could not be set to higher than the
conventional ethernet value of 1500.  This is a very arbitrary value
in the context of vxlan, and prevented vxlan devices from being able
to take advantage of jumbo frames etc.

The default MTU remains 1500, for compatibility.
Signed-off-by: NDavid Wragg <david@weave.works>
Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

72564b59

gpio: davinci: Fix the number of controllers allocated · 6ec9249a

由 Lokesh Vutla 提交于 1月 28, 2016

Driver only needs to allocate for [ngpio / 32] controllers,
as each controller handles 32 gpios. But the current driver
allocates for ngpio of which the extra allocated are unused.
Fix it be registering only the required number of controllers.
Signed-off-by: NLokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: NKeerthy <j-keerthy@ti.com>
Reviewed-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

6ec9249a

gpio: davinci: Add the missing of-node pointer · 310a7e60

由 Keerthy 提交于 1月 28, 2016

Currently the first parameter of irq_domain_add_legacy is NULL.
irq_find_host function returns NULL when we do not populate the of_node
and hence irq_of_parse_and_map call fails whenever we want to request a
gpio irq. This fixes the request_irq failures for gpio interrupts.
Signed-off-by: NKeerthy <j-keerthy@ti.com>
Reviewed-by: NGrygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: NLinus Walleij <linus.walleij@linaro.org>

310a7e60

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · 2178cbc6

由 Linus Torvalds 提交于 2月 09, 2016

Pull module fixes from Rusty Russell:
 "Fix for async_probe module param added in 4.3 (clearly not widely used
  yet), and a much more interesting kallsyms race which has been around
  approximately forever.  This fix is more invasive, and will require
  some care in backporting, but I hated all the bandaids I could think
  of, so...

  There are some more coming, which are only for breakages introduced
  this cycle (livepatch), but wanted these in now"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modules: fix longstanding /proc/kallsyms vs module insertion race.
  module: wrapper for symbol name.
  modules: fix modparam async_probe request

2178cbc6

Input: colibri-vf50-ts - add missing #include <linux/of.h> · ff84dabe

由 Geert Uytterhoeven 提交于 2月 09, 2016

drivers/input/touchscreen/colibri-vf50-ts.c: In function ‘vf50_ts_probe’:
drivers/input/touchscreen/colibri-vf50-ts.c:302: error: implicit declaration of function ‘of_property_read_u32’
Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>

ff84dabe

Input: adp5589 - fix row 5 handling for adp5589 · 7008dafb

由 Lars-Peter Clausen 提交于 2月 09, 2016

The adp5589 has row 5, don't skip it when creating the GPIO mapping.
Otherwise the pin gets reserved as used and it is not possible to use it as
a GPIO.
Signed-off-by: NLars-Peter Clausen <lars@metafoo.de>
Acked-by: NMichael Hennerich <michael.hennerich@analog.com>
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>

7008dafb

Input: edt-ft5x06 - fix setting gain, offset, and threshold via device tree · dc262dfa

由 Philipp Zabel 提交于 2月 09, 2016

A recent patch broke parsing the gain, offset, and threshold parameters
from device tree. Instead of setting the cached values and writing them
to the correct registers during probe, it would write the values from DT
into the register address variables and never write them to the chip
during normal operation.

Fixes: 2e23b7a9 ("Input: edt-ft5x06 - use generic properties API")
Signed-off-by: NPhilipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: NDmitry Torokhov <dmitry.torokhov@gmail.com>

dc262dfa

workqueue: implement "workqueue.debug_force_rr_cpu" debug feature · f303fccb

由 Tejun Heo 提交于 2月 09, 2016

Workqueue used to guarantee local execution for work items queued
without explicit target CPU.  The guarantee is gone now which can
break some usages in subtle ways.  To flush out those cases, this
patch implements a debug feature which forces round-robin CPU
selection for all such work items.

The debug feature defaults to off and can be enabled with a kernel
parameter.  The default can be flipped with a debug config option.

If you hit this commit during bisection, please refer to 041bd12e
("Revert "workqueue: make sure delayed work run in local cpu"") for
more information and ping me.
Signed-off-by: NTejun Heo <tj@kernel.org>

f303fccb

workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs · ef557180

由 Mike Galbraith 提交于 2月 09, 2016

WORK_CPU_UNBOUND work items queued to a bound workqueue always run
locally.  This is a good thing normally, but not when the user has
asked us to keep unbound work away from certain CPUs.  Round robin
these to wq_unbound_cpumask CPUs instead, as perturbation avoidance
trumps performance.

tj: Cosmetic and comment changes.  WARN_ON_ONCE() dropped from empty
    (wq_unbound_cpumask AND cpu_online_mask).  If we want that, it
    should be done when config changes.
Signed-off-by: NMike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

ef557180

Revert "workqueue: make sure delayed work run in local cpu" · 041bd12e

由 Tejun Heo 提交于 2月 09, 2016

This reverts commit 874bbfe6.

Workqueue used to implicity guarantee that work items queued without
explicit CPU specified are put on the local CPU.  Recent changes in
timer broke the guarantee and led to vmstat breakage which was fixed
by 176bed1d ("vmstat: explicitly schedule per-cpu work on the CPU
we need it to run on").

vmstat is the most likely to expose the issue and it's quite possible
that there are other similar problems which are a lot more difficult
to trigger.  As a preventive measure, 874bbfe6 ("workqueue: make
sure delayed work run in local cpu") was applied to restore the local
CPU guarnatee.  Unfortunately, the change exposed a bug in timer code
which got fixed by 22b886dd ("timers: Use proper base migration in
add_timer_on()").  Due to code restructuring, the commit couldn't be
backported beyond certain point and stable kernels which only had
874bbfe6 started crashing.

The local CPU guarantee was accidental more than anything else and we
want to get rid of it anyway.  As, with the vmstat case fixed,
874bbfe6 is causing more problems than it's fixing, it has been
decided to take the chance and officially break the guarantee by
reverting the commit.  A debug feature will be added to force foreign
CPU assignment to expose cases relying on the guarantee and fixes for
the individual cases will be backported to stable as necessary.
Signed-off-by: NTejun Heo <tj@kernel.org>
Fixes: 874bbfe6 ("workqueue: make sure delayed work run in local cpu")
Link: http://lkml.kernel.org/g/20160120211926.GJ10810@quack.suse.cz
Cc: stable@vger.kernel.org
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jan Kara <jack@suse.cz>
Cc: Shaohua Li <shli@fb.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Bilik <daniel.bilik@neosystem.cz>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>

041bd12e

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 7cf91adc

由 Linus Torvalds 提交于 2月 09, 2016

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

  API:
   - Fix async algif_skcipher, it was broken by recent fixes.
   - Fix potential race condition in algif_skcipher with ctx.
   - Fix potential memory corruption in algif_skcipher.
   - Add missing lock to crypto_user when doing an alg dump.

  Drivers:
   - marvell/cesa was testing the wrong variable for NULL after
     allocation.
   - Fix potential double-free in atmel-sha.
   - Fix illegal call to sleepin function from atomic context in
     atmel-sha"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: marvell/cesa - fix test in mv_cesa_dev_dma_init()
  crypto: atmel-sha - remove calls of clk_prepare() from atomic contexts
  crypto: atmel-sha - fix atmel_sha_remove()
  crypto: algif_skcipher - Do not set MAY_BACKLOG on the async path
  crypto: algif_skcipher - Do not dereference ctx without socket lock
  crypto: algif_skcipher - Do not assume that req is unchanged
  crypto: user - lock crypto_alg_list on alg dump

7cf91adc

scripts: add "prune-kernel" script to clean up old kernel images · b64e86cd

由 J. Bruce Fields 提交于 7月 10, 2013

Long ago, Dave Jones complained about CONFIG_LOCALVERSION_AUTO:
 "I don't use the auto config, because I end up filling up /boot unless
  I go through and clean them out by hand every time I install a new one
  (which I do probably a dozen or so times a day).  Is there some easy
  way to prune old builds I'm missing?"

To which Bruce replied:
 "I run this by hand every now and then.  I'm probably doing it all wrong"

And if he is running it wrong, then so am I - because I've been using
this script ever since.  It is true that CONFIG_LOCALVERSION_AUTO easily
ends up filling your /boot partition if you don't clean up old versions
regularly, and this script helps make that easier.

Checked with Bruce to see that it's fine to add this to the kernel
scripts.  Maybe people will come up with enhancements, but more
importantly, this way I won't misplace this script whenever I install a
new machine and start doing custom kernels for it.
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b64e86cd

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功