- 08 11月, 2017 7 次提交
-
-
由 Jay Vosburgh 提交于
The bonding miimon logic has a flaw, in that a failure of the rtnl_trylock can cause a slave to become permanently stuck in BOND_LINK_FAIL state. The sequence of events to cause this is as follows: 1) bond_miimon_inspect finds that a slave's link is down, and so calls bond_propose_link_state, setting slave->new_link_state to BOND_LINK_FAIL, then sets slave->new_link to BOND_LINK_DOWN and returns non-zero. 2) In bond_mii_monitor, the rtnl_trylock fails, and the timer is rescheduled. No change is committed. 3) bond_miimon_inspect is called again, but this time the slave from step 1 has recovered. slave->new_link is reset to NOCHANGE, and, as slave->link was never changed, the switch enters the BOND_LINK_UP case, and does nothing. The pending BOND_LINK_FAIL state from step 1 remains pending, as new_link_state is not reset. 4) The state from step 3 persists until another slave changes link state and causes bond_miimon_inspect to return non-zero. At this point, the BOND_LINK_FAIL state change on the slave from steps 1-3 is committed, and the slave will remain stuck in BOND_LINK_FAIL state even though it is actually link up. The remedy for this is to initialize new_link_state on each entry to bond_miimon_inspect, as is already done with new_link. Fixes: fb9eb899 ("bonding: handle link transition from FAIL to UP correctly") Reported-by: NAlex Sidorenko <alexandre.sidorenko@hpe.com> Reviewed-by: NJarod Wilson <jarod@redhat.com> Signed-off-by: NJay Vosburgh <jay.vosburgh@canonical.com> Acked-by: NMahesh Bandewar <maheshb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjorn Andersson 提交于
Registering qrtr with module_init makes the ability of typical platform code to create AF_QIPCRTR socket during probe a matter of link order luck. Moving qrtr to postcore_initcall() avoids this. Signed-off-by: NBjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
A CDC Ethernet functional descriptor with wMaxSegmentSize = 0 will cause a divide error in usbnet_probe: divide error: 0000 [#1] PREEMPT SMP KASAN Modules linked in: CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc8-44453-g1fdc1a82c34f #56 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: usb_hub_wq hub_event task: ffff88006bef5c00 task.stack: ffff88006bf60000 RIP: 0010:usbnet_update_max_qlen+0x24d/0x390 drivers/net/usb/usbnet.c:355 RSP: 0018:ffff88006bf67508 EFLAGS: 00010246 RAX: 00000000000163c8 RBX: ffff8800621fce40 RCX: ffff8800621fcf34 RDX: 0000000000000000 RSI: ffffffff837ecb7a RDI: ffff8800621fcf34 RBP: ffff88006bf67520 R08: ffff88006bef5c00 R09: ffffed000c43f881 R10: ffffed000c43f880 R11: ffff8800621fc406 R12: 0000000000000003 R13: ffffffff85c71de0 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88006ca00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffe9c0d6dac CR3: 00000000614f4000 CR4: 00000000000006f0 Call Trace: usbnet_probe+0x18b5/0x2790 drivers/net/usb/usbnet.c:1783 qmi_wwan_probe+0x133/0x220 drivers/net/usb/qmi_wwan.c:1338 usb_probe_interface+0x324/0x940 drivers/usb/core/driver.c:361 really_probe drivers/base/dd.c:413 driver_probe_device+0x522/0x740 drivers/base/dd.c:557 Fix by simply ignoring the bogus descriptor, as it is optional for QMI devices anyway. Fixes: 423ce8ca ("net: usb: qmi_wwan: New driver for Huawei QMI based WWAN devices") Reported-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NBjørn Mork <bjorn@mork.no> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bjørn Mork 提交于
Setting dev->hard_mtu to 0 will cause a divide error in usbnet_probe. Protect against devices with bogus CDC Ethernet functional descriptors by ignoring a zero wMaxSegmentSize. Signed-off-by: NBjørn Mork <bjorn@mork.no> Acked-by: NOliver Neukum <oneukum@suse.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hangbin Liu 提交于
After commit 07f4c900 ("tcp/dccp: try to not exhaust ip_local_port_range in connect()"), we will try to use even ports for connect(). Then if an application (seen clearly with iperf) opens multiple streams to the same destination IP and port, each stream will be given an even source port. So the bonding driver's simple xmit_hash_policy based on layer3+4 addressing will always hash all these streams to the same interface. And the total throughput will limited to a single slave. Change the tcp code will impact the whole tcp behavior, only for bonding usage. Paolo Abeni suggested fix this by changing the bonding code only, which should be more reasonable, and less impact. Fix this by discarding the lowest hash bit because it contains little entropy. After the fix we can re-balance between slaves. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NHangbin Liu <liuhangbin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
hn is being kfree'd in mlx5e_del_l2_from_hash and then dereferenced by accessing hn->ai.addr Fix this by copying the MAC address into a local variable for its safe use in all possible execution paths within function mlx5e_execute_l2_action. Addresses-Coverity-ID: 1417789 Fixes: eeb66cdb ("net/mlx5: Separate between E-Switch and MPFS") Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: NSaeed Mahameed <saeedm@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marc Zyngier 提交于
The mvpp2 driver can't cope at all with the TX affinities being changed from userspace, and spit an endless stream of [ 91.779920] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.779930] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780402] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780406] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780415] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing [ 91.780418] mvpp2 f4000000.ethernet eth2: wrong cpu on the end of Tx processing rendering the box completely useless (I've measured around 600k interrupts/s on a 8040 box) once irqbalance kicks in and start doing its job. Obviously, the driver was never designed with this in mind. So let's work around the problem by preventing userspace from interacting with these interrupts altogether. Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 11月, 2017 2 次提交
-
-
由 Priyaranjan Jha 提交于
Fixes DSACK-based undo when sender is in Open State and an ACK advances snd_una. Example scenario: - Sender goes into recovery and makes some spurious rtx. - It comes out of recovery and enters into open state. - It sends some more packets, let's say 4. - The receiver sends an ACK for the first two, but this ACK is lost. - The sender receives ack for first two, and DSACK for previous spurious rtx. Signed-off-by: NPriyaranjan Jha <priyarjha@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NYousuk Seung <ysseung@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guillaume Nault 提交于
Using l2tp_tunnel_find() in l2tp_ip_recv() is wrong for two reasons: * It doesn't take a reference on the returned tunnel, which makes the call racy wrt. concurrent tunnel deletion. * The lookup is only based on the tunnel identifier, so it can return a tunnel that doesn't match the packet's addresses or protocol. For example, a packet sent to an L2TPv3 over IPv6 tunnel can be delivered to an L2TPv2 over UDPv4 tunnel. This is worse than a simple cross-talk: when delivering the packet to an L2TP over UDP tunnel, the corresponding socket is UDP, where ->sk_backlog_rcv() is NULL. Calling sk_receive_skb() will then crash the kernel by trying to execute this callback. And l2tp_tunnel_find() isn't even needed here. __l2tp_ip_bind_lookup() properly checks the socket binding and connection settings. It was used as a fallback mechanism for finding tunnels that didn't have their data path registered yet. But it's not limited to this case and can be used to replace l2tp_tunnel_find() in the general case. Fix l2tp_ip6 in the same way. Fixes: 0d76751f ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Fixes: a32e0eec ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 11月, 2017 13 次提交
-
-
由 Andrey Konovalov 提交于
When asix_suspend() is called dev->driver_priv might not have been assigned a value, so we need to check that it's not NULL. Found by syzkaller. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN Modules linked in: CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Workqueue: usb_hub_wq hub_event task: ffff88006bb36300 task.stack: ffff88006bba8000 RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629 RSP: 0018:ffff88006bbae718 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644 RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008 RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40 R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80 FS: 0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0 Call Trace: usb_suspend_interface drivers/usb/core/driver.c:1209 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487 hub_port_connect drivers/usb/core/hub.c:4903 hub_port_connect_change drivers/usb/core/hub.c:5009 port_event drivers/usb/core/hub.c:5115 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119 worker_thread+0x221/0x1850 kernel/workqueue.c:2253 kthread+0x3a1/0x470 kernel/kthread.c:231 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431 Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00 00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718 ---[ end trace dfc4f5649284342c ]--- Signed-off-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ye Yin 提交于
When run ipvs in two different network namespace at the same host, and one ipvs transport network traffic to the other network namespace ipvs. 'ipvs_property' flag will make the second ipvs take no effect. So we should clear 'ipvs_property' when SKB network namespace changed. Fixes: 621e84d6 ("dev: introduce skb_scrub_packet()") Signed-off-by: NYe Yin <hustcat@gmail.com> Signed-off-by: NWei Zhou <chouryzhou@gmail.com> Signed-off-by: NJulian Anastasov <ja@ssi.bg> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ganesh Goudar 提交于
Change t4fw_version.h to update latest firmware version number to 1.16.63.0. Signed-off-by: NGanesh Goudar <ganeshgr@chelsio.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux由 Linus Torvalds 提交于
Pull clk fix from Stephen Boyd: "One fix for USB clks on Uniphier PXs3 SoCs" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: uniphier: fix clock data for PXs3
-
git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile由 Linus Torvalds 提交于
Pull arch/tile fixes from Chris Metcalf: "Two one-line bug fixes" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: Implement ->set_state_oneshot_stopped() tile: pass machine size to sparse
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi由 Linus Torvalds 提交于
Pull SCSI fix from James Bottomley: "One minor fix in the error leg of the qla2xxx driver (it oopses the system if we get an error trying to start the internal kernel thread). The fix is minor because the problem isn't often encountered in the field (although it can be induced by inserting the module in a low memory environment)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qla2xxx: Fix oops in qla2x00_probe_one error path
-
由 Chris Metcalf 提交于
set_state_oneshot_stopped() is called by the clkevt core, when the next event is required at an expiry time of 'KTIME_MAX'. This normally happens with NO_HZ_{IDLE|FULL} in both LOWRES/HIGHRES modes. This patch makes the clockevent device to stop on such an event, to avoid spurious interrupts, as explained by: commit 8fff52fd ("clockevents: Introduce CLOCK_EVT_STATE_ONESHOT_STOPPED state"). Signed-off-by: NChris Metcalf <cmetcalf@mellanox.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux由 Linus Torvalds 提交于
Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.14. This is bigger than I like to send at rc7, but that's at least partly because I didn't send any fixes last week. If it wasn't for the IMC driver, which is new and getting heavy testing, the diffstat would look a bit better. I've also added ftrace on big endian to my test suite, so we shouldn't break that again in future. - A fix to the handling of misaligned paste instructions (P9 only), where a change to a #define has caused the check for the instruction to always fail. - The preempt handling was unbalanced in the radix THP flush (P9 only). Though we don't generally use preempt we want to keep it working as much as possible. - Two fixes for IMC (P9 only), one when booting with restricted number of CPUs and one in the error handling when initialisation fails due to firmware etc. - A revert to fix function_graph on big endian machines, and then a rework of the reverted patch to fix kprobes blacklist handling on big endian machines. Thanks to: Anju T Sudhakar, Guilherme G. Piccoli, Madhavan Srinivasan, Naveen N. Rao, Nicholas Piggin, Paul Mackerras" * tag 'powerpc-4.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/perf: Fix core-imc hotplug callback failure during imc initialization powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols" powerpc/64s/radix: Fix preempt imbalance in TLB flush powerpc: Fix check for copy/paste instructions in alignment handler powerpc/perf: Fix IMC allocation routine
-
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc由 Linus Torvalds 提交于
Pull MMC fixes from Ulf Hansson: "Fix dw_mmc request timeout issues" * tag 'mmc-v4.14-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: dw_mmc: Fix the DTO timeout calculation mmc: dw_mmc: Add locking to the CTO timer mmc: dw_mmc: Fix the CTO timeout calculation mmc: dw_mmc: cancel the CTO timer after a voltage switch
-
git://people.freedesktop.org/~airlied/linux由 Linus Torvalds 提交于
Pull drm fixes from Dave Airlie: - one nouveau regression fix - some amdgpu fixes for stable to fix hangs on some harvested Polaris GPUs - a set of KASAN and regression fixes for i915, their CI system seems to be working pretty well now. * tag 'drm-fixes-for-v4.14-rc8' of git://people.freedesktop.org/~airlied/linux: drm/amdgpu: allow harvesting check for Polaris VCE drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvesting drm/i915: Check incoming alignment for unfenced buffers (on i915gm) drm/nouveau/kms/nv50: use the correct state for base channel notifier setup drm/i915: Hold rcu_read_lock when iterating over the radixtree (vma idr) drm/i915: Hold rcu_read_lock when iterating over the radixtree (objects) drm/i915/edp: read edp display control registers unconditionally drm/i915: Do not rely on wm preservation for ILK watermarks drm/i915: Cancel the modeset retry work during modeset cleanup
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net由 Linus Torvalds 提交于
Pull networking fixes from David Miller: "Hopefully this is the last batch of networking fixes for 4.14 Fingers crossed... 1) Fix stmmac to use the proper sized OF property read, from Bhadram Varka. 2) Fix use after free in net scheduler tc action code, from Cong Wang. 3) Fix SKB control block mangling in tcp_make_synack(). 4) Use proper locking in fib_dump_info(), from Florian Westphal. 5) Fix IPG encodings in systemport driver, from Florian Fainelli. 6) Fix division by zero in NV TCP congestion control module, from Konstantin Khlebnikov. 7) Fix use after free in nf_reject_ipv4, from Tejaswi Tanikella" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: systemport: Correct IPG length settings tcp: do not mangle skb->cb[] in tcp_make_synack() fib: fib_dump_info can no longer use __in_dev_get_rtnl stmmac: use of_property_read_u32 instead of read_u8 net_sched: hold netns refcnt for each action net_sched: acquire RTNL in tc_action_net_exit() net: vrf: correct FRA_L3MDEV encode type tcp_nv: fix division by zero in tcpnv_acked() netfilter: nf_reject_ipv4: Fix use-after-free in send_reset netfilter: nft_set_hash: disable fast_ops for 2-len keys
-
由 Linus Torvalds 提交于
Merge misc fixes from Andrew Morton: "7 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, swap: fix race between swap count continuation operations mm/huge_memory.c: deposit page table when copying a PMD migration entry initramfs: fix initramfs rebuilds w/ compression after disabling fs/hugetlbfs/inode.c: fix hwpoison reserve accounting ocfs2: fstrim: Fix start offset of first cluster group during fstrim mm, /proc/pid/pagemap: fix soft dirty marking for PMD migration entry userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size
-
由 Paul Burton 提交于
MIPS will soon not be a part of Imagination Technologies, and as such many @imgtec.com email addresses will no longer be valid. This patch updates the addresses for those who: - Have 10 or more patches in mainline authored using an @imgtec.com email address, or any patches dated within the past year. - Are still with Imagination but leaving as part of the MIPS business unit, as determined from an internal email address list. - Haven't already updated their email address (ie. JamesH) or expressed a desire to be excluded (ie. Maciej). - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt & myself. New addresses are of the form firstname.lastname@mips.com, and all verified against an internal email address list. An entry is added to .mailmap for each person such that get_maintainer.pl will report the new addresses rather than @imgtec.com addresses which will soon be dead. Instances of the affected addresses throughout the tree are then mechanically replaced with the new @mips.com address. Signed-off-by: NPaul Burton <paul.burton@mips.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@mips.com> Acked-by: NDengcheng Zhu <dengcheng.zhu@mips.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Matt Redfearn <matt.redfearn@mips.com> Acked-by: NMatt Redfearn <matt.redfearn@mips.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: trivial@kernel.org Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 11月, 2017 18 次提交
-
-
由 Rafael J. Wysocki 提交于
Commit 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") is not sufficient to restore the previous behavior of "cpu MHz" in /proc/cpuinfo on x86 due to some changes made after the commit it has reverted. To address this, make the code in question use arch_freq_get_on_cpu() which also is used by cpufreq for reporting the current frequency of CPUs and since that function doesn't really depend on cpufreq in any way, drop the CONFIG_CPU_FREQ dependency for the object file containing it. Also refactor arch_freq_get_on_cpu() somewhat to avoid IPIs and return cached values right away if it is called very often over a short time (to prevent user space from triggering IPI storms through it). Fixes: 890da9cf (Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"") Cc: stable@kernel.org # 4.13 - together with 890da9cfSigned-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Huang Ying 提交于
One page may store a set of entries of the sis->swap_map (swap_info_struct->swap_map) in multiple swap clusters. If some of the entries has sis->swap_map[offset] > SWAP_MAP_MAX, multiple pages will be used to store the set of entries of the sis->swap_map. And the pages are linked with page->lru. This is called swap count continuation. To access the pages which store the set of entries of the sis->swap_map simultaneously, previously, sis->lock is used. But to improve the scalability of __swap_duplicate(), swap cluster lock may be used in swap_count_continued() now. This may race with add_swap_count_continuation() which operates on a nearby swap cluster, in which the sis->swap_map entries are stored in the same page. The race can cause wrong swap count in practice, thus cause unfreeable swap entries or software lockup, etc. To fix the race, a new spin lock called cont_lock is added to struct swap_info_struct to protect the swap count continuation page list. This is a lock at the swap device level, so the scalability isn't very well. But it is still much better than the original sis->lock, because it is only acquired/released when swap count continuation is used. Which is considered rare in practice. If it turns out that the scalability becomes an issue for some workloads, we can split the lock into some more fine grained locks. Link: http://lkml.kernel.org/r/20171017081320.28133-1-ying.huang@intel.com Fixes: 235b6217 ("mm/swap: add cluster lock") Signed-off-by: N"Huang, Ying" <ying.huang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Shaohua Li <shli@kernel.org> Cc: Tim Chen <tim.c.chen@intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Aaron Lu <aaron.lu@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> [4.11+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Zi Yan 提交于
We need to deposit pre-allocated PTE page table when a PMD migration entry is copied in copy_huge_pmd(). Otherwise, we will leak the pre-allocated page and cause a NULL pointer dereference later in zap_huge_pmd(). The missing counters during PMD migration entry copy process are added as well. The bug report is here: https://lkml.org/lkml/2017/10/29/214 Link: http://lkml.kernel.org/r/20171030144636.4836-1-zi.yan@sent.com Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path") Signed-off-by: NZi Yan <zi.yan@cs.rutgers.edu> Reported-by: NFengguang Wu <fengguang.wu@intel.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Florian Fainelli 提交于
This is a follow-up to commit 57ddfdaa ("initramfs: fix disabling of initramfs (and its compression)"). This particular commit fixed the use case where we build the kernel with an initramfs with no compression, and then we build the kernel with no initramfs. Now this still left us with the same case as described here: http://lkml.kernel.org/r/20170521033337.6197-1-f.fainelli@gmail.com not working with initramfs compression. This can be seen by the following steps/timestamps: https://www.spinics.net/lists/kernel/msg2598153.html .initramfs_data.cpio.gz.cmd is correct: cmd_usr/initramfs_data.cpio.gz := /bin/bash ./scripts/gen_initramfs_list.sh -o usr/initramfs_data.cpio.gz -u 1000 -g 1000 /home/fainelli/work/uclinux-rootfs/romfs /home/fainelli/work/uclinux-rootfs/misc/initramfs.dev and was generated the first time we did generate the gzip initramfs, so the command has not changed, nor its arguments, so we just don't call it, no initramfs cpio is re-generated as a consequence. The fix for this problem is just to properly keep track of the .initramfs_cpio_data.d file by suffixing it with the compression extension. This takes care of properly tracking dependencies such that the initramfs get (re)generated any time files are added/deleted etc. Link: http://lkml.kernel.org/r/20170930033936.6722-1-f.fainelli@gmail.com Fixes: db2aa7fd ("initramfs: allow again choice of the embedded initramfs compression algorithm") Fixes: 9e3596b0 ("kbuild: initramfs cleanup, set target from Kconfig") Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Cc: "Francisco Blas Izquierdo Riera (klondike)" <klondike@xiscosoft.net> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Mike Kravetz 提交于
Calling madvise(MADV_HWPOISON) on a hugetlbfs page will result in bad (negative) reserved huge page counts. This may not happen immediately, but may happen later when the underlying file is removed or filesystem unmounted. For example: AnonHugePages: 0 kB ShmemHugePages: 0 kB HugePages_Total: 1 HugePages_Free: 0 HugePages_Rsvd: 18446744073709551615 HugePages_Surp: 0 Hugepagesize: 2048 kB In routine hugetlbfs_error_remove_page(), hugetlb_fix_reserve_counts is called after remove_huge_page. hugetlb_fix_reserve_counts is designed to only be called/used only if a failure is returned from hugetlb_unreserve_pages. Therefore, call hugetlb_unreserve_pages as required and only call hugetlb_fix_reserve_counts in the unlikely event that hugetlb_unreserve_pages returns an error. Link: http://lkml.kernel.org/r/20171019230007.17043-2-mike.kravetz@oracle.com Fixes: 78bb9203 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error") Signed-off-by: NMike Kravetz <mike.kravetz@oracle.com> Acked-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Ashish Samant 提交于
The first cluster group descriptor is not stored at the start of the group but at an offset from the start. We need to take this into account while doing fstrim on the first cluster group. Otherwise we will wrongly start fstrim a few blocks after the desired start block and the range can cross over into the next cluster group and zero out the group descriptor there. This can cause filesytem corruption that cannot be fixed by fsck. Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.comSigned-off-by: NAshish Samant <ashish.samant@oracle.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Reviewed-by: NJoseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Huang Ying 提交于
When the pagetable is walked in the implementation of /proc/<pid>/pagemap, pmd_soft_dirty() is used for both the PMD huge page map and the PMD migration entries. That is wrong, pmd_swp_soft_dirty() should be used for the PMD migration entries instead because the different page table entry flag is used. As a result, /proc/pid/pagemap may report incorrect soft dirty information for PMD migration entries. Link: http://lkml.kernel.org/r/20171017081818.31795-1-ying.huang@intel.com Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path") Signed-off-by: N"Huang, Ying" <ying.huang@intel.com> Acked-by: NKirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Hugh Dickins <hughd@google.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Daniel Colascione <dancol@google.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Andrea Arcangeli 提交于
This oops: kernel BUG at fs/hugetlbfs/inode.c:484! RIP: remove_inode_hugepages+0x3d0/0x410 Call Trace: hugetlbfs_setattr+0xd9/0x130 notify_change+0x292/0x410 do_truncate+0x65/0xa0 do_sys_ftruncate.constprop.3+0x11a/0x180 SyS_ftruncate+0xe/0x10 tracesys+0xd9/0xde was caused by the lack of i_size check in hugetlb_mcopy_atomic_pte. mmap() can still succeed beyond the end of the i_size after vmtruncate zapped vmas in those ranges, but the faults must not succeed, and that includes UFFDIO_COPY. We could differentiate the retval to userland to represent a SIGBUS like a page fault would do (vs SIGSEGV), but it doesn't seem very useful and we'd need to pick a random retval as there's no meaningful syscall retval that would differentiate from SIGSEGV and SIGBUS, there's just -EFAULT. Link: http://lkml.kernel.org/r/20171016223914.2421-2-aarcange@redhat.comSigned-off-by: NAndrea Arcangeli <aarcange@redhat.com> Reviewed-by: NMike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Florian Fainelli 提交于
Due to a documentation mistake, the IPG length was set to 0x12 while it should have been 12 (decimal). This would affect short packet (64B typically) performance since the IPG was bigger than necessary. Fixes: 44a4524c ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Christoph Paasch sent a patch to address the following issue : tcp_make_synack() is leaving some TCP private info in skb->cb[], then send the packet by other means than tcp_transmit_skb() tcp_transmit_skb() makes sure to clear skb->cb[] to not confuse IPv4/IPV6 stacks, but we have no such cleanup for SYNACK. tcp_make_synack() should not use tcp_init_nondata_skb() : tcp_init_nondata_skb() really should be limited to skbs put in write/rtx queues (the ones that are only sent via tcp_transmit_skb()) This patch fixes the issue and should even save few cpu cycles ;) Fixes: 971f10ec ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NChristoph Paasch <cpaasch@apple.com> Reviewed-by: NChristoph Paasch <cpaasch@apple.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Westphal 提交于
syzbot reported yet another regression added with DOIT_UNLOCKED. When nexthop is marked as dead, fib_dump_info uses __in_dev_get_rtnl(): ./include/linux/inetdevice.h:230 suspicious rcu_dereference_protected() usage! rcu_scheduler_active = 2, debug_locks = 1 1 lock held by syz-executor2/23859: #0: (rcu_read_lock){....}, at: [<ffffffff840283f0>] inet_rtm_getroute+0xaa0/0x2d70 net/ipv4/route.c:2738 [..] lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4665 __in_dev_get_rtnl include/linux/inetdevice.h:230 [inline] fib_dump_info+0x1136/0x13d0 net/ipv4/fib_semantics.c:1377 inet_rtm_getroute+0xf97/0x2d70 net/ipv4/route.c:2785 .. This isn't safe anymore, callers either hold RTNL mutex or rcu read lock, so these spots must use rcu_dereference_rtnl() or plain rcu_derefence() (plus unconditional rcu read lock). This does the latter. Fixes: 394f51ab ("ipv4: route: set ipv4 RTM_GETROUTE to not use rtnl") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Bhadram Varka 提交于
Numbers in DT are stored in “cells” which are 32-bits in size. of_property_read_u8 does not work properly because of endianness problem. This causes it to always return 0 with little-endian architectures. Fix it by using of_property_read_u32() OF API. Signed-off-by: NBhadram Varka <vbhadram@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
Cong Wang says: ==================== net_sched: fix a use-after-free for tc actions This patchset fixes a use-after-free reported by Lucas and closes potential races too. Please see each patch for details. ==================== Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
TC actions have been destroyed asynchronously for a long time, previously in a RCU callback and now in a workqueue. If we don't hold a refcnt for its netns, we could use the per netns data structure, struct tcf_idrinfo, after it has been freed by netns workqueue. Hold refcnt to ensure netns destroy happens after all actions are gone. Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions") Reported-by: NLucas Bates <lucasb@mojatatu.com> Tested-by: NLucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
I forgot to acquire RTNL in tc_action_net_exit() which leads that action ops->cleanup() is not always called with RTNL. This usually is not a big deal because this function is called after all netns refcnt are gone, but given RTNL protects more than just actions, add it for safety and consistency. Also add an assertion to catch other potential bugs. Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions") Reported-by: NLucas Bates <lucasb@mojatatu.com> Tested-by: NLucas Bates <lucasb@mojatatu.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Madhavan Srinivasan 提交于
Call trace observed during boot: nest_capp0_imc performance monitor hardware support registered nest_capp1_imc performance monitor hardware support registered core_imc memory allocation for cpu 56 failed Unable to handle kernel paging request for data at address 0xffa400010 Faulting instruction address: 0xc000000000bf3294 0:mon> e cpu 0x0: Vector: 300 (Data Access) at [c000000ff38ff8d0] pc: c000000000bf3294: mutex_lock+0x34/0x90 lr: c000000000bf3288: mutex_lock+0x28/0x90 sp: c000000ff38ffb50 msr: 9000000002009033 dar: ffa400010 dsisr: 80000 current = 0xc000000ff383de00 paca = 0xc000000007ae0000 softe: 0 irq_happened: 0x01 pid = 13, comm = cpuhp/0 Linux version 4.11.0-39.el7a.ppc64le (mockbuild@ppc-058.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC) ) #1 SMP Tue Oct 3 07:42:44 EDT 2017 0:mon> t [c000000ff38ffb80] c0000000002ddfac perf_pmu_migrate_context+0xac/0x470 [c000000ff38ffc40] c00000000011385c ppc_core_imc_cpu_offline+0x1ac/0x1e0 [c000000ff38ffc90] c000000000125758 cpuhp_invoke_callback+0x198/0x5d0 [c000000ff38ffd00] c00000000012782c cpuhp_thread_fun+0x8c/0x3d0 [c000000ff38ffd60] c0000000001678d0 smpboot_thread_fn+0x290/0x2a0 [c000000ff38ffdc0] c00000000015ee78 kthread+0x168/0x1b0 [c000000ff38ffe30] c00000000000b368 ret_from_kernel_thread+0x5c/0x74 While registering the cpuhoplug callbacks for core-imc, if we fails in the cpuhotplug online path for any random core (either because opal call to initialize the core-imc counters fails or because memory allocation fails for that core), ppc_core_imc_cpu_offline() will get invoked for other cpus who successfully returned from cpuhotplug online path. But in the ppc_core_imc_cpu_offline() path we are trying to migrate the event context, when core-imc counters are not even initialized. Thus creating the above stack dump. Add a check to see if core-imc counters are enabled or not in the cpuhotplug offline path before migrating the context to handle this failing scenario. Fixes: 885dcd70 ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
-
由 Linus Torvalds 提交于
For some odd historical reason, we preprocessed the linker scripts with "-C", which keeps comments around. That makes no sense, since the comments are not meaningful for the build anyway. And it actually breaks things, since linker scripts can't have C++ style "//" comments in them, so keeping comments after preprocessing now limits us in odd and surprising ways in our header files for no good reason. The -C option goes back to pre-git and pre-bitkeeper times, but seems to have been historically used (along with "-traditional") for some odd-ball architectures (ia64, MIPS and SH). It probably didn't matter back then either, but might possibly have been used to minimize the difference between the original file and the pre-processed result. The reason for this may be lost in time, but let's not perpetuate it only because we can't remember why we did this crazy thing. This was triggered by the recent addition of SPDX lines to the source tree, where people apparently were confused about why header files couldn't use the C++ comment format. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Linus Torvalds 提交于
This reverts commit 51204e06. There wasn't really any good reason for it, and people are complaining (rightly) that it broke existing practice. Cc: Len Brown <len.brown@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-