- 11 5月, 2018 1 次提交
-
-
由 David Howells 提交于
Add a tracepoint to log received ICMP/ICMP6 events and other error messages. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 08 5月, 2018 1 次提交
-
-
由 Wolfram Sang 提交于
Signed-off-by: NWolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 5月, 2018 1 次提交
-
-
由 Bhadram Varka 提交于
It adds support for BCM89610 (Single-Port 10/100/1000BASE-T) transceiver which is used in P3310 Tegra186 platform. Signed-off-by: NBhadram Varka <vbhadram@nvidia.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 5月, 2018 1 次提交
-
-
由 Ido Schimmel 提交于
This reverts commit edd7ceb7 ("ipv6: Allow non-gateway ECMP for IPv6"). Eric reported a division by zero in rt6_multipath_rebalance() which is caused by above commit that considers identical local routes to be siblings. The division by zero happens because a nexthop weight is not set for local routes. Revert the commit as it does not fix a bug and has side effects. To reproduce: # ip -6 address add 2001:db8::1/64 dev dummy0 # ip -6 address add 2001:db8::1/64 dev dummy1 Fixes: edd7ceb7 ("ipv6: Allow non-gateway ECMP for IPv6") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Tested-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 5月, 2018 4 次提交
-
-
由 Michael S. Tsirkin 提交于
This reverts commit 93c0d549c4c5a7382ad70de6b86610b7aae57406. Unfortunately the padding will break 32 bit userspace. Ouch. Need to add some compat code, revert for now. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Dave Watson 提交于
It is reported that in some cases, write_space may be called in do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again: [ 660.468802] ? do_tcp_sendpages+0x8d/0x580 [ 660.468826] ? tls_push_sg+0x74/0x130 [tls] [ 660.468852] ? tls_push_record+0x24a/0x390 [tls] [ 660.468880] ? tls_write_space+0x6a/0x80 [tls] ... tls_push_sg already does a loop over all sending sg's, so ignore any tls_write_space notifications until we are done sending. We then have to call the previous write_space to wake up poll() waiters after we are done with the send loop. Reported-by: NAndre Tomt <andre@tomt.net> Signed-off-by: NDave Watson <davejwatson@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Winter 提交于
It is valid to have static routes where the nexthop is an interface not an address such as tunnels. For IPv4 it was possible to use ECMP on these routes but not for IPv6. Signed-off-by: NThomas Winter <Thomas.Winter@alliedtelesis.co.nz> Cc: David Ahern <dsahern@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Michael S. Tsirkin 提交于
There's a 32 bit hole just after type. It's best to give it a name, this way compiler is forced to initialize it with rest of the structure. Reported-by: NKevin Easton <kevin@guarana.org> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2018 1 次提交
-
-
由 Amir Goldstein 提交于
The comment claims that this helper will try not to loose bits, but for 64bit long it looses the high bits before hashing 64bit long into 32bit int. Use the helper hash_long() to do the right thing for 64bit long. For 32bit long, there is no change. All the callers of end_name_hash() either assign the result to qstr->hash, which is u32 or return the result as an int value (e.g. full_name_hash()). Change the helper return type to int to conform to its users. [ It took me a while to apply this, because my initial reaction to it was - incorrectly - that it could make for slower code. After having looked more at it, I take back all my complaints about the patch, Amir was right and I was mis-reading things or just being stupid. I also don't worry too much about the possible performance impact of this on 64-bit, since most architectures that actually care about performance end up not using this very much (the dcache code is the most performance-critical, but the word-at-a-time case uses its own hashing anyway). So this ends up being mostly used for filesystems that do their own degraded hashing (usually because they want a case-insensitive comparison function). A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS, and then this potentially makes things more expensive on 64-bit architectures with slow or lacking multipliers even for the normal case. That said, realistically the only such architecture I can think of is PA-RISC. Nobody really cares about performance on that, it's more of a "look ma, I've got warts^W an odd machine" platform. So the patch is fine, and all my initial worries were just misplaced from not looking at this properly. - Linus ] Signed-off-by: NAmir Goldstein <amir73il@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 28 4月, 2018 1 次提交
-
-
由 KarimAllah Ahmed 提交于
Move DISABLE_EXITS KVM capability bits to the UAPI just like the rest of capabilities. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: NKarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
- 27 4月, 2018 2 次提交
-
-
由 Israel Rukshin 提交于
Adding the vector offset when calling to mlx5_vector2eqn() is wrong. This is because mlx5_vector2eqn() checks if EQ index is equal to vector number and the fact that the internal completion vectors that mlx5 allocates don't get an EQ index. The second problem here is that using effective_affinity_mask gives the same CPU for different vectors. This leads to unmapped queues when calling it from blk_mq_rdma_map_queues(). This doesn't happen when using affinity_hint mask. Fixes: 2572cf57 ("mlx5: fix mlx5_get_vector_affinity to start from completion vector 0") Fixes: 05e0cc84 ("net/mlx5: Fix get vector affinity helper function") Signed-off-by: NIsrael Rukshin <israelr@mellanox.com> Reviewed-by: NMax Gurtovoy <maxg@mellanox.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me>
-
由 Rishabh Bhatnagar 提交于
Using initcall_t in the __field macro generates the following warning with clang version 6.0: include/trace/events/initcall.h:34:3: warning: ordered comparison of function pointers ('initcall_t' (aka 'int (*)(void)') and 'initcall_t') __field macro expands to __field_ext macro which does is_signed_type check on the type argument. Since initcall_t is defined as a function pointer, using it as the type in the __field macro, leads to an ordered comparison of function pointer warning, inside the check. Using __field_struct macro avoids the issue. Link: http://lkml.kernel.org/r/1524699755-29388-1-git-send-email-rishabhb@codeaurora.orgSigned-off-by: NRishabh Bhatnagar <rishabhb@codeaurora.org> [ Added comment to why we are using field_struct() ] Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
- 26 4月, 2018 2 次提交
-
-
由 Thomas Gleixner 提交于
Revert commits 92af4dcb ("tracing: Unify the "boot" and "mono" tracing clocks") 127bfa5f ("hrtimer: Unify MONOTONIC and BOOTTIME clock behavior") 7250a404 ("posix-timers: Unify MONOTONIC and BOOTTIME clock behavior") d6c7270e ("timekeeping: Remove boot time specific code") f2d6fdbf ("Input: Evdev - unify MONOTONIC and BOOTTIME clock behavior") d6ed449a ("timekeeping: Make the MONOTONIC clock behave like the BOOTTIME clock") 72199320 ("timekeeping: Add the new CLOCK_MONOTONIC_ACTIVE clock") As stated in the pull request for the unification of CLOCK_MONOTONIC and CLOCK_BOOTTIME, it was clear that we might have to revert the change. As reported by several folks systemd and other applications rely on the documented behaviour of CLOCK_MONOTONIC on Linux and break with the above changes. After resume daemons time out and other timeout related issues are observed. Rafael compiled this list: * systemd kills daemons on resume, after >WatchdogSec seconds of suspending (Genki Sky). [Verified that that's because systemd uses CLOCK_MONOTONIC and expects it to not include the suspend time.] * systemd-journald misbehaves after resume: systemd-journald[7266]: File /var/log/journal/016627c3c4784cd4812d4b7e96a34226/system.journal corrupted or uncleanly shut down, renaming and replacing. (Mike Galbraith). * NetworkManager reports "networking disabled" and networking is broken after resume 50% of the time (Pavel). [May be because of systemd.] * MATE desktop dims the display and starts the screensaver right after system resume (Pavel). * Full system hang during resume (me). [May be due to systemd or NM or both.] That happens on debian and open suse systems. It's sad, that these problems were neither catched in -next nor by those folks who expressed interest in this change. Reported-by: NRafael J. Wysocki <rjw@rjwysocki.net> Reported-by: Genki Sky <sky@genki.is>, Reported-by: NPavel Machek <pavel@ucw.cz> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kevin Easton <kevin@guarana.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salyzyn <salyzyn@android.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org>
-
由 Michael S. Tsirkin 提交于
For cleanup it's helpful to be able to simply scan all vqs and discard all data. Add an iterator to do that. Cc: stable@vger.kernel.org Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-
- 25 4月, 2018 4 次提交
-
-
由 Linus Walleij 提交于
As it came up in discussion on the mailing list that the semantic meaning of 'blk_mq_ctx' and 'blk_mq_hw_ctx' isn't completely obvious to everyone, let's add some minimal kerneldoc for a starter. Signed-off-by: NLinus Walleij <linus.walleij@linaro.org> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Takashi Iwai 提交于
As recently Smatch suggested, a few places in ALSA control core codes may expand the array directly from the user-space value with speculation: sound/core/control.c:1003 snd_ctl_elem_lock() warn: potential spectre issue 'kctl->vd' sound/core/control.c:1031 snd_ctl_elem_unlock() warn: potential spectre issue 'kctl->vd' sound/core/control.c:844 snd_ctl_elem_info() warn: potential spectre issue 'kctl->vd' sound/core/control.c:891 snd_ctl_elem_read() warn: potential spectre issue 'kctl->vd' sound/core/control.c:939 snd_ctl_elem_write() warn: potential spectre issue 'kctl->vd' Although all these seem doing only the first load without further reference, we may want to stay in a safer side, so hardening with array_index_nospec() would still make sense. In this patch, we put array_index_nospec() to the common snd_ctl_get_ioff*() helpers instead of each caller. These helpers are also referred from some drivers, too, and basically all usages are to calculate the array index from the user-space value, hence it's better to cover there. BugLink: https://marc.info/?l=linux-kernel&m=152411496503418&w=2Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: NTakashi Iwai <tiwai@suse.de>
-
由 Michael S. Tsirkin 提交于
Jason Wang points out that it's very hard for users to build an array of stat names. The naive thing is to use VIRTIO_BALLOON_S_NR but that breaks if we add more stats - as done e.g. recently by commit 6c64fe7f ("virtio_balloon: export hugetlb page allocation counts"). Let's add an array of reasonably readable names. Fixes: 6c64fe7f ("virtio_balloon: export hugetlb page allocation counts") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: NMichael S. Tsirkin <mst@redhat.com> Reviewed-by: NJonathan Helman <jonathan.helman@oracle.com>
-
由 Florian Fainelli 提交于
While adding support for ethtool::get_fecparam and set_fecparam, kernel doc for these functions was missed, add those. Fixes: 1a5f3da2 ("net: ethtool: add support for forward error correction modes") Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 4月, 2018 3 次提交
-
-
由 Joakim Tjernlund 提交于
Currently it is possible to read and/or write to suspend EB's. Writing /dev/mtdX or /dev/mtdblockX from several processes may break the flash state machine. Signed-off-by: NJoakim Tjernlund <joakim.tjernlund@infinera.com> Cc: <stable@vger.kernel.org> Reviewed-by: NRichard Weinberger <richard@nod.at> Signed-off-by: NBoris Brezillon <boris.brezillon@bootlin.com>
-
由 John Fastabend 提交于
Relying on map_release hook to decrement the reference counts when a map is removed only works if the map is not being pinned. In the pinned case the ref is decremented immediately and the BPF programs released. After this BPF programs may not be in-use which is not what the user would expect. This patch moves the release logic into bpf_map_put_uref() and brings sockmap in-line with how a similar case is handled in prog array maps. Fixes: 3d9e9526 ("bpf: sockmap, fix leaking maps with attached but not detached progs") Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Roman Gushchin 提交于
Running bpf programs requires disabled preemption, however at least some* of the BPF_PROG_RUN_ARRAY users do not follow this rule. To fix this bug, and also to make it not happen in the future, let's add explicit preemption disabling/re-enabling to the __BPF_PROG_RUN_ARRAY code. * for example: [ 17.624472] RIP: 0010:__cgroup_bpf_run_filter_sk+0x1c4/0x1d0 ... [ 17.640890] inet6_create+0x3eb/0x520 [ 17.641405] __sock_create+0x242/0x340 [ 17.641939] __sys_socket+0x57/0xe0 [ 17.642370] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 17.642944] SyS_socket+0xa/0x10 [ 17.643357] do_syscall_64+0x79/0x220 [ 17.643879] entry_SYSCALL_64_after_hwframe+0x42/0xb7 Signed-off-by: NRoman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Acked-by: NAlexei Starovoitov <ast@kernel.org> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
- 23 4月, 2018 5 次提交
-
-
由 Hans de Goede 提交于
Move the declarations of functions from vboxguest_utils.c which are only meant for vboxguest internal use from include/linux/vbox_utils.h to drivers/virt/vboxguest/vboxguest_core.h. Cc: stable@vger.kernel.org Signed-off-by: NHans de Goede <hdegoede@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Tetsuo Handa 提交于
syzbot is reporting kernel panic [1] triggered by memory allocation failure at tty_ldisc_get() from tty_ldisc_init(). But since both tty_ldisc_get() and caller of tty_ldisc_init() can cleanly handle errors, tty_ldisc_init() does not need to call panic() when tty_ldisc_get() failed. [1] https://syzkaller.appspot.com/bug?id=883431818e036ae6a9981156a64b821110f39187Signed-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jslaby@suse.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Daniel Kurtz 提交于
Commit 99492c39 ("earlycon: Fix __earlycon_table stride") tried to fix __earlycon_table stride by forcing the earlycon_id struct alignment to 32 and asking the linker to 32-byte align the __earlycon_table symbol. This fix was based on commit 07fca0e5 ("tracing: Properly align linker defined symbols") which tried a similar fix for the tracing subsystem. However, this fix doesn't quite work because there is no guarantee that gcc will place structures packed into an array format. In fact, gcc 4.9 chooses to 64-byte align these structs by inserting additional padding between the entries because it has no clue that they are supposed to be in an array. If we are unlucky, the linker will assign symbol "__earlycon_table" to a 32-byte aligned address which does not correspond to the 64-byte aligned contents of section "__earlycon_table". To address this same problem, the fix to the tracing system was subsequently re-implemented using a more robust table of pointers approach by commits: 3d56e331 ("tracing: Replace syscall_meta_data struct array with pointer array") 65498646 ("tracepoints: Fix section alignment using pointer array") e4a9ea5e ("tracing: Replace trace_event struct array with pointer array") Let's use this same "array of pointers to structs" approach for EARLYCON_TABLE. Fixes: 99492c39 ("earlycon: Fix __earlycon_table stride") Signed-off-by: NDaniel Kurtz <djkurtz@chromium.org> Suggested-by: NAaron Durbin <adurbin@chromium.org> Reviewed-by: NRob Herring <robh@kernel.org> Tested-by: NGuenter Roeck <groeck@chromium.org> Reviewed-by: NGuenter Roeck <groeck@chromium.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Alexander Aring 提交于
There is currently no handling to check on a invalid tlv length. This patch adds such handling to avoid killing the kernel with a malformed ife packet. Signed-off-by: NAlexander Aring <aring@mojatatu.com> Reviewed-by: NYotam Gigi <yotam.gi@gmail.com> Acked-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
The connection timers of an llc sock could be still flying after we delete them in llc_sk_free(), and even possibly after we free the sock. We could just wait synchronously here in case of troubles. Note, I leave other call paths as they are, since they may not have to wait, at least we can change them to synchronously when needed. Also, move the code to net/llc/llc_conn.c, which is apparently a better place. Reported-by: <syzbot+f922284c18ea23a8e457@syzkaller.appspotmail.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 4月, 2018 3 次提交
-
-
由 Andrey Konovalov 提交于
KASAN uses the __no_sanitize_address macro to disable instrumentation of particular functions. Right now it's defined only for GCC build, which causes false positives when clang is used. This patch adds a definition for clang. Note, that clang's revision 329612 or higher is required. [andreyknvl@google.com: remove redundant #ifdef CONFIG_KASAN check] Link: http://lkml.kernel.org/r/c79aa31a2a2790f6131ed607c58b0dd45dd62a6c.1523967959.git.andreyknvl@google.com Link: http://lkml.kernel.org/r/4ad725cc903f8534f8c8a60f0daade5e3d674f8d.1523554166.git.andreyknvl@google.comSigned-off-by: NAndrey Konovalov <andreyknvl@google.com> Acked-by: NAndrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: David Rientjes <rientjes@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Paul Lawrence <paullawrence@google.com> Cc: Sandipan Das <sandipan@linux.vnet.ibm.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Greg Thelen 提交于
lock_page_memcg()/unlock_page_memcg() use spin_lock_irqsave/restore() if the page's memcg is undergoing move accounting, which occurs when a process leaves its memcg for a new one that has memory.move_charge_at_immigrate set. unlocked_inode_to_wb_begin,end() use spin_lock_irq/spin_unlock_irq() if the given inode is switching writeback domains. Switches occur when enough writes are issued from a new domain. This existing pattern is thus suspicious: lock_page_memcg(page); unlocked_inode_to_wb_begin(inode, &locked); ... unlocked_inode_to_wb_end(inode, locked); unlock_page_memcg(page); If both inode switch and process memcg migration are both in-flight then unlocked_inode_to_wb_end() will unconditionally enable interrupts while still holding the lock_page_memcg() irq spinlock. This suggests the possibility of deadlock if an interrupt occurs before unlock_page_memcg(). truncate __cancel_dirty_page lock_page_memcg unlocked_inode_to_wb_begin unlocked_inode_to_wb_end <interrupts mistakenly enabled> <interrupt> end_page_writeback test_clear_page_writeback lock_page_memcg <deadlock> unlock_page_memcg Due to configuration limitations this deadlock is not currently possible because we don't mix cgroup writeback (a cgroupv2 feature) and memory.move_charge_at_immigrate (a cgroupv1 feature). If the kernel is hacked to always claim inode switching and memcg moving_account, then this script triggers lockup in less than a minute: cd /mnt/cgroup/memory mkdir a b echo 1 > a/memory.move_charge_at_immigrate echo 1 > b/memory.move_charge_at_immigrate ( echo $BASHPID > a/cgroup.procs while true; do dd if=/dev/zero of=/mnt/big bs=1M count=256 done ) & while true; do sync done & sleep 1h & SLEEP=$! while true; do echo $SLEEP > a/cgroup.procs echo $SLEEP > b/cgroup.procs done The deadlock does not seem possible, so it's debatable if there's any reason to modify the kernel. I suggest we should to prevent future surprises. And Wang Long said "this deadlock occurs three times in our environment", so there's more reason to apply this, even to stable. Stable 4.4 has minor conflicts applying this patch. For a clean 4.4 patch see "[PATCH for-4.4] writeback: safer lock nesting" https://lkml.org/lkml/2018/4/11/146 Wang Long said "this deadlock occurs three times in our environment" [gthelen@google.com: v4] Link: http://lkml.kernel.org/r/20180411084653.254724-1-gthelen@google.com [akpm@linux-foundation.org: comment tweaks, struct initialization simplification] Change-Id: Ibb773e8045852978f6207074491d262f1b3fb613 Link: http://lkml.kernel.org/r/20180410005908.167976-1-gthelen@google.com Fixes: 682aa8e1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Signed-off-by: NGreg Thelen <gthelen@google.com> Reported-by: NWang Long <wanglong19@meituan.com> Acked-by: NWang Long <wanglong19@meituan.com> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: <stable@vger.kernel.org> [v4.2+] Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
One of the classes of kernel stack content leaks[1] is exposing the contents of prior heap or stack contents when a new process stack is allocated. Normally, those stacks are not zeroed, and the old contents remain in place. In the face of stack content exposure flaws, those contents can leak to userspace. Fixing this will make the kernel no longer vulnerable to these flaws, as the stack will be wiped each time a stack is assigned to a new process. There's not a meaningful change in runtime performance; it almost looks like it provides a benefit. Performing back-to-back kernel builds before: Run times: 157.86 157.09 158.90 160.94 160.80 Mean: 159.12 Std Dev: 1.54 and after: Run times: 159.31 157.34 156.71 158.15 160.81 Mean: 158.46 Std Dev: 1.46 Instead of making this a build or runtime config, Andy Lutomirski recommended this just be enabled by default. [1] A noisy search for many kinds of stack content leaks can be seen here: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=linux+kernel+stack+leak I did some more with perf and cycle counts on running 100,000 execs of /bin/true. before: Cycles: 218858861551 218853036130 214727610969 227656844122 224980542841 Mean: 221015379122.60 Std Dev: 4662486552.47 after: Cycles: 213868945060 213119275204 211820169456 224426673259 225489986348 Mean: 217745009865.40 Std Dev: 5935559279.99 It continues to look like it's faster, though the deviation is rather wide, but I'm not sure what I could do that would be less noisy. I'm open to ideas! Link: http://lkml.kernel.org/r/20180221021659.GA37073@beastSigned-off-by: NKees Cook <keescook@chromium.org> Acked-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Laura Abbott <labbott@redhat.com> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 4月, 2018 2 次提交
-
-
由 Marc Zyngier 提交于
Although we've implemented PSCI 0.1, 0.2 and 1.0, we expose either 0.1 or 1.0 to a guest, defaulting to the latest version of the PSCI implementation that is compatible with the requested version. This is no different from doing a firmware upgrade on KVM. But in order to give a chance to hypothetical badly implemented guests that would have a fit by discovering something other than PSCI 0.2, let's provide a new API that allows userspace to pick one particular version of the API. This is implemented as a new class of "firmware" registers, where we expose the PSCI version. This allows the PSCI version to be save/restored as part of a guest migration, and also set to any supported version if the guest requires it. Cc: stable@vger.kernel.org #4.16 Reviewed-by: NChristoffer Dall <cdall@kernel.org> Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
-
由 Robert Kolchmeyer 提交于
fsnotify() acquires a reference to a fsnotify_mark_connector through the SRCU-protected pointer to_tell->i_fsnotify_marks. However, it appears that no precautions are taken in fsnotify_put_mark() to ensure that fsnotify() drops its reference to this fsnotify_mark_connector before assigning a value to its 'destroy_next' field. This can result in fsnotify_put_mark() assigning a value to a connector's 'destroy_next' field right before fsnotify() tries to traverse the linked list referenced by the connector's 'list' field. Since these two fields are members of the same union, this behavior results in a kernel panic. This issue is resolved by moving the connector's 'destroy_next' field into the object pointer union. This should work since the object pointer access is protected by both a spinlock and the value of the 'flags' field, and the 'flags' field is cleared while holding the spinlock in fsnotify_put_mark() before 'destroy_next' is updated. It shouldn't be possible for another thread to accidentally read from the object pointer after the 'destroy_next' field is updated. The offending behavior here is extremely unlikely; since fsnotify_put_mark() removes references to a connector (specifically, it ensures that the connector is unreachable from the inode it was formerly attached to) before updating its 'destroy_next' field, a sizeable chunk of code in fsnotify_put_mark() has to execute in the short window between when fsnotify() acquires the connector reference and saves the value of its 'list' field. On the HEAD kernel, I've only been able to reproduce this by inserting a udelay(1) in fsnotify(). However, I've been able to reproduce this issue without inserting a udelay(1) anywhere on older unmodified release kernels, so I believe it's worth fixing at HEAD. References: https://bugzilla.kernel.org/show_bug.cgi?id=199437 Fixes: 08991e83 CC: stable@vger.kernel.org Signed-off-by: NRobert Kolchmeyer <rkolchmeyer@google.com> Signed-off-by: NJan Kara <jack@suse.cz>
-
- 19 4月, 2018 7 次提交
-
-
由 Mathieu Poirier 提交于
Move CoreSight headers to the SPDX identifier. Signed-off-by: NMathieu Poirier <mathieu.poirier@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1524089118-27595-1-git-send-email-mathieu.poirier@linaro.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>
-
由 Ahbong Chang 提交于
Without this forward declaration compile may fail if this header is included only for registering other probe event without struct pool_workqueue. Link: http://lkml.kernel.org/r/20180416023626.139915-1-cwahbong@google.comReviewed-by: NTodd Poynor <toddpoynor@google.com> Signed-off-by: NAhbong Chang <cwahbong@google.com> Signed-off-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
-
由 Arend van Spriel 提交于
Upon submitting a patch for mwifiex [1] it was discussed whether this callback function could fail. To keep things simple there is no need for the error code so the driver can do the task synchronous or not without worries. Currently the device driver core already ignores the return value so changing it to void. [1] https://patchwork.kernel.org/patch/10231933/Signed-off-by: NArend van Spriel <aspriel@gmail.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Bart Van Assche 提交于
Since SCSI scanning occurs asynchronously, since sd_revalidate_disk() is called from sd_probe_async() and since sd_revalidate_disk() calls sd_zbc_read_zones() it can happen that sd_zbc_read_zones() is called concurrently with blkdev_report_zones() and/or blkdev_reset_zones(). That can cause these functions to fail with -EIO because sd_zbc_read_zones() e.g. sets q->nr_zones to zero before restoring it to the actual value, even if no drive characteristics have changed. Avoid that this can happen by making the following changes: - Protect the code that updates zone information with blk_queue_enter() and blk_queue_exit(). - Modify sd_zbc_setup_seq_zones_bitmap() and sd_zbc_setup() such that these functions do not modify struct scsi_disk before all zone information has been obtained. Note: since commit 055f6e18 ("block: Make q_usage_counter also track legacy requests"; kernel v4.15) the request queue freezing mechanism also affects legacy request queues. Fixes: 89d94756 ("sd: Implement support for ZBC devices") Signed-off-by: NBart Van Assche <bart.vanassche@wdc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: stable@vger.kernel.org # v4.16 Reviewed-by: NDamien Le Moal <damien.lemoal@wdc.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Ohad Sharabi 提交于
Add UFS Protocol Information Units(upiu) trace events for ufs driver, used to trace various ufs transaction types- command, task-management and device management. The trace-point format is generic and can be easily adapted to trace other upius if needed. Currently tracing ufs transaction of type 'device management', which this patch introduce, cannot be obtained from any other trace. Device management transactions are used for communication with the device such as reading and writing descriptor or attributes etc. Signed-off-by: NOhad Sharabi <ohad.sharabi@sandisk.com> Reviewed-by: NStanislav Nijnikov <stanislav.nijnikov@wdc.com> Reviewed-by: NBart Van Assche <bart.vanassche@wdc.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 John Pittman 提交于
In commit 21045519 ("scsi: use per-cpu buffer for formatting sense"), function scsi_show_extd_sense() was removed, switching use over to scsi_format_extd_sense(). Remove last reference to scsi_show_extd_sense() in include/scsi/scsi_dbg.h. Signed-off-by: NJohn Pittman <jpittman@redhat.com> Reviewed-by: NBart Van Assche <bart.vanassche@wdc.com> Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
-
由 Dave Gerlach 提交于
The sleep33xx and sleep43xx files should not depend on a header file generated in drivers/memory. Remove this dependency and instead allow both drivers/memory and arch/arm/mach-omap2 to generate all macros needed in headers local to their own paths. This fixes an issue where the build fail will when using O= to set a split object directory and arch/arm/mach-omap2 is built before drivers/memory with the following error: .../drivers/memory/emif-asm-offsets.c:1:0: fatal error: can't open drivers/memory/emif-asm-offsets.s for writing: No such file or directory compilation terminated. Fixes: 41d9d44d ("ARM: OMAP2+: pm33xx-core: Add platform code needed for PM") Reviewed-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: NDave Gerlach <d-gerlach@ti.com> Acked-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NTony Lindgren <tony@atomide.com>
-
- 18 4月, 2018 2 次提交
-
-
由 Dave Chinner 提交于
So we can check FUA support status from the iomap direct IO code. Reviewed-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDave Chinner <dchinner@redhat.com> Signed-off-by: NJens Axboe <axboe@kernel.dk>
-
由 Toshiaki Makita 提交于
Syzkaller spotted an old bug which leads to reading skb beyond tail by 4 bytes on vlan tagged packets. This is caused because skb_vlan_tagged_multi() did not check skb_headlen. BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283 [inline] BUG: KMSAN: uninit-value in skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672 [inline] BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline] BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 eth_type_vlan include/linux/if_vlan.h:283 [inline] skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] vlan_features_check include/linux/if_vlan.h:672 [inline] dflt_features_check net/core/dev.c:2949 [inline] netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084 __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43ffa9 RSP: 002b:00007fff2cff3948 EFLAGS: 00000217 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ffa9 RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004018d0 R13: 0000000000401960 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085 packet_alloc_skb net/packet/af_packet.c:2803 [inline] packet_snd net/packet/af_packet.c:2894 [inline] packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 58e998c6 ("offloading: Force software GSO for multiple vlan tags.") Reported-and-tested-by: syzbot+0bbe42c764feafa82c5a@syzkaller.appspotmail.com Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-