- 15 5月, 2017 1 次提交
-
-
由 Christoph Hellwig 提交于
Pass struct rpc_request as the first argument instead of an untyped blob, and mark the data object as const. Signed-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJeff Layton <jlayton@redhat.com>
-
- 10 5月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
syzkaller found a way to trigger double frees from ip_mc_drop_socket() It turns out that leave a copy of parent mc_list at accept() time, which is very bad. Very similar to commit 8b485ce6 ("tcp: do not inherit fastopen_req from parent") Initial report from Pray3r, completed by Andrey one. Thanks a lot to them ! Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NPray3r <pray3r.z@gmail.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 5月, 2017 14 次提交
-
-
由 Kees Cook 提交于
Instead of a direct cross-type cast, use conatiner_of() to locate the embedded structure, even in the face of future struct layout randomization. Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David S. Miller 提交于
This reverts commit 82486aa6. As implemented, this causes dangling netdevice refs. Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vlastimil Babka 提交于
We now have memalloc_noreclaim_{save,restore} helpers for robust setting and clearing of PF_MEMALLOC. Let's convert the code which was using the generic tsk_restore_flags(). No functional change. [vbabka@suse.cz: in net/core/sock.c the hunk is missing] Link: http://lkml.kernel.org/r/20170405074700.29871-4-vbabka@suse.czSigned-off-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: NMichal Hocko <mhocko@suse.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Wouter Verhelst <w@uter.be> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Deepa Dinamani 提交于
CURRENT_TIME is not y2038 safe. The macro will be deleted and all the references to it will be replaced by ktime_get_* apis. struct timespec is also not y2038 safe. Retain timespec for timestamp representation here as ceph uses it internally everywhere. These references will be changed to use struct timespec64 in a separate patch. The current_fs_time() api is being changed to use vfs struct inode* as an argument instead of struct super_block*. Set the new mds client request r_stamp field using ktime_get_real_ts() instead of using current_fs_time(). Also, since r_stamp is used as mtime on the server, use timespec_trunc() to truncate the timestamp, using the right granularity from the superblock. This api will be transitioned to be y2038 safe along with vfs. Link: http://lkml.kernel.org/r/1491613030-11599-5-git-send-email-deepa.kernel@gmail.comSigned-off-by: NDeepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: NArnd Bergmann <arnd@arndb.de> M: Ilya Dryomov <idryomov@gmail.com> M: "Yan, Zheng" <zyan@redhat.com> M: Sage Weil <sage@redhat.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Kees Cook 提交于
While examining output from trial builds with -Wformat-security enabled, many strings were found that should be defined as "const", or as a char array instead of char pointer. This makes some static analysis easier, by producing fewer false positives. As these are all trivial changes, it seemed best to put them all in a single patch rather than chopping them up per maintainer. Link: http://lkml.kernel.org/r/20170405214711.GA5711@beastSigned-off-by: NKees Cook <keescook@chromium.org> Acked-by: Jes Sorensen <jes@trained-monkey.org> [runner.c] Cc: Tony Lindgren <tony@atomide.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Sean Paul <seanpaul@chromium.org> Cc: David Airlie <airlied@linux.ie> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jiri Slaby <jslaby@suse.com> Cc: Patrice Chotard <patrice.chotard@st.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Jarod Wilson <jarod@redhat.com> Cc: Florian Westphal <fw@strlen.de> Cc: Antonio Quartulli <a@unstable.cc> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Kejian Yan <yankejian@huawei.com> Cc: Daode Huang <huangdaode@hisilicon.com> Cc: Qianqian Xie <xieqianqian@huawei.com> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: Colin Ian King <colin.king@canonical.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Christian Gromm <christian.gromm@microchip.com> Cc: Andrey Shvetsov <andrey.shvetsov@k2l.de> Cc: Jason Litzinger <jlitzingerdev@gmail.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
__vmalloc* allows users to provide gfp flags for the underlying allocation. This API is quite popular $ git grep "=[[:space:]]__vmalloc\|return[[:space:]]*__vmalloc" | wc -l 77 The only problem is that many people are not aware that they really want to give __GFP_HIGHMEM along with other flags because there is really no reason to consume precious lowmemory on CONFIG_HIGHMEM systems for pages which are mapped to the kernel vmalloc space. About half of users don't use this flag, though. This signals that we make the API unnecessarily too complex. This patch simply uses __GFP_HIGHMEM implicitly when allocating pages to be mapped to the vmalloc space. Current users which add __GFP_HIGHMEM are simplified and drop the flag. Link: http://lkml.kernel.org/r/20170307141020.29107-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: NMatthew Wilcox <mawilcox@microsoft.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: Cristopher Lameter <cl@linux.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc with vmalloc fallback. Use the kvmalloc variant instead. Keep the __GFP_REPEAT flag based on explanation from Eric: "At the time, tests on the hardware I had in my labs showed that vmalloc() could deliver pages spread all over the memory and that was a small penalty (once memory is fragmented enough, not at boot time)" The way how the code is constructed means, however, that we prefer to go and hit the OOM killer before we fall back to the vmalloc for requests <=32kB (with 4kB pages) in the current code. This is rather disruptive for something that can be achived with the fallback. On the other hand __GFP_REPEAT doesn't have any useful semantic for these requests. So the effect of this patch is that requests which fit into 32kB will fall back to vmalloc easier now. Link: http://lkml.kernel.org/r/20170306103327.2766-3-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Eric Dumazet <edumazet@google.com> Cc: David Miller <davem@davemloft.net> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
There are many code paths opencoding kvmalloc. Let's use the helper instead. The main difference to kvmalloc is that those users are usually not considering all the aspects of the memory allocator. E.g. allocation requests <= 32kB (with 4kB pages) are basically never failing and invoke OOM killer to satisfy the allocation. This sounds too disruptive for something that has a reasonable fallback - the vmalloc. On the other hand those requests might fallback to vmalloc even when the memory allocator would succeed after several more reclaim/compaction attempts previously. There is no guarantee something like that happens though. This patch converts many of those places to kv[mz]alloc* helpers because they are more conservative. Link: http://lkml.kernel.org/r/20170306103327.2766-2-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> # Xen bits Acked-by: NKees Cook <keescook@chromium.org> Acked-by: NVlastimil Babka <vbabka@suse.cz> Acked-by: Andreas Dilger <andreas.dilger@intel.com> # Lustre Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> # KVM/s390 Acked-by: Dan Williams <dan.j.williams@intel.com> # nvdim Acked-by: David Sterba <dsterba@suse.com> # btrfs Acked-by: Ilya Dryomov <idryomov@gmail.com> # Ceph Acked-by: Tariq Toukan <tariqt@mellanox.com> # mlx4 Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx5 Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Tony Luck <tony.luck@intel.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Santosh Raspatur <santosh@chelsio.com> Cc: Hariprasad S <hariprasad@chelsio.com> Cc: Yishai Hadas <yishaih@mellanox.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: "Yan, Zheng" <zyan@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Michal Hocko 提交于
alloc_ila_locks seemed to c&p from alloc_bucket_locks allocation pattern which is quite unusual. The default allocation size is 320 * sizeof(spinlock_t) which is sub page unless lockdep is enabled when the performance benefit is really questionable and not worth the subtle code IMHO. Also note that the context when we call ila_init_net (modprobe or a task creating a net namespace) has to be properly configured. Let's just simplify the code and use kvmalloc helper which is a transparent way to use kmalloc with vmalloc fallback. Link: http://lkml.kernel.org/r/20170306103032.2540-5-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: NVlastimil Babka <vbabka@suse.cz> Cc: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 WANG Cong 提交于
For each netns (except init_net), we initialize its null entry in 3 places: 1) The template itself, as we use kmemdup() 2) Code around dst_init_metrics() in ip6_route_net_init() 3) ip6_route_dev_notify(), which is supposed to initialize it after loopback registers Unfortunately the last one still happens in a wrong order because we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to net->loopback_dev's idev, thus we have to do that after we add idev to loopback. However, this notifier has priority == 0 same as ipv6_dev_notf, and ipv6_dev_notf is registered after ip6_route_dev_notifier so it is called actually after ip6_route_dev_notifier. This is similar to commit 2f460933 ("ipv6: initialize route null entry in addrconf_init()") which fixes init_net. Fix it by picking a smaller priority for ip6_route_dev_notifier. Also, we have to release the refcnt accordingly when unregistering loopback_dev because device exit functions are called before subsys exit functions. Acked-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hangbin Liu 提交于
Signed-off-by: NHangbin Liu <liuhangbin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vlad Yasevich 提交于
Vlan devices, like all other software devices, enable NETIF_F_HW_CSUM feature. However, unlike all the othe other software devices, vlans will switch to using IP|IPV6_CSUM features, if the underlying devices uses them. In these situations, checksum offload features on the vlan device can't be controlled via ethtool. This patch makes vlans keep HW_CSUM feature if the underlying device supports checksum offloading. This makes vlan devices behave like other software devices, and restores control to the user. A side-effect is that some offload settings (typically UFO) may be enabled on the vlan device while being disabled on the HW. However, the GSO code will correctly process the packets. This actually results in slightly better raw throughput. Signed-off-by: NVladislav Yasevich <vyasevic@redhat.com> Acked-by: NAlexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Wang 提交于
Congestion control modules that want full control over congestion control behavior do not want the cwnd modifications controlled by the sysctl_tcp_slow_start_after_idle code path. So skip those code paths for CC modules that use the cong_control() API. As an example, those cwnd effects are not desired for the BBR congestion control algorithm. Fixes: c0402760 ("tcp: new CC hook to set sending rate with rate_sample in any CA state") Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
IPv4 dst could use fi->fib_metrics to store metrics but fib_info itself is refcnt'ed, so without taking a refcnt fi and fi->fib_metrics could be freed while dst metrics still points to it. This triggers use-after-free as reported by Andrey twice. This patch reverts commit 2860583f ("ipv4: Kill rt->fi") to restore this reference counting. It is a quick fix for -net and -stable, for -net-next, as Eric suggested, we can consider doing reference counting for metrics itself instead of relying on fib_info. IPv6 is very different, it copies or steals the metrics from mx6_config in fib6_commit_metrics() so probably doesn't need a refcnt. Decnet has already done the refcnt'ing, see dn_fib_semantic_match(). Fixes: 2860583f ("ipv4: Kill rt->fi") Reported-by: NAndrey Konovalov <andreyknvl@google.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 5月, 2017 3 次提交
-
-
由 Johannes Berg 提交于
When VHT IBSS support was added, the size of the extra elements wasn't considered in ieee80211_ibss_build_presp(), which makes it possible that it would overrun the allocated buffer. Fix it by allocating the necessary space. Fixes: abcff6ef ("mac80211: add VHT support for IBSS") Reported-by: NShaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Johannes Berg 提交于
Since groups 0 and 63 are invalid, we should check for those bits. Note that the 802.11 spec specifies the *bit* order, but the CPU doesn't care about bit order since it can't address bits, so it's always treating BIT(0) as the lowest bit within a byte. Reported-by: NJan Fuchs <jan.fuchs@lancom.de> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
由 Luca Coelho 提交于
If ieee80211_hw_restart() is called during authentication, the authentication process will continue, causing the driver to be called in a wrong state. This ultimately causes an oops in the iwlwifi driver (at least). This fixes bugzilla 195299 partly. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195299Signed-off-by: NLuca Coelho <luciano.coelho@intel.com> Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
-
- 06 5月, 2017 1 次提交
-
-
由 Eric Dumazet 提交于
Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: 95a22cae ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: NEric Dumazet <edumazet@google.com> Reviewed-by: NFlorian Westphal <fw@strlen.de> Tested-by: NFlorian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 5月, 2017 2 次提交
-
-
由 Tobias Klauser 提交于
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size() when they were added. Do so now and also add the corresponding policy entries: Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Mike Manning <mmanning@brocade.com> Fixes: b6cb5ac8 ("net: bridge: add per-port multicast flood flag") Fixes: 99f906e9 ("bridge: add per-port broadcast flood flag") Signed-off-by: NTobias Klauser <tklauser@distanz.ch> Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev since it is always NULL. This is clearly wrong, we have code to initialize it to loopback_dev, unfortunately the order is still not correct. loopback_dev is registered very early during boot, we lose a chance to re-initialize it in notifier. addrconf_init() is called after ip6_route_init(), which means we have no chance to correct it. Fix it by moving this initialization explicitly after ipv6_add_dev(init_net.loopback_dev) in addrconf_init(). Reported-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 5月, 2017 14 次提交
-
-
由 Michal Schmidt 提交于
IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0. Otherwise libnl3 fails to validate netlink messages with this attribute. "ip -detail a" assumes too that the attribute is NUL-terminated when printing it. It often was, due to padding. I noticed this as libvirtd failing to start on a system with sfc driver after upgrading it to Linux 4.11, i.e. when sfc added support for phys_port_name. Signed-off-by: NMichal Schmidt <mschmidt@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Potapenko 提交于
raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied from the userspace contains the IPv4/IPv6 header, so if too few bytes are copied, parts of the header may remain uninitialized. This bug has been detected with KMSAN. For the record, the KMSAN report: ================================================================== BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0 inter: 0 CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078 __kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510 nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577 ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn ./include/linux/netfilter.h:102 nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310 nf_hook ./include/linux/netfilter.h:212 NF_HOOK ./include/linux/netfilter.h:255 rawv6_send_hdrinc net/ipv6/raw.c:673 rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x436e03 RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000 origin: 00000000d9400053 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270 slab_alloc_node mm/slub.c:2735 __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678 sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903 sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920 rawv6_send_hdrinc net/ipv6/raw.c:638 rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 ================================================================== , triggered by the following syscalls: socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3 sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket instead of a PF_INET6 one. Signed-off-by: NAlexander Potapenko <glider@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Colin Ian King 提交于
head is previously null checked and so the 2nd null check on head is redundant and therefore can be removed. Detected by CoverityScan, CID#1399505 ("Logically dead code") Signed-off-by: NColin Ian King <colin.king@canonical.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Under fuzzer stress, it is possible that a child gets a non NULL fastopen_req pointer from its parent at accept() time, when/if parent morphs from listener to active session. We need to make sure this can not happen, by clearing the field after socket cloning. BUG: Double free or freeing an invalid pointer Unexpected shadow byte: 0xFB CPU: 3 PID: 20933 Comm: syz-executor3 Not tainted 4.11.0+ #306 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x292/0x395 lib/dump_stack.c:52 kasan_object_err+0x1c/0x70 mm/kasan/report.c:164 kasan_report_double_free+0x5c/0x70 mm/kasan/report.c:185 kasan_slab_free+0x9d/0xc0 mm/kasan/kasan.c:580 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 inet_child_forget+0xb8/0x600 net/ipv4/inet_connection_sock.c:898 inet_csk_reqsk_queue_add+0x1e7/0x250 net/ipv4/inet_connection_sock.c:928 tcp_get_cookie_sock+0x21a/0x510 net/ipv4/syncookies.c:217 cookie_v4_check+0x1a19/0x28b0 net/ipv4/syncookies.c:384 tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1384 [inline] tcp_v4_do_rcv+0x731/0x940 net/ipv4/tcp_ipv4.c:1421 tcp_v4_rcv+0x2dc0/0x31c0 net/ipv4/tcp_ipv4.c:1715 ip_local_deliver_finish+0x4cc/0xc20 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:257 [inline] ip_local_deliver+0x1ce/0x700 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:492 [inline] ip_rcv_finish+0xb1d/0x20b0 net/ipv4/ip_input.c:396 NF_HOOK include/linux/netfilter.h:257 [inline] ip_rcv+0xd8c/0x19c0 net/ipv4/ip_input.c:487 __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4210 __netif_receive_skb+0x2a/0x1a0 net/core/dev.c:4248 process_backlog+0xe5/0x6c0 net/core/dev.c:4868 napi_poll net/core/dev.c:5270 [inline] net_rx_action+0xe70/0x18e0 net/core/dev.c:5335 __do_softirq+0x2fb/0xb99 kernel/softirq.c:284 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:899 </IRQ> do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328 do_softirq kernel/softirq.c:176 [inline] __local_bh_enable_ip+0x1cf/0x1e0 kernel/softirq.c:181 local_bh_enable include/linux/bottom_half.h:31 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:931 [inline] ip_finish_output2+0x9ab/0x15e0 net/ipv4/ip_output.c:230 ip_finish_output+0xa35/0xdf0 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f6/0x7b0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_queue_xmit+0x9a8/0x1a10 net/ipv4/ip_output.c:503 tcp_transmit_skb+0x1ade/0x3470 net/ipv4/tcp_output.c:1057 tcp_write_xmit+0x79e/0x55b0 net/ipv4/tcp_output.c:2265 __tcp_push_pending_frames+0xfa/0x3a0 net/ipv4/tcp_output.c:2450 tcp_push+0x4ee/0x780 net/ipv4/tcp.c:683 tcp_sendmsg+0x128d/0x39b0 net/ipv4/tcp.c:1342 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x446059 RSP: 002b:00007faa6761fb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 0000000000446059 RDX: 0000000000000001 RSI: 0000000020ba3fcd RDI: 0000000000000017 RBP: 00000000006e40a0 R08: 0000000020ba4ff0 R09: 0000000000000010 R10: 0000000020000000 R11: 0000000000000282 R12: 0000000000708150 R13: 0000000000000000 R14: 00007faa676209c0 R15: 00007faa67620700 Object at ffff88003b5bbcb8, in cache kmalloc-64 size: 64 Allocated: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745 kmalloc include/linux/slab.h:490 [inline] kzalloc include/linux/slab.h:663 [inline] tcp_sendmsg_fastopen net/ipv4/tcp.c:1094 [inline] tcp_sendmsg+0x221a/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 __inet_stream_connect+0x20c/0xf90 net/ipv4/af_inet.c:593 tcp_sendmsg_fastopen net/ipv4/tcp.c:1111 [inline] tcp_sendmsg+0x23a8/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: e994b2f0 ("tcp: do not lock listener to process SYN packets") Fixes: 7db92362 ("tcp: fix potential double free issue for fastopen_req") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NAndrey Konovalov <andreyknvl@google.com> Acked-by: NWei Wang <weiwan@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ilya Dryomov 提交于
As we no longer release the lock before potentially raising BLACKLISTED in rbd_reregister_watch(), the "either locked or blacklisted" assert in rbd_queue_workfn() needs to go: we can be both locked and blacklisted at that point now. Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Reviewed-by: NJason Dillaman <dillaman@redhat.com>
-
由 Jeff Layton 提交于
Cephfs can get cap update requests that contain a new epoch barrier in them. When that happens we want to pause all OSD traffic until the right map epoch arrives. Add an epoch_barrier field to ceph_osd_client that is protected by the osdc->lock rwsem. When the barrier is set, and the current OSD map epoch is below that, pause the request target when submitting the request or when revisiting it. Add a way for upper layers (cephfs) to update the epoch_barrier as well. If we get a new map, compare the new epoch against the barrier before kicking requests and request another map if the map epoch is still lower than the one we want. If we get a map with a full pool, or at quota condition, then set the barrier to the current epoch value. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Jeff Layton 提交于
When a Ceph volume hits capacity, a flag is set in the OSD map to indicate that, and a new map is sprayed around the cluster. With cephfs we want it to shut down any abortable requests that are in progress with an -ENOSPC error as they'd just hang otherwise. Add a new ceph_osdc_abort_on_full helper function to handle this. It will first check whether there is an out-of-space condition in the cluster and then walk the tree and abort any request that has r_abort_on_full set with a -ENOSPC error. Call this new function directly whenever we get a new OSD map. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Jeff Layton 提交于
Usually, when the osd map is flagged as full or the pool is at quota, write requests just hang. This is not what we want for cephfs, where it would be better to simply report -ENOSPC back to userland instead of stalling. If the caller knows that it will want an immediate error return instead of blocking on a full or at-quota error condition then allow it to set a flag to request that behavior. Set that flag in ceph_osdc_new_request (since ceph.ko is the only caller), and on any other write request from ceph.ko. A later patch will deal with requests that were submitted before the new map showing the full condition came in. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Jeff Layton 提交于
Nothing uses this anymore with the removal of the ack vs. commit code. Remove the field and just encode zeroes into place in the request encoding. Signed-off-by: NJeff Layton <jlayton@redhat.com> Reviewed-by: NIlya Dryomov <idryomov@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Elena Reshetova 提交于
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Elena Reshetova 提交于
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Elena Reshetova 提交于
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: NElena Reshetova <elena.reshetova@intel.com> Signed-off-by: NHans Liljestrand <ishkamiel@gmail.com> Signed-off-by: NKees Cook <keescook@chromium.org> Signed-off-by: NDavid Windsor <dwindsor@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
Add a readonly, exported to sysfs module parameter so that userspace can generate meaningful error messages. It's a bit funky, but there is no other libceph-specific place. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
由 Ilya Dryomov 提交于
No reason to hide CephFS-specific features in the rbd case. Recent feature bits mix RADOS and CephFS-specific stuff together anyway. Signed-off-by: NIlya Dryomov <idryomov@gmail.com>
-
- 03 5月, 2017 4 次提交
-
-
由 Geert Uytterhoeven 提交于
If gcc (e.g. 4.1.2) decides not to inline total_extension_size(), the build will fail with: net/built-in.o: In function `nf_conntrack_init_start': (.text+0x9baf6): undefined reference to `__compiletime_assert_1893' or ERROR: "__compiletime_assert_1893" [net/netfilter/nf_conntrack.ko] undefined! Fix this by forcing inlining of total_extension_size(). Fixes: b3a5db10 ("netfilter: conntrack: use u8 for extension sizes again") Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Andrey reported a warning triggered by the rcu code: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289 debug_print_object+0x175/0x210 ODEBUG: activate active (active state 1) object type: rcu_head hint: (null) Modules linked in: CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x192/0x22d lib/dump_stack.c:52 __warn+0x19f/0x1e0 kernel/panic.c:549 warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564 debug_print_object+0x175/0x210 lib/debugobjects.c:286 debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442 debug_rcu_head_queue kernel/rcu/rcu.h:75 __call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229 call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288 rt6_rcu_free net/ipv6/ip6_fib.c:158 rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188 fib6_del_route net/ipv6/ip6_fib.c:1461 fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500 __ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174 ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187 __ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520 addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672 ... Andrey's reproducer program runs in a very tight loop, calling 'unshare -n' and then spawning 2 sets of 14 threads running random ioctl calls. The relevant networking sequence: 1. New network namespace created via unshare -n - ip6tnl0 device is created in down state 2. address added to ip6tnl0 - equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1 - DAD is started on the address and when it completes the host route is inserted into the FIB 3. ip6tnl0 is brought up - the new fixup_permanent_addr function restarts DAD on the address 4. exit namespace - teardown / cleanup sequence starts - once in a blue moon, lo teardown appears to happen BEFORE teardown of ip6tunl0 + down on 'lo' removes the host route from the FIB since the dst->dev for the route is loobback + host route added to rcu callback list * rcu callback has not run yet, so rt is NOT on the gc list so it has NOT been marked obsolete 5. in parallel to 4. worker_thread runs addrconf_dad_completed - DAD on the address on ip6tnl0 completes - calls ipv6_ifa_notify which inserts the host route All of that happens very quickly. The result is that a host route that has been deleted from the IPv6 FIB and added to the RCU list is re-inserted into the FIB. The exit namespace eventually gets to cleaning up ip6tnl0 which removes the host route from the FIB again, calls the rcu function for cleanup -- and triggers the double rcu trace. The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably, DAD should not be started in step 2. The interface is in the down state, so it can not really send out requests for the address which makes starting DAD pointless. Since the second DAD was introduced by a recent change, seems appropriate to use it for the Fixes tag and have the fixup function only start DAD for addresses in the PREDAD state which occurs in addrconf_ifdown if the address is retained. Big thanks to Andrey for isolating a reliable reproducer for this problem. Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional") Reported-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid Ahern <dsahern@gmail.com> Tested-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
If no NLM_F_EXCL is set and the element already exists in the set, make sure that both elements have the same extensions. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Michael S. Tsirkin 提交于
We are going to add more parameters to find_vqs, let's wrap the call so we don't need to tweak all drivers every time. Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
-