- 09 2月, 2018 1 次提交
-
-
由 David Howells 提交于
Don't put buffers of data to be handed to crypto on the stack as this may cause an assertion failure in the kernel (see below). Fix this by using an kmalloc'd buffer instead. kernel BUG at ./include/linux/scatterlist.h:147! ... RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc] RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028 RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88 RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95 R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08 FS: 0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxkad_respond_to_challenge+0x297/0x330 [rxrpc] rxrpc_process_connection+0xd1/0x690 [rxrpc] ? process_one_work+0x1c3/0x680 ? __lock_is_held+0x59/0xa0 process_one_work+0x249/0x680 worker_thread+0x3a/0x390 ? process_one_work+0x680/0x680 kthread+0x121/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 Reported-by: NJonathan Billings <jsbillings@jsbillings.org> Reported-by: NMarc Dionne <marc.dionne@auristor.com> Signed-off-by: NDavid Howells <dhowells@redhat.com> Tested-by: NJonathan Billings <jsbillings@jsbillings.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 2月, 2018 5 次提交
-
-
由 Song Liu 提交于
tracepoint tcp_send_reset requires a full socket to work. However, it may be called when in TCP_TIME_WAIT: case TCP_TW_RST: tcp_v6_send_reset(sk, skb); inet_twsk_deschedule_put(inet_twsk(sk)); goto discard_it; To avoid this problem, this patch checks the socket with sk_fullsock() before calling trace_tcp_send_reset(). Fixes: c24b14c4 ("tcp: add tracepoint trace_tcp_send_reset") Signed-off-by: NSong Liu <songliubraving@fb.com> Reviewed-by: NLawrence Brakmo <brakmo@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Md. Islam 提交于
In Kernel 4.15.0+, Netem does not work properly. Netem setup: tc qdisc add dev h1-eth0 root handle 1: netem delay 10ms 2ms Result: PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data. 64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=22.8 ms 64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=10.9 ms 64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=10.9 ms 64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=11.4 ms 64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=4303 ms 64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.2 ms 64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.3 ms 64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=4304 ms 64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=4303 ms Patch: (rnd % (2 * sigma)) - sigma was overflowing s32. After applying the patch, I found following output which is desirable. PING 172.16.101.2 (172.16.101.2) 56(84) bytes of data. 64 bytes from 172.16.101.2: icmp_seq=1 ttl=64 time=21.1 ms 64 bytes from 172.16.101.2: icmp_seq=2 ttl=64 time=8.46 ms 64 bytes from 172.16.101.2: icmp_seq=3 ttl=64 time=9.00 ms 64 bytes from 172.16.101.2: icmp_seq=4 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=5 ttl=64 time=8.36 ms 64 bytes from 172.16.101.2: icmp_seq=6 ttl=64 time=11.8 ms 64 bytes from 172.16.101.2: icmp_seq=7 ttl=64 time=8.11 ms 64 bytes from 172.16.101.2: icmp_seq=8 ttl=64 time=10.0 ms 64 bytes from 172.16.101.2: icmp_seq=9 ttl=64 time=11.3 ms 64 bytes from 172.16.101.2: icmp_seq=10 ttl=64 time=11.5 ms 64 bytes from 172.16.101.2: icmp_seq=11 ttl=64 time=10.2 ms Reviewed-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Because of differences in how ipv4 and ipv6 handle fib lookups, verification of nexthops with onlink flag need to default to the main table rather than the local table used by IPv4. As it stands an address within a connected route on device 1 can be used with onlink on device 2. Updating the table properly rejects the route due to the egress device mismatch. Update the extack message as well to show it could be a device mismatch for the nexthop spec. Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Verification of nexthops with onlink flag need to handle unreachable routes. The lookup is only intended to validate the gateway address is not a local address and if the gateway resolves the egress device must match the given device. Hence, hitting any default reject route is ok. Fixes: fc1e64e1 ("net/ipv6: Add support for onlink flag") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Howells 提交于
AF_RXRPC is incorrectly sending back to the server any abort it receives for a client connection. This is due to the final-ACK offload to the connection event processor patch. The abort code is copied into the last-call information on the connection channel and then the event processor is set. Instead, the following should be done: (1) In the case of a final-ACK for a successful call, the ACK should be scheduled as before. (2) In the case of a locally generated ABORT, the ABORT details should be cached for sending in response to further packets related to that call and no further action scheduled at call disconnect time. (3) In the case of an ACK received from the peer, the call should be considered dead, no ABORT should be transmitted at this time. In response to further non-ABORT packets from the peer relating to this call, an RX_USER_ABORT ABORT should be transmitted. (4) In the case of a call killed due to network error, an RX_USER_ABORT ABORT should be cached for transmission in response to further packets, but no ABORT should be sent at this time. Fixes: 3136ef49 ("rxrpc: Delay terminal ACK transmission on a client call") Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 2月, 2018 10 次提交
-
-
由 Felix Fietkau 提交于
flow_offload_del frees the flow, so all associated resource must be freed before. Since the ct entry in struct flow_offload_entry was allocated by flow_offload_alloc, it should be freed by flow_offload_free to take care of the error handling path when flow_offload_add fails. While at it, make flow_offload_del static, since it should never be called directly, only from the gc step Signed-off-by: NFelix Fietkau <nbd@nbd.name> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Masahiro Yamada 提交于
This macro is only used by net/ipv6/mcast.c, but there is no reason why it must be BUILD_BUG_ON_NULL(). Replace it with BUILD_BUG_ON_ZERO(), and remove BUILD_BUG_ON_NULL() definition from <linux/build_bug.h>. Link: http://lkml.kernel.org/r/1515121833-3174-3-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Yury Norov 提交于
with bitmap_{from,to}_arr32 over the kernel. Additionally to it: * __check_eq_bitmap() now takes single nbits argument. * __check_eq_u32_array is not used in new test but may be used in future. So I don't remove it here, but annotate as __used. Tested on arm64 and 32-bit BE mips. [arnd@arndb.de: perf: arm_dsu_pmu: convert to bitmap_from_arr32] Link: http://lkml.kernel.org/r/20180201172508.5739-2-ynorov@caviumnetworks.com [ynorov@caviumnetworks.com: fix net/core/ethtool.c] Link: http://lkml.kernel.org/r/20180205071747.4ekxtsbgxkj5b2fz@yury-thinkpad Link: http://lkml.kernel.org/r/20171228150019.27953-2-ynorov@caviumnetworks.comSigned-off-by: NYury Norov <ynorov@caviumnetworks.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: David Decotigny <decot@googlers.com>, Cc: David S. Miller <davem@davemloft.net>, Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
由 Pablo Neira Ayuso 提交于
Every flow_offload entry is added into the table twice. Because of this, rhashtable_free_and_destroy can't be used, since it would call kfree for each flow_offload object twice. This patch cleans up the flowtable via nf_flow_table_iterate() to schedule removal of entries by setting on the dying bit, then there is an explicitly invocation of the garbage collector to release resources. Based on patch from Felix Fietkau. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
Move the flowtable cleanup routines to nf_flow_table and expose the nf_flow_table_cleanup() helper function. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Cong Wang 提交于
rateest_hash is supposed to be protected by xt_rateest_mutex, and, as suggested by Eric, lookup and insert should be atomic, so we should acquire the xt_rateest_mutex once for both. So introduce a non-locking helper for internal use and keep the locking one for external. Reported-by: <syzbot+5cb189720978275e4c75@syzkaller.appspotmail.com> Fixes: 5859034d ("[NETFILTER]: x_tables: add RATEEST target") Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Reviewed-by: NFlorian Westphal <fw@strlen.de> Reviewed-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Guanglei Li 提交于
Scenario: 1. Port down and do fail over 2. Ap do rds_bind syscall PID: 47039 TASK: ffff89887e2fe640 CPU: 47 COMMAND: "kworker/u:6" #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9 #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3 #2 [ffff898e35f15b30] oops_end at ffffffff8150f518 #3 [ffff898e35f15b60] no_context at ffffffff8104854c #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675 #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3 #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8 #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95 [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: ffff898e35f15dc8 RFLAGS: 00010282 RAX: 00000000fffffffe RBX: ffff889b77f6fc00 RCX:ffffffff81c99d88 RDX: 0000000000000000 RSI: ffff896019ee08e8 RDI:ffff889b77f6fc00 RBP: ffff898e35f15df0 R8: ffff896019ee08c8 R9:0000000000000000 R10: 0000000000000400 R11: 0000000000000000 R12:ffff896019ee08c0 R13: ffff889b77f6fe68 R14: ffffffff81c99d80 R15: ffffffffa022a1e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm] #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6 #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0 #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6 PID: 45659 TASK: ffff880d313d2500 CPU: 31 COMMAND: "oracle_45659_ap" #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4 #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7 #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm] #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma] #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds] #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds] #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670 PID: 45659 PID: 47039 rds_ib_laddr_check /* create id_priv with a null event_handler */ rdma_create_id rdma_bind_addr cma_acquire_dev /* add id_priv to cma_dev->id_list */ cma_attach_to_dev cma_ndev_work_handler /* event_hanlder is null */ id_priv->id.event_handler Signed-off-by: NGuanglei Li <guanglei.li@oracle.com> Signed-off-by: NHonglei Wang <honglei.wang@oracle.com> Reviewed-by: NJunxiao Bi <junxiao.bi@oracle.com> Reviewed-by: NYanjun Zhu <yanjun.zhu@oracle.com> Reviewed-by: NLeon Romanovsky <leonro@mellanox.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: NDoug Ledford <dledford@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 William Tu 提交于
When an erspan tunnel device receives an erpsan packet with different tunnel metadata (ex: version, index, hwid, direction), existing code overwrites the tunnel device's erspan configuration with the received packet's metadata. The patch fixes it. Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel") Fixes: f551c91d ("net: erspan: introduce erspan v2 for ip_gre") Fixes: ef7baf5e ("ip6_gre: add ip6 erspan collect_md mode") Fixes: 94d7d8f2 ("ip6_gre: add erspan v2 support") Signed-off-by: NWilliam Tu <u9012063@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 William Tu 提交于
Commit d350a823 ("net: erspan: create erspan metadata uapi header") moves the erspan 'version' in front of the 'struct erspan_md2' for later extensibility reason. This breaks the existing erspan metadata extraction code because the erspan_md2 then has a 4-byte offset to between the erspan_metadata and erspan_base_hdr. This patch fixes it. Fixes: 1a66a836 ("gre: add collect_md mode to ERSPAN tunnel") Fixes: ef7baf5e ("ip6_gre: add ip6 erspan collect_md mode") Fixes: 1d7e2ed2 ("net: erspan: refactor existing erspan code") Signed-off-by: NWilliam Tu <u9012063@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Li Shuang reported an Oops with cls_u32 due to an use-after-free in u32_destroy_key(). The use-after-free can be triggered with: dev=lo tc qdisc add dev $dev root handle 1: htb default 10 tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256 tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\ 10.0.0.0/8 hashkey mask 0x0000ff00 at 16 link 1: tc qdisc del dev $dev root Which causes the following kasan splat: ================================================================== BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 [cls_u32] Read of size 4 at addr ffff881b83dae618 by task kworker/u48:5/571 CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016 Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32] Call Trace: dump_stack+0xd6/0x182 ? dma_virt_map_sg+0x22e/0x22e print_address_description+0x73/0x290 kasan_report+0x277/0x360 ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32] u32_destroy_key.constprop.21+0x117/0x140 [cls_u32] u32_delete_key_freepf_work+0x1c/0x30 [cls_u32] process_one_work+0xae0/0x1c80 ? sched_clock+0x5/0x10 ? pwq_dec_nr_in_flight+0x3c0/0x3c0 ? _raw_spin_unlock_irq+0x29/0x40 ? trace_hardirqs_on_caller+0x381/0x570 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1e5/0x760 ? finish_task_switch+0x208/0x760 ? preempt_notifier_dec+0x20/0x20 ? __schedule+0x839/0x1ee0 ? check_noncircular+0x20/0x20 ? firmware_map_remove+0x73/0x73 ? find_held_lock+0x39/0x1c0 ? worker_thread+0x434/0x1820 ? lock_contended+0xee0/0xee0 ? lock_release+0x1100/0x1100 ? init_rescuer.part.16+0x150/0x150 ? retint_kernel+0x10/0x10 worker_thread+0x216/0x1820 ? process_one_work+0x1c80/0x1c80 ? lock_acquire+0x1a5/0x540 ? lock_downgrade+0x6b0/0x6b0 ? sched_clock+0x5/0x10 ? lock_release+0x1100/0x1100 ? compat_start_thread+0x80/0x80 ? do_raw_spin_trylock+0x190/0x190 ? _raw_spin_unlock_irq+0x29/0x40 ? trace_hardirqs_on_caller+0x381/0x570 ? _raw_spin_unlock_irq+0x29/0x40 ? finish_task_switch+0x1e5/0x760 ? finish_task_switch+0x208/0x760 ? preempt_notifier_dec+0x20/0x20 ? __schedule+0x839/0x1ee0 ? kmem_cache_alloc_trace+0x143/0x320 ? firmware_map_remove+0x73/0x73 ? sched_clock+0x5/0x10 ? sched_clock_cpu+0x18/0x170 ? find_held_lock+0x39/0x1c0 ? schedule+0xf3/0x3b0 ? lock_downgrade+0x6b0/0x6b0 ? __schedule+0x1ee0/0x1ee0 ? do_wait_intr_irq+0x340/0x340 ? do_raw_spin_trylock+0x190/0x190 ? _raw_spin_unlock_irqrestore+0x32/0x60 ? process_one_work+0x1c80/0x1c80 ? process_one_work+0x1c80/0x1c80 kthread+0x312/0x3d0 ? kthread_create_worker_on_cpu+0xc0/0xc0 ret_from_fork+0x3a/0x50 Allocated by task 1688: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x162/0x380 u32_change+0x1220/0x3c9e [cls_u32] tc_ctl_tfilter+0x1ba6/0x2f80 rtnetlink_rcv_msg+0x4f0/0x9d0 netlink_rcv_skb+0x124/0x320 netlink_unicast+0x430/0x600 netlink_sendmsg+0x8fa/0xd60 sock_sendmsg+0xb1/0xe0 ___sys_sendmsg+0x678/0x980 __sys_sendmsg+0xc4/0x210 do_syscall_64+0x232/0x7f0 return_from_SYSCALL_64+0x0/0x75 Freed by task 112: kasan_slab_free+0x71/0xc0 kfree+0x114/0x320 rcu_process_callbacks+0xc3f/0x1600 __do_softirq+0x2bf/0xc06 The buggy address belongs to the object at ffff881b83dae600 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 24 bytes inside of 4096-byte region [ffff881b83dae600, ffff881b83daf600) The buggy address belongs to the page: page:ffffea006e0f6a00 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x17ffffc0008100(slab|head) raw: 0017ffffc0008100 0000000000000000 0000000000000000 0000000100070007 raw: dead000000000100 dead000000000200 ffff880187c0e600 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The problem is that the htnode is freed before the linked knodes and the latter will try to access the first at u32_destroy_key() time. This change addresses the issue using the htnode refcnt to guarantee the correct free order. While at it also add a RCU annotation, to keep sparse happy. v1 -> v2: use rtnl_derefence() instead of RCU read locks v2 -> v3: - don't check refcnt in u32_destroy_hnode() - cleaned-up u32_destroy() implementation - cleaned-up code comment v3 -> v4: - dropped unneeded comment Reported-by: NLi Shuang <shuali@redhat.com> Fixes: c0d378ef ("net_sched: use tcf_queue_work() in u32 filter") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 2月, 2018 2 次提交
-
-
由 Tommi Rantala 提交于
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit 410f0383 ("sctp: add routing output fallback"): When walking the address_list, successive ip_route_output_key() calls may return the same rt->dst with the reference incremented on each call. The code would not decrement the dst refcount when the dst pointer was identical from the previous iteration, causing the dst refcnt leak. Testcase: ip netns add TEST ip netns exec TEST ip link set lo up ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add dummy2 type dummy ip link set dev dummy0 netns TEST ip link set dev dummy1 netns TEST ip link set dev dummy2 netns TEST ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0 ip netns exec TEST ip link set dummy0 up ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1 ip netns exec TEST ip link set dummy1 up ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2 ip netns exec TEST ip link set dummy2 up ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3 ip netns del TEST In 4.4 and 4.9 kernels this results to: [ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1 ... Fixes: 410f0383 ("sctp: add routing output fallback") Fixes: 0ca50d12 ("sctp: fix src address selection if using secondary addresses") Signed-off-by: NTommi Rantala <tommi.t.rantala@nokia.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexey Kodanev 提交于
When going through the bind address list in sctp_v6_get_dst() and the previously found address is better ('matchlen > bmatchlen'), the code continues to the next iteration without releasing currently held destination. Fix it by releasing 'bdst' before continue to the next iteration, and instead of introducing one more '!IS_ERR(bdst)' check for dst_release(), move the already existed one right after ip6_dst_lookup_flow(), i.e. we shouldn't proceed further if we get an error for the route lookup. Fixes: dbc2b5e9 ("sctp: fix src address selection if using secondary addresses for ipv6") Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 2月, 2018 6 次提交
-
-
由 Roman Gushchin 提交于
This patch effectively reverts commit 9f1c2674 ("net: memcontrol: defer call to mem_cgroup_sk_alloc()"). Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks memcg socket memory accounting, as packets received before memcg pointer initialization are not accounted and are causing refcounting underflow on socket release. Actually the free-after-use problem was fixed by commit c0576e39 ("net: call cgroup_sk_alloc() earlier in sk_clone_lock()") for the cgroup pointer. So, let's revert it and call mem_cgroup_sk_alloc() just before cgroup_sk_alloc(). This is safe, as we hold a reference to the socket we're cloning, and it holds a reference to the memcg. Also, let's drop BUG_ON(mem_cgroup_is_root()) check from mem_cgroup_sk_alloc(). I see no reasons why bumping the root memcg counter is a good reason to panic, and there are no realistic ways to hit it. Signed-off-by: NRoman Gushchin <guro@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
reuseport_add_sock() needs to deal with attaching a socket having its own sk_reuseport_cb, after a prior setsockopt(SO_ATTACH_REUSEPORT_?BPF) Without this fix, not only a WARN_ONCE() was issued, but we were also leaking memory. Thanks to sysbot and Eric Biggers for providing us nice C repros. ------------[ cut here ]------------ socket already in reuseport group WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119 reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 3496 Comm: syzkaller869503 Not tainted 4.15.0-rc6+ #245 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:1079 Fixes: ef456144 ("soreuseport: define reuseport groups") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: syzbot+c0ea2226f77a42936bf7@syzkaller.appspotmail.com Acked-by: NCraig Gallek <kraig@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
In a couple of points of the control path, n->ht_down is currently accessed without the required RCU annotation. The accesses are safe, but sparse complaints. Since we already held the rtnl lock, let use rtnl_dereference(). Fixes: a1b7c5fd ("net: sched: add cls_u32 offload hooks for netdevs") Fixes: de5df632 ("net: sched: cls_u32 changes to knode must appear atomic to readers") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pablo Neira Ayuso 提交于
nft_flow_offload module removal does not require to flush existing flowtables, it is valid to remove this module while keeping flowtables around. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Pablo Neira Ayuso 提交于
If netdevice goes down, then flowtable entries are scheduled to be removed. Wait for garbage collector to have a chance to run so it can delete them from the hashtable. The flush call might sleep, so hold the nfnl mutex from nft_flow_table_iterate() instead of rcu read side lock. The use of the nfnl mutex is also implicitly fixing races between updates via nfnetlink and netdevice event. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Cong Wang 提交于
xt_cgroup_info_v1->priv is an internal pointer only used for kernel, we should not trust what user-space provides. Reported-by: <syzbot+4fbcfcc0d2e6592bd641@syzkaller.appspotmail.com> Fixes: c38c4597 ("netfilter: implement xt_cgroup cgroup2 path match") Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 02 2月, 2018 4 次提交
-
-
由 Pablo Neira Ayuso 提交于
config NF_FLOW_TABLE depends on NETFILTER_INGRESS. If users forget to enable this toggle, flowtable registration fails with EOPNOTSUPP. Moreover, turn 'select NF_FLOW_TABLE' in every flowtable family flavour into dependency instead, otherwise this new dependency on NETFILTER_INGRESS causes a warning. This also allows us to remove the explicit dependency between family flowtables <-> NF_TABLES and NF_CONNTRACK, given they depend on the NF_FLOW_TABLE core that already expresses the general dependencies for this new infrastructure. Moreover, NF_FLOW_TABLE_INET depends on NF_FLOW_TABLE_IPV4 and NF_FLOWTABLE_IPV6, which already depends on NF_FLOW_TABLE. So we can get rid of direct dependency with NF_FLOW_TABLE. In general, let's avoid 'select', it just makes things more complicated. Reported-by: NJohn Crispin <john@phrozen.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
Failures were seen in ICMPv6 fragmentation timeout tests if they were run after the RFC2460 failure tests. Kernel was not sending out the ICMPv6 fragment reassembly time exceeded packet after the fragmentation reassembly timeout of 1 minute had elapsed. This happened because the frag queue was not released if an error in IPv6 fragmentation header was detected by RFC2460. Fixes: 83f1999c ("netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460") Signed-off-by: NSubash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Michal Hocko 提交于
syzbot has noticed that xt_alloc_table_info can allocate a lot of memory. This is an admin only interface but an admin in a namespace is sufficient as well. eacd86ca ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_table_info()") has changed the opencoded kmalloc->vmalloc fallback into kvmalloc. It has dropped __GFP_NORETRY on the way because vmalloc has simply never fully supported __GFP_NORETRY semantic. This is still the case because e.g. page tables backing the vmalloc area are hardcoded GFP_KERNEL. Revert back to __GFP_NORETRY as a poors man defence against excessively large allocation request here. We will not rule out the OOM killer completely but __GFP_NORETRY should at least stop the large request in most cases. [akpm@linux-foundation.org: coding-style fixes] Fixes: eacd86ca ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_tableLink: http://lkml.kernel.org/r/20180130140104.GE21609@dhcp22.suse.czSigned-off-by: NMichal Hocko <mhocko@suse.com> Acked-by: NFlorian Westphal <fw@strlen.de> Reviewed-by: NAndrew Morton <akpm@linux-foundation.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Eric Dumazet 提交于
Newly added igmpv3_get_srcaddr() needs to be called under rcu lock. Timer callbacks do not ensure this locking. ============================= WARNING: suspicious RCU usage 4.15.0+ #200 Not tainted ----------------------------- ./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syzkaller616973/4074: #0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355 #1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline] #1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316 #2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline] #2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600 stack backtrace: CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592 __in_dev_get_rcu include/linux/inetdevice.h:216 [inline] igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline] igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389 add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432 add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565 igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605 igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722 igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831 call_timer_fn+0x228/0x820 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:541 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938 Fixes: a46182b0 ("net: igmp: Use correct source address on IGMPv3 reports") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 01 2月, 2018 7 次提交
-
-
由 Daniel Borkmann 提交于
syzkaller was able to generate the following XDP program ... (18) r0 = 0x0 (61) r5 = *(u32 *)(r1 +12) (04) (u32) r0 += (u32) 0 (95) exit ... and trigger a NULL pointer dereference in ___bpf_prog_run() via bpf_prog_test_run_xdp() where this was attempted to run. Reason is that recent xdp_rxq_info addition to XDP programs updated all drivers, but not bpf_prog_test_run_xdp(), where xdp_buff is set up. Thus when context rewriter does the deref on the netdev it's NULL at runtime. Fix it by using xdp_rxq from loopback dev. __netif_get_rx_queue() helper can also be reused in various other locations later on. Fixes: 02dd3291 ("bpf: finally expose xdp_rxq_info to XDP bpf-programs") Reported-by: syzbot+1eb094057b338eb1fc00@syzkaller.appspotmail.com Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: NJesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
-
由 Christian Brauner 提交于
RTM_NEWLINK supports the IFLA_IF_NETNSID property since 5bb8ed07 so we should not error out when it is passed. Signed-off-by: NChristian Brauner <christian.brauner@ubuntu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Geert Uytterhoeven 提交于
With gcc-4.1.2: net/ipv4/inet_hashtables.c: In function ‘inet_unhash’: net/ipv4/inet_hashtables.c:628: warning: ‘ilb’ may be used uninitialized in this function While this is a false positive, it can easily be avoided by using the pointer itself as the canary variable. Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Geert Uytterhoeven 提交于
With gcc-4.1.2.: net/bridge/br_fdb.c: In function ‘br_fdb_sync_static’: net/bridge/br_fdb.c:996: warning: ‘err’ may be used uninitialized in this function Indeed, if the list is empty, err will be uninitialized, and will be propagated up as the function return value. Fix this by preinitializing err to zero. Fixes: eb793583 ("net: bridge: use rhashtable for fdbs") Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org> Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ed Swierk 提交于
IPv4 and IPv6 packets may arrive with lower-layer padding that is not included in the L3 length. For example, a short IPv4 packet may have up to 6 bytes of padding following the IP payload when received on an Ethernet device with a minimum packet length of 64 bytes. Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(), and help() in nf_conntrack_ftp) assume skb->len reflects the length of the L3 header and payload, rather than referring back to ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by lower-layer padding. In the normal IPv4 receive path, ip_rcv() trims the packet to ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly in the br_netfilter receive path, br_validate_ipv4() and br_validate_ipv6() trim the packet to the L3 length before invoking netfilter hooks. Currently in the OVS conntrack receive path, ovs_ct_execute() pulls the skb to the L3 header but does not trim it to the L3 length before calling nf_conntrack_in(NF_INET_PRE_ROUTING). When nf_conntrack_proto_tcp encounters a packet with lower-layer padding, nf_ip_checksum() fails causing a "nf_ct_tcp: bad TCP checksum" log message. While extra zero bytes don't affect the checksum, the length in the IP pseudoheader does. That length is based on skb->len, and without trimming, it doesn't match the length the sender used when computing the checksum. In ovs_ct_execute(), trim the skb to the L3 length before higher-layer processing. Signed-off-by: NEd Swierk <eswierk@skyportsystems.com> Acked-by: NPravin B Shelar <pshelar@ovn.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Neal Cardwell 提交于
This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when using lt_bw and returning from the PROBE_RTT state to PROBE_BW. Previously, when using lt_bw, upon exiting PROBE_RTT and entering PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase() set a pacing gain above 1.0. In such cases this would result in a pacing rate that is 1.25x higher than intended, potentially resulting in a high loss rate for a little while until we stop using the lt_bw a bit later. This commit is a stable candidate for kernels back as far as 4.9. Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control") Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com> Reported-by: NBeyers Cronje <bcronje@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Axtens 提交于
If you take a GSO skb, and split it into packets, will the MAC length (L2 + L3 + L4 headers + payload) of those packets be small enough to fit within a given length? Move skb_gso_mac_seglen() to skbuff.h with other related functions like skb_gso_network_seglen() so we can use it, and then create skb_gso_validate_mac_len to do the full calculation. Signed-off-by: NDaniel Axtens <dja@axtens.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 31 1月, 2018 5 次提交
-
-
由 Paolo Abeni 提交于
Syzbot reported several deadlocks in the netfilter area caused by rtnl lock and socket lock being acquired with a different order on different code paths, leading to backtraces like the following one: ====================================================== WARNING: possible circular locking dependency detected 4.15.0-rc9+ #212 Not tainted ------------------------------------------------------ syzkaller041579/3682 is trying to acquire lock: (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock include/net/sock.h:1463 [inline] (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167 but task is already holding lock: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (rtnl_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:756 [inline] __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607 tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106 xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845 check_target net/ipv6/netfilter/ip6_tables.c:538 [inline] find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:580 translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749 do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline] do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691 nf_sockopt net/netfilter/nf_sockopt.c:106 [inline] nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115 ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928 udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 SYSC_setsockopt net/socket.c:1849 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1828 entry_SYSCALL_64_fastpath+0x29/0xa0 -> #0 (sk_lock-AF_INET6){+.+.}: lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914 lock_sock_nested+0xc2/0x110 net/core/sock.c:2780 lock_sock include/net/sock.h:1463 [inline] do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167 ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922 udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 SYSC_setsockopt net/socket.c:1849 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1828 entry_SYSCALL_64_fastpath+0x29/0xa0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(rtnl_mutex); lock(sk_lock-AF_INET6); lock(rtnl_mutex); lock(sk_lock-AF_INET6); *** DEADLOCK *** 1 lock held by syzkaller041579/3682: #0: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74 The problem, as Florian noted, is that nf_setsockopt() is always called with the socket held, even if the lock itself is required only for very tight scopes and only for some operation. This patch addresses the issues moving the lock_sock() call only where really needed, namely in ipv*_getorigdst(), so that nf_setsockopt() does not need anymore to acquire both locks. Fixes: 22265a5c ("netfilter: xt_TEE: resolve oif using netdevice notifiers") Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com Suggested-by: NFlorian Westphal <fw@strlen.de> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
由 Vakul Garg 提交于
Async crypto accelerators (e.g. drivers/crypto/caam) support offloading GCM operation. If they are enabled, crypto_aead_encrypt() return error code -EINPROGRESS. In this case tls_do_encryption() needs to wait on a completion till the time the response for crypto offload request is received. Signed-off-by: NVakul Garg <vakul.garg@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
When we dump the ip6mr mfc entries via proc, we initialize an iterator with the table to dump but we don't clear the cache pointer which might be initialized from a prior read on the same descriptor that ended. This can result in lock imbalance (an unnecessary unlock) leading to other crashes and hangs. Clear the cache pointer like ipmr does to fix the issue. Thanks for the reliable reproducer. Here's syzbot's trace: WARNING: bad unlock balance detected! 4.15.0-rc3+ #128 Not tainted syzkaller971460/3195 is trying to release lock (mrt_lock) at: [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553 but there are no more locks to release! other info that might help us debug this: 1 lock held by syzkaller971460/3195: #0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0 fs/seq_file.c:165 stack backtrace: CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561 __lock_release kernel/locking/lockdep.c:3775 [inline] lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023 __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline] _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255 ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553 traverse+0x3bc/0xa00 fs/seq_file.c:135 seq_read+0x96a/0x13d0 fs/seq_file.c:189 proc_reg_read+0xef/0x170 fs/proc/inode.c:217 do_loop_readv_writev fs/read_write.c:673 [inline] do_iter_read+0x3db/0x5b0 fs/read_write.c:897 compat_readv+0x1bf/0x270 fs/read_write.c:1140 do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189 C_SYSC_preadv fs/read_write.c:1209 [inline] compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125 RIP: 0023:0xf7f73c79 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 BUG: sleeping function called from invalid context at lib/usercopy.c:25 in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460 INFO: lockdep is turned off. CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060 __might_sleep+0x95/0x190 kernel/sched/core.c:6013 __might_fault+0xab/0x1d0 mm/memory.c:4525 _copy_to_user+0x2c/0xc0 lib/usercopy.c:25 copy_to_user include/linux/uaccess.h:155 [inline] seq_read+0xcb4/0x13d0 fs/seq_file.c:279 proc_reg_read+0xef/0x170 fs/proc/inode.c:217 do_loop_readv_writev fs/read_write.c:673 [inline] do_iter_read+0x3db/0x5b0 fs/read_write.c:897 compat_readv+0x1bf/0x270 fs/read_write.c:1140 do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189 C_SYSC_preadv fs/read_write.c:1209 [inline] compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203 do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline] do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389 entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125 RIP: 0023:0xf7f73c79 RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0 RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0 lib/usercopy.c:26 Reported-by: Nsyzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com> Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ulf Magnusson 提交于
Blank help texts are probably either a typo, a Kconfig misunderstanding, or some kind of half-committing to adding a help text (in which case a TODO comment would be clearer, if the help text really can't be added right away). Best to remove them, IMO. Signed-off-by: NUlf Magnusson <ulfalizer@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
Add suffix LL to constant 1000 in order to give the compiler complete information about the proper arithmetic to use. Notice that this constant is used in a context that expects an expression of type long long int (64 bits, signed). The expression (band->burst_size + band->rate) * 1000 is currently being evaluated using 32-bit arithmetic. Addresses-Coverity-ID: 1461563 ("Unintentional integer overflow") Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-