- 19 9月, 2019 9 次提交
-
-
由 Neal Cardwell 提交于
[ Upstream commit af38d07ed391b21f7405fa1f936ca9686787d6d2 ] Fix tcp_ecn_withdraw_cwr() to clear the correct bit: TCP_ECN_QUEUE_CWR. Rationale: basically, TCP_ECN_DEMAND_CWR is a bit that is purely about the behavior of data receivers, and deciding whether to reflect incoming IP ECN CE marks as outgoing TCP th->ece marks. The TCP_ECN_QUEUE_CWR bit is purely about the behavior of data senders, and deciding whether to send CWR. The tcp_ecn_withdraw_cwr() function is only called from tcp_undo_cwnd_reduction() by data senders during an undo, so it should zero the sender-side state, TCP_ECN_QUEUE_CWR. It does not make sense to stop the reflection of incoming CE bits on incoming data packets just because outgoing packets were spuriously retransmitted. The bug has been reproduced with packetdrill to manifest in a scenario with RFC3168 ECN, with an incoming data packet with CE bit set and carrying a TCP timestamp value that causes cwnd undo. Before this fix, the IP CE bit was ignored and not reflected in the TCP ECE header bit, and sender sent a TCP CWR ('W') bit on the next outgoing data packet, even though the cwnd reduction had been undone. After this fix, the sender properly reflects the CE bit and does not set the W bit. Note: the bug actually predates 2005 git history; this Fixes footer is chosen to be the oldest SHA1 I have tested (from Sep 2007) for which the patch applies cleanly (since before this commit the code was in a .h file). Fixes: bdf1ee5d ("[TCP]: Move code from tcp_ecn.h to tcp*.c and tcp.h & remove it") Signed-off-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NYuchung Cheng <ycheng@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Xin Long 提交于
[ Upstream commit 10eb56c582c557c629271f1ee31e15e7a9b2558b ] Transport should use its own pf_retrans to do the error_count check, instead of asoc's. Otherwise, it's meaningless to make pf_retrans per transport. Fixes: 5aa93bcf ("sctp: Implement quick failover draft from tsvwg") Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Christophe JAILLET 提交于
[ Upstream commit b456d72412ca8797234449c25815e82f4e1426c0 ] The '.exit' functions from 'pernet_operations' structure should be marked as __net_exit, not __net_init. Fixes: 8e2d61e0 ("sctp: fix race on protocol/netns initialization") Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Cong Wang 提交于
[ Upstream commit d4d6ec6dac07f263f06d847d6f732d6855522845 ] In case of TCA_HHF_NON_HH_WEIGHT or TCA_HHF_QUANTUM is zero, it would make no progress inside the loop in hhf_dequeue() thus kernel would get stuck. Fix this by checking this corner case in hhf_change(). Fixes: 10239edf ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc") Reported-by: syzbot+bc6297c11f19ee807dc2@syzkaller.appspotmail.com Reported-by: syzbot+041483004a7f45f1f20a@syzkaller.appspotmail.com Reported-by: syzbot+55be5f513bed37fc4367@syzkaller.appspotmail.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Terry Lam <vtlam@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Dumazet 提交于
[ Upstream commit b88dd52c62bb5c5d58f0963287f41fd084352c57 ] Whenever MQ is not used on a multiqueue device, we experience serious reordering problems. Bisection found the cited commit. The issue can be described this way : - A single qdisc hierarchy is shared by all transmit queues. (eg : tc qdisc replace dev eth0 root fq_codel) - When/if try_bulk_dequeue_skb_slow() dequeues a packet targetting a different transmit queue than the one used to build a packet train, we stop building the current list and save the 'bad' skb (P1) in a special queue. (bad_txq) - When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this skb (P1), it checks if the associated transmit queues is still in frozen state. If the queue is still blocked (by BQL or NIC tx ring full), we leave the skb in bad_txq and return NULL. - dequeue_skb() calls q->dequeue() to get another packet (P2) The other packet can target the problematic queue (that we found in frozen state for the bad_txq packet), but another cpu just ran TX completion and made room in the txq that is now ready to accept new packets. - Packet P2 is sent while P1 is still held in bad_txq, P1 might be sent at next round. In practice P2 is the lead of a big packet train (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/ To solve this problem, we have to block the dequeue process as long as the first packet in bad_txq can not be sent. Reordering issues disappear and no side effects have been seen. Fixes: a53851e2 ("net: sched: explicit locking in gso_cpu fallback") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: NJohn Fastabend <john.fastabend@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Shmulik Ladkani 提交于
[ Upstream commit 3dcbdb134f329842a38f0e6797191b885ab00a00 ] Historically, support for frag_list packets entering skb_segment() was limited to frag_list members terminating on exact same gso_size boundaries. This is verified with a BUG_ON since commit 89319d38 ("net: Add frag_list support to skb_segment"), quote: As such we require all frag_list members terminate on exact MSS boundaries. This is checked using BUG_ON. As there should only be one producer in the kernel of such packets, namely GRO, this requirement should not be difficult to maintain. However, since commit 6578171a ("bpf: add bpf_skb_change_proto helper"), the "exact MSS boundaries" assumption no longer holds: An eBPF program using bpf_skb_change_proto() DOES modify 'gso_size', but leaves the frag_list members as originally merged by GRO with the original 'gso_size'. Example of such programs are bpf-based NAT46 or NAT64. This lead to a kernel BUG_ON for flows involving: - GRO generating a frag_list skb - bpf program performing bpf_skb_change_proto() or bpf_skb_adjust_room() - skb_segment() of the skb See example BUG_ON reports in [0]. In commit 13acc94e ("net: permit skb_segment on head_frag frag_list skb"), skb_segment() was modified to support the "gso_size mangling" case of a frag_list GRO'ed skb, but *only* for frag_list members having head_frag==true (having a page-fragment head). Alas, GRO packets having frag_list members with a linear kmalloced head (head_frag==false) still hit the BUG_ON. This commit adds support to skb_segment() for a 'head_skb' packet having a frag_list whose members are *non* head_frag, with gso_size mangled, by disabling SG and thus falling-back to copying the data from the given 'head_skb' into the generated segmented skbs - as suggested by Willem de Bruijn [1]. Since this approach involves the penalty of skb_copy_and_csum_bits() when building the segments, care was taken in order to enable this solution only when required: - untrusted gso_size, by testing SKB_GSO_DODGY is set (SKB_GSO_DODGY is set by any gso_size mangling functions in net/core/filter.c) - the frag_list is non empty, its item is a non head_frag, *and* the headlen of the given 'head_skb' does not match the gso_size. [0] https://lore.kernel.org/netdev/20190826170724.25ff616f@pixies/ https://lore.kernel.org/netdev/9265b93f-253d-6b8c-f2b8-4b54eff1835c@fb.com/ [1] https://lore.kernel.org/netdev/CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com/ Fixes: 6578171a ("bpf: add bpf_skb_change_proto helper") Suggested-by: NWillem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Reviewed-by: NAlexander Duyck <alexander.h.duyck@linux.intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
[ Upstream commit 10cc514f451a0f239aa34f91bc9dc954a9397840 ] In event of failure during register_netdevice, free_netdev is invoked immediately. free_netdev assumes that all the netdevice refcounts have been dropped prior to it being called and as a result frees and clears out the refcount pointer. However, this is not necessarily true as some of the operations in the NETDEV_UNREGISTER notifier handlers queue RCU callbacks for invocation after a grace period. The IPv4 callback in_dev_rcu_put tries to access the refcount after free_netdev is called which leads to a null de-reference- 44837.761523: <6> Unable to handle kernel paging request at virtual address 0000004a88287000 44837.761651: <2> pc : in_dev_finish_destroy+0x4c/0xc8 44837.761654: <2> lr : in_dev_finish_destroy+0x2c/0xc8 44837.762393: <2> Call trace: 44837.762398: <2> in_dev_finish_destroy+0x4c/0xc8 44837.762404: <2> in_dev_rcu_put+0x24/0x30 44837.762412: <2> rcu_nocb_kthread+0x43c/0x468 44837.762418: <2> kthread+0x118/0x128 44837.762424: <2> ret_from_fork+0x10/0x1c Fix this by waiting for the completion of the call_rcu() in case of register_netdevice errors. Fixes: 93ee31f1 ("[NET]: Fix free_netdev on register_netdev failure.") Cc: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: NSubash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Christophe JAILLET 提交于
[ Upstream commit d23dbc479a8e813db4161a695d67da0e36557846 ] The '.exit' functions from 'pernet_operations' structure should be marked as __net_exit, not __net_init. Fixes: d862e546 ("net: ipv6: Implement /proc/net/icmp6.") Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Nicolas Dichtel 提交于
[ Upstream commit 94a72b3f024fc7e9ab640897a1e38583a470659d ] NLM_F_MULTI must be used only when a NLMSG_DONE message is sent at the end. In fact, NLMSG_DONE is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: 949f1e39 ("bridge: mdb: notify on router port add and del") CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 16 9月, 2019 4 次提交
-
-
由 Manikanta Pubbisetty 提交于
[ Upstream commit e6f4051123fd33901e9655a675b22aefcdc5d277 ] Commit 33d915d9e8ce ("{nl,mac}80211: allow 4addr AP operation on crypto controlled devices") has introduced a change which allows 4addr operation on crypto controlled devices (ex: ath10k). This change has inadvertently impacted the interface combinations logic on such devices. General rule is that software interfaces like AP/VLAN should not be listed under supported interface combinations and should not be considered during validation of these combinations; because of the aforementioned change, AP/VLAN interfaces(if present) will be checked against interfaces supported by the device and blocks valid interface combinations. Consider a case where an AP and AP/VLAN are up and running; when a second AP device is brought up on the same physical device, this AP will be checked against the AP/VLAN interface (which will not be part of supported interface combinations of the device) and blocks second AP to come up. Add a new API cfg80211_iftype_allowed() to fix the problem, this API works for all devices with/without SW crypto control. Signed-off-by: NManikanta Pubbisetty <mpubbise@codeaurora.org> Fixes: 33d915d9e8ce ("{nl,mac}80211: allow 4addr AP operation on crypto controlled devices") Link: https://lore.kernel.org/r/1563779690-9716-1-git-send-email-mpubbise@codeaurora.orgSigned-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Dexuan Cui 提交于
[ Upstream commit 685703b497bacea8765bb409d6b73455b73c540e ] There is a race condition for an established connection that is being closed by the guest: the refcnt is 4 at the end of hvs_release() (Note: here the 'remove_sock' is false): 1 for the initial value; 1 for the sk being in the bound list; 1 for the sk being in the connected list; 1 for the delayed close_work. After hvs_release() finishes, __vsock_release() -> sock_put(sk) *may* decrease the refcnt to 3. Concurrently, hvs_close_connection() runs in another thread: calls vsock_remove_sock() to decrease the refcnt by 2; call sock_put() to decrease the refcnt to 0, and free the sk; next, the "release_sock(sk)" may hang due to use-after-free. In the above, after hvs_release() finishes, if hvs_close_connection() runs faster than "__vsock_release() -> sock_put(sk)", then there is not any issue, because at the beginning of hvs_close_connection(), the refcnt is still 4. The issue can be resolved if an extra reference is taken when the connection is established. Fixes: a9eeb998c28d ("hv_sock: Add support for delayed close") Signed-off-by: NDexuan Cui <decui@microsoft.com> Reviewed-by: NSunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Sven Eckelmann 提交于
commit a15d56a60760aa9dbe26343b9a0ac5228f35d445 upstream. Multiple batadv_ogm_packet can be stored in an skbuff. The functions batadv_iv_ogm_send_to_if()/batadv_iv_ogm_receive() use batadv_iv_ogm_aggr_packet() to check if there is another additional batadv_ogm_packet in the skb or not before they continue processing the packet. The length for such an OGM is BATADV_OGM_HLEN + batadv_ogm_packet->tvlv_len. The check must first check that at least BATADV_OGM_HLEN bytes are available before it accesses tvlv_len (which is part of the header. Otherwise it might try read outside of the currently available skbuff to get the content of tvlv_len. Fixes: ef261577 ("batman-adv: tvlv - basic infrastructure") Reported-by: syzbot+355cab184197dbbfa384@syzkaller.appspotmail.com Signed-off-by: NSven Eckelmann <sven@narfation.org> Acked-by: NAntonio Quartulli <a@unstable.cc> Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Dumazet 提交于
commit 3ee1bb7aae97324ec9078da1f00cb2176919563f upstream. batadv_netlink_get_ifindex() needs to make sure user passed a correct u32 attribute. syzbot reported : BUG: KMSAN: uninit-value in batadv_netlink_dump_hardif+0x70d/0x880 net/batman-adv/netlink.c:968 CPU: 1 PID: 11705 Comm: syz-executor888 Not tainted 5.1.0+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x191/0x1f0 lib/dump_stack.c:113 kmsan_report+0x130/0x2a0 mm/kmsan/kmsan.c:622 __msan_warning+0x75/0xe0 mm/kmsan/kmsan_instr.c:310 batadv_netlink_dump_hardif+0x70d/0x880 net/batman-adv/netlink.c:968 genl_lock_dumpit+0xc6/0x130 net/netlink/genetlink.c:482 netlink_dump+0xa84/0x1ab0 net/netlink/af_netlink.c:2253 __netlink_dump_start+0xa3a/0xb30 net/netlink/af_netlink.c:2361 genl_family_rcv_msg net/netlink/genetlink.c:550 [inline] genl_rcv_msg+0xfc1/0x1a40 net/netlink/genetlink.c:627 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2486 genl_rcv+0x63/0x80 net/netlink/genetlink.c:638 netlink_unicast_kernel net/netlink/af_netlink.c:1311 [inline] netlink_unicast+0xf3e/0x1020 net/netlink/af_netlink.c:1337 netlink_sendmsg+0x127e/0x12f0 net/netlink/af_netlink.c:1926 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:661 [inline] ___sys_sendmsg+0xcc6/0x1200 net/socket.c:2260 __sys_sendmsg net/socket.c:2298 [inline] __do_sys_sendmsg net/socket.c:2307 [inline] __se_sys_sendmsg+0x305/0x460 net/socket.c:2305 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2305 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 RIP: 0033:0x440209 Fixes: b60620cf ("batman-adv: netlink: hardif query") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NSven Eckelmann <sven@narfation.org> Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 10 9月, 2019 9 次提交
-
-
由 Pablo Neira Ayuso 提交于
[ Upstream commit dfe42be15fde16232340b8b2a57c359f51cc10d9 ] TCP rst and fin packets do not qualify to place a flow into the flowtable. Most likely there will be no more packets after connection closure. Without this patch, this flow entry expires and connection tracking picks up the entry in ESTABLISHED state using the fixup timeout, which makes this look inconsistent to the user for a connection that is actually already closed. Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Pablo Neira Ayuso 提交于
[ Upstream commit 6a0a8d10a3661a036b55af695542a714c429ab7c ] If a rule that has already a bound anonymous set fails to be added, the preparation phase releases the rule and the bound set. However, the transaction object from the abort path still has a reference to the set object that is stale, leading to a use-after-free when checking for the set->bound field. Add a new field to the transaction that specifies if the set is bound, so the abort path can skip releasing it since the rule command owns it and it takes care of releasing it. After this update, the set->bound field is removed. [ 24.649883] Unable to handle kernel paging request at virtual address 0000000000040434 [ 24.657858] Mem abort info: [ 24.660686] ESR = 0x96000004 [ 24.663769] Exception class = DABT (current EL), IL = 32 bits [ 24.669725] SET = 0, FnV = 0 [ 24.672804] EA = 0, S1PTW = 0 [ 24.675975] Data abort info: [ 24.678880] ISV = 0, ISS = 0x00000004 [ 24.682743] CM = 0, WnR = 0 [ 24.685723] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000428952000 [ 24.692207] [0000000000040434] pgd=0000000000000000 [ 24.697119] Internal error: Oops: 96000004 [#1] SMP [...] [ 24.889414] Call trace: [ 24.891870] __nf_tables_abort+0x3f0/0x7a0 [ 24.895984] nf_tables_abort+0x20/0x40 [ 24.899750] nfnetlink_rcv_batch+0x17c/0x588 [ 24.904037] nfnetlink_rcv+0x13c/0x190 [ 24.907803] netlink_unicast+0x18c/0x208 [ 24.911742] netlink_sendmsg+0x1b0/0x350 [ 24.915682] sock_sendmsg+0x4c/0x68 [ 24.919185] ___sys_sendmsg+0x288/0x2c8 [ 24.923037] __sys_sendmsg+0x7c/0xd0 [ 24.926628] __arm64_sys_sendmsg+0x2c/0x38 [ 24.930744] el0_svc_common.constprop.0+0x94/0x158 [ 24.935556] el0_svc_handler+0x34/0x90 [ 24.939322] el0_svc+0x8/0xc [ 24.942216] Code: 37280300 f9404023 91014262 aa1703e0 (f9401863) [ 24.948336] ---[ end trace cebbb9dcbed3b56f ]--- Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path") Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 Ka-Cheong Poon 提交于
[ Upstream commit 7d0a06586b2686ba80c4a2da5f91cb10ffbea736 ] The rds6_inc_info_copy() function has a couple struct members which are leaking stack information. The ->tos field should hold actual information and the ->flags field needs to be zeroed out. Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure") Fixes: b7ff8b10 ("rds: Extend RDS API for IPv6 support") Reported-by: N黄ID蝴蝶 <butterflyhuangxx@gmail.com> Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NKa-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Dumazet 提交于
[ Upstream commit fdfc5c8594c24c5df883583ebd286321a80e0a67 ] Vladimir Rutsky reported stuck TCP sessions after memory pressure events. Edge Trigger epoll() user would never receive an EPOLLOUT notification allowing them to retry a sendmsg(). Jason tested the case of sk_stream_alloc_skb() returning NULL, but there are other paths that could lead both sendmsg() and sendpage() to return -1 (EAGAIN), with an empty skb queued on the write queue. This patch makes sure we remove this empty skb so that Jason code can detect that the queue is empty, and call sk->sk_write_space(sk) accordingly. Fixes: ce5ec440 ("tcp: ensure epoll edge trigger wakeup when write queue is empty") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Reported-by: NVladimir Rutsky <rutsky@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Neal Cardwell <ncardwell@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Willem de Bruijn 提交于
[ Upstream commit 888a5c53c0d8be6e98bc85b677f179f77a647873 ] TCP associates tx timestamp requests with a byte in the bytestream. If merging skbs in tcp_mtu_probe, migrate the tstamp request. Similar to MSG_EOR, do not allow moving a timestamp from any segment in the probe but the last. This to avoid merging multiple timestamps. Tested with the packetdrill script at https://github.com/wdebruij/packetdrill/commits/mtu_probe-1 Link: http://patchwork.ozlabs.org/patch/1143278/#2232897 Fixes: 4ed2d765 ("net-timestamp: TCP timestamping") Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Cong Wang 提交于
[ Upstream commit 981471bd3abf4d572097645d765391533aac327d ] The net pointer in struct xt_tgdtor_param is not explicitly initialized therefore is still NULL when dereferencing it. So we have to find a way to pass the correct net pointer to ipt_destroy_target(). The best way I find is just saving the net pointer inside the per netns struct tcf_idrinfo, which could make this patch smaller. Fixes: 0c66dc1e ("netfilter: conntrack: register hooks in netns when needed by ruleset") Reported-and-tested-by: itugrok@yahoo.com Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Vlad Buslov 提交于
[ Upstream commit dbf47a2a094edf58983265e323ca4bdcdb58b5ee ] Action sample doesn't properly handle psample_group pointer in overwrite case. Following issues need to be fixed: - In tcf_sample_init() function RCU_INIT_POINTER() is used to set s->psample_group, even though we neither setting the pointer to NULL, nor preventing concurrent readers from accessing the pointer in some way. Use rcu_swap_protected() instead to safely reset the pointer. - Old value of s->psample_group is not released or deallocated in any way, which results resource leak. Use psample_group_put() on non-NULL value obtained with rcu_swap_protected(). - The function psample_group_put() that released reference to struct psample_group pointed by rcu-pointer s->psample_group doesn't respect rcu grace period when deallocating it. Extend struct psample_group with rcu head and use kfree_rcu when freeing it. Fixes: 5c5670fa ("net/sched: Introduce sample tc action") Signed-off-by: NVlad Buslov <vladbu@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Feng Sun 提交于
[ Upstream commit 2c1644cf6d46a8267d79ed95cb9b563839346562 ] After commit baeababb ("tun: return NET_XMIT_DROP for dropped packets"), when tun_net_xmit drop packets, it will free skb and return NET_XMIT_DROP, netpoll_send_skb_on_dev will run into following use after free cases: 1. retry netpoll_start_xmit with freed skb; 2. queue freed skb in npinfo->txq. queue_process will also run into use after free case. hit netpoll_send_skb_on_dev first case with following kernel log: [ 117.864773] kernel BUG at mm/slub.c:306! [ 117.864773] invalid opcode: 0000 [#1] SMP PTI [ 117.864774] CPU: 3 PID: 2627 Comm: loop_printmsg Kdump: loaded Tainted: P OE 5.3.0-050300rc5-generic #201908182231 [ 117.864775] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 117.864775] RIP: 0010:kmem_cache_free+0x28d/0x2b0 [ 117.864781] Call Trace: [ 117.864781] ? tun_net_xmit+0x21c/0x460 [ 117.864781] kfree_skbmem+0x4e/0x60 [ 117.864782] kfree_skb+0x3a/0xa0 [ 117.864782] tun_net_xmit+0x21c/0x460 [ 117.864782] netpoll_start_xmit+0x11d/0x1b0 [ 117.864788] netpoll_send_skb_on_dev+0x1b8/0x200 [ 117.864789] __br_forward+0x1b9/0x1e0 [bridge] [ 117.864789] ? skb_clone+0x53/0xd0 [ 117.864790] ? __skb_clone+0x2e/0x120 [ 117.864790] deliver_clone+0x37/0x50 [bridge] [ 117.864790] maybe_deliver+0x89/0xc0 [bridge] [ 117.864791] br_flood+0x6c/0x130 [bridge] [ 117.864791] br_dev_xmit+0x315/0x3c0 [bridge] [ 117.864792] netpoll_start_xmit+0x11d/0x1b0 [ 117.864792] netpoll_send_skb_on_dev+0x1b8/0x200 [ 117.864792] netpoll_send_udp+0x2c6/0x3e8 [ 117.864793] write_msg+0xd9/0xf0 [netconsole] [ 117.864793] console_unlock+0x386/0x4e0 [ 117.864793] vprintk_emit+0x17e/0x280 [ 117.864794] vprintk_default+0x29/0x50 [ 117.864794] vprintk_func+0x4c/0xbc [ 117.864794] printk+0x58/0x6f [ 117.864795] loop_fun+0x24/0x41 [printmsg_loop] [ 117.864795] kthread+0x104/0x140 [ 117.864795] ? 0xffffffffc05b1000 [ 117.864796] ? kthread_park+0x80/0x80 [ 117.864796] ret_from_fork+0x35/0x40 Signed-off-by: NFeng Sun <loyou85@gmail.com> Signed-off-by: NXiaojun Zhao <xiaojunzhao141@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Dumazet 提交于
[ Upstream commit a84d016479896b5526a2cc54784e6ffc41c9d6f6 ] Similar to the fix done for IPv4 in commit e5b1c6c6277d ("igmp: fix memory leak in igmpv3_del_delrec()"), we need to make sure mca_tomb and mca_sources are not blindly overwritten. Using swap() then a call to ip6_mc_clear_src() will take care of the missing free. BUG: memory leak unreferenced object 0xffff888117d9db00 (size 64): comm "syz-executor247", pid 6918, jiffies 4294943989 (age 25.350s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 fe 88 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000005b463030>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline] [<000000005b463030>] slab_post_alloc_hook mm/slab.h:522 [inline] [<000000005b463030>] slab_alloc mm/slab.c:3319 [inline] [<000000005b463030>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548 [<00000000939cbf94>] kmalloc include/linux/slab.h:552 [inline] [<00000000939cbf94>] kzalloc include/linux/slab.h:748 [inline] [<00000000939cbf94>] ip6_mc_add1_src net/ipv6/mcast.c:2236 [inline] [<00000000939cbf94>] ip6_mc_add_src+0x31f/0x420 net/ipv6/mcast.c:2356 [<00000000d8972221>] ip6_mc_source+0x4a8/0x600 net/ipv6/mcast.c:449 [<000000002b203d0d>] do_ipv6_setsockopt.isra.0+0x1b92/0x1dd0 net/ipv6/ipv6_sockglue.c:748 [<000000001f1e2d54>] ipv6_setsockopt+0x89/0xd0 net/ipv6/ipv6_sockglue.c:944 [<00000000c8f7bdf9>] udpv6_setsockopt+0x4e/0x90 net/ipv6/udp.c:1558 [<000000005a9a0c5e>] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3139 [<00000000910b37b2>] __sys_setsockopt+0x10f/0x220 net/socket.c:2084 [<00000000e9108023>] __do_sys_setsockopt net/socket.c:2100 [inline] [<00000000e9108023>] __se_sys_setsockopt net/socket.c:2097 [inline] [<00000000e9108023>] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2097 [<00000000f4818160>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296 [<000000008d367e8f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 1666d49e ("mld: do not remove mld souce list info when set link down") Fixes: 9c8bb163 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 06 9月, 2019 12 次提交
-
-
由 Denis Kenzior 提交于
commit f8b43c5cf4b62a19f2210a0f5367b84e1eff1ab9 upstream. The noencrypt flag was intended to be set if the "frame was received unencrypted" according to include/uapi/linux/nl80211.h. However, the current behavior is opposite of this. Cc: stable@vger.kernel.org Fixes: 018f6fbf ("mac80211: Send control port frames over nl80211") Signed-off-by: NDenis Kenzior <denkenz@gmail.com> Link: https://lore.kernel.org/r/20190827224120.14545-3-denkenz@gmail.comSigned-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Denis Kenzior 提交于
commit c8a41c6afa27b8c3f61622dfd882b912da9d6721 upstream. In ieee80211_deliver_skb_to_local_stack intercepts EAPoL frames if mac80211 is configured to do so and forwards the contents over nl80211. During this process some additional data is also forwarded, including whether the frame was received encrypted or not. Unfortunately just prior to the call to ieee80211_deliver_skb_to_local_stack, skb->cb is cleared, resulting in incorrect data being exposed over nl80211. Fixes: 018f6fbf ("mac80211: Send control port frames over nl80211") Cc: stable@vger.kernel.org Signed-off-by: NDenis Kenzior <denkenz@gmail.com> Link: https://lore.kernel.org/r/20190827224120.14545-2-denkenz@gmail.comSigned-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Johannes Berg 提交于
commit 5fd2f91ad483baffdbe798f8a08f1b41442d1e24 upstream. If TDLS station addition is rejected, the sta memory is leaked. Avoid this by moving the check before the allocation. Cc: stable@vger.kernel.org Fixes: 7ed5285396c2 ("mac80211: don't initiate TDLS connection if station is not associated to AP") Link: https://lore.kernel.org/r/20190801073033.7892-1-johannes@sipsolutions.netSigned-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Hodaszi, Robert 提交于
commit 0d31d4dbf38412f5b8b11b4511d07b840eebe8cb upstream. This reverts commit 96cce12f ("cfg80211: fix processing world regdomain when non modular"). Re-triggering a reg_process_hint with the last request on all events, can make the regulatory domain fail in case of multiple WiFi modules. On slower boards (espacially with mdev), enumeration of the WiFi modules can end up in an intersected regulatory domain, and user cannot set it with 'iw reg set' anymore. This is happening, because: - 1st module enumerates, queues up a regulatory request - request gets processed by __reg_process_hint_driver(): - checks if previous was set by CORE -> yes - checks if regulator domain changed -> yes, from '00' to e.g. 'US' -> sends request to the 'crda' - 2nd module enumerates, queues up a regulator request (which triggers the reg_todo() work) - reg_todo() -> reg_process_pending_hints() sees, that the last request is not processed yet, so it tries to process it again. __reg_process_hint driver() will run again, and: - checks if the last request's initiator was the core -> no, it was the driver (1st WiFi module) - checks, if the previous initiator was the driver -> yes - checks if the regulator domain changed -> yes, it was '00' (set by core, and crda call did not return yet), and should be changed to 'US' ------> __reg_process_hint_driver calls an intersect Besides, the reg_process_hint call with the last request is meaningless since the crda call has a timeout work. If that timeout expires, the first module's request will lost. Cc: stable@vger.kernel.org Fixes: 96cce12f ("cfg80211: fix processing world regdomain when non modular") Signed-off-by: NRobert Hodaszi <robert.hodaszi@digi.com> Link: https://lore.kernel.org/r/20190614131600.GA13897@a1-hrSigned-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Hangbin Liu 提交于
[ Upstream commit e2c693934194fd3b4e795635934883354c06ebc9 ] In __icmp_send() there is a possibility that the rt->dst.dev is NULL, e,g, with tunnel collect_md mode, which will cause kernel crash. Here is what the code path looks like, for GRE: - ip6gre_tunnel_xmit - ip6gre_xmit_ipv4 - __gre6_xmit - ip6_tnl_xmit - if skb->len - t->tun_hlen - eth_hlen > mtu; return -EMSGSIZE - icmp_send - net = dev_net(rt->dst.dev); <-- here The reason is __metadata_dst_init() init dst->dev to NULL by default. We could not fix it in __metadata_dst_init() as there is no dev supplied. On the other hand, the reason we need rt->dst.dev is to get the net. So we can just try get it from skb->dev when rt->dst.dev is NULL. v4: Julian Anastasov remind skb->dev also could be NULL. We'd better still use dst.dev and do a check to avoid crash. v3: No changes. v2: fix the issue in __icmp_send() instead of updating shared dst dev in {ip_md, ip6}_tunnel_xmit. Fixes: c8b34e680a09 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit") Signed-off-by: NHangbin Liu <liuhangbin@gmail.com> Reviewed-by: NJulian Anastasov <ja@ssi.bg> Acked-by: NJonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Eric Dumazet 提交于
[ Upstream commit ef8d8ccdc216f797e66cb4a1372f5c4c285ce1e4 ] As Jason Baron explained in commit 790ba456 ("tcp: set SOCK_NOSPACE under memory pressure"), it is crucial we properly set SOCK_NOSPACE when needed. However, Jason patch had a bug, because the 'nonblocking' status as far as sk_stream_wait_memory() is concerned is governed by MSG_DONTWAIT flag passed at sendmsg() time : long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(), and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE cleared, if sk->sk_sndtimeo has been set to a small (but not zero) value. This patch removes the 'noblock' variable since we must always set SOCK_NOSPACE if -EAGAIN is returned. It also renames the do_nonblock label since we might reach this code path even if we were in blocking mode. Fixes: 790ba456 ("tcp: set SOCK_NOSPACE under memory pressure") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jason Baron <jbaron@akamai.com> Reported-by: NVladimir Rutsky <rutsky@google.com> Acked-by: NSoheil Hassas Yeganeh <soheil@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Acked-by: NJason Baron <jbaron@akamai.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Jason Baron 提交于
[ Upstream commit 4651d1802f7063e4d8c0bcad957f46ece0c04024 ] Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE being set when -EAGAIN is returned to ensure that EPOLLOUT is raised. Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior that Eric Dumazet introduced for tcp sockets. Signed-off-by: NJason Baron <jbaron@akamai.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Ursula Braun <ubraun@linux.ibm.com> Cc: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 David Ahern 提交于
[ Upstream commit c7036d97acd2527cef145b5ef9ad1a37ed21bbe6 ] A user reported that routes are getting installed with type 0 (RTN_UNSPEC) where before the routes were RTN_UNICAST. One example is from accel-ppp which apparently still uses the ioctl interface and does not set rtmsg_type. Another is the netlink interface where ipv6 does not require rtm_type to be set (v4 does). Prior to the commit in the Fixes tag the ipv6 stack converted type 0 to RTN_UNICAST, so restore that behavior. Fixes: e8478e80 ("net/ipv6: Save route type in rt6_info") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Hangbin Liu 提交于
[ Upstream commit f17f7648a49aa6728649ddf79bdbcac4f1970ce4 ] In commit 93a714d6 ("multicast: Extend ip address command to enable multicast group join/leave on") we added a new flag IFA_F_MCAUTOJOIN to make user able to add multicast address on ethernet interface. This works for IPv4, but not for IPv6. See the inet6_addr_add code. static int inet6_addr_add() { ... if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) { ipv6_mc_config(net->ipv6.mc_autojoin_sk, true...) } ifp = ipv6_add_addr(idev, cfg, true, extack); <- always fail with maddr if (!IS_ERR(ifp)) { ... } else if (cfg->ifa_flags & IFA_F_MCAUTOJOIN) { ipv6_mc_config(net->ipv6.mc_autojoin_sk, false...) } } But in ipv6_add_addr() it will check the address type and reject multicast address directly. So this feature is never worked for IPv6. We should not remove the multicast address check totally in ipv6_add_addr(), but could accept multicast address only when IFA_F_MCAUTOJOIN flag supplied. v2: update commit description Fixes: 93a714d6 ("multicast: Extend ip address command to enable multicast group join/leave on") Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NHangbin Liu <liuhangbin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 John Fastabend 提交于
[ Upstream commit d85f01775850a35eae47a0090839baf510c1ef12 ] The ctx->sk_write_space pointer is only set when TLS tx mode is enabled. When running without TX mode its a null pointer but we still set the sk sk_write_space pointer on close(). Fix the close path to only overwrite sk->sk_write_space when the current pointer is to the tls_write_space function indicating the tls module should clean it up properly as well. Reported-by: NHillf Danton <hdanton@sina.com> Cc: Ying Xue <ying.xue@windriver.com> Cc: Andrey Konovalov <andreyknvl@google.com> Fixes: 57c722e932cfb ("net/tls: swap sk_write_space on close") Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com> Reviewed-by: NJakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Jakub Kicinski 提交于
[ Upstream commit 57c722e932cfb82e9820bbaae1b1f7222ea97b52 ] Now that we swap the original proto and clear the ULP pointer on close we have to make sure no callback will try to access the freed state. sk_write_space is not part of sk_prot, remember to swap it. Reported-by: syzbot+dcdc9deefaec44785f32@syzkaller.appspotmail.com Fixes: 95fa145479fb ("bpf: sockmap/tls, close can race with map free") Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Vakul Garg 提交于
[ Upstream commit 150085791afb8054e11d2e080d4b9cd755dd7f69 ] In tls_sw_sendmsg() and tls_sw_sendpage(), the variable 'ret' has been set to return value of tls_complete_pending_work(). This allows return of proper error code if tls_complete_pending_work() fails. Fixes: 3c4d7559 ("tls: kernel TLS support") Signed-off-by: NVakul Garg <vakul.garg@nxp.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 29 8月, 2019 6 次提交
-
-
由 David Howells 提交于
[ Upstream commit 68553f1a6f746bf860bce3eb42d78c26a717d9c0 ] Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called on an unbound socket on which rx->local is not yet set. The following reproduced (includes omitted): int main(void) { socket(AF_RXRPC, SOCK_DGRAM, AF_INET); return 0; } causes the following oops to occur: BUG: kernel NULL pointer dereference, address: 0000000000000010 ... RIP: 0010:rxrpc_unuse_local+0x8/0x1b ... Call Trace: rxrpc_release+0x2b5/0x338 __sock_release+0x37/0xa1 sock_close+0x14/0x17 __fput+0x115/0x1e9 task_work_run+0x72/0x98 do_exit+0x51b/0xa7a ? __context_tracking_exit+0x4e/0x10e do_group_exit+0xab/0xab __x64_sys_exit_group+0x14/0x17 do_syscall_64+0x89/0x1d4 entry_SYSCALL_64_after_hwframe+0x49/0xbe Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting") Signed-off-by: NDavid Howells <dhowells@redhat.com> cc: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Howells 提交于
[ Upstream commit b00df840fb4004b7087940ac5f68801562d0d2de ] When a local endpoint (struct rxrpc_local) ceases to be in use by any AF_RXRPC sockets, it starts the process of being destroyed, but this doesn't cause it to be removed from the namespace endpoint list immediately as tearing it down isn't trivial and can't be done in softirq context, so it gets deferred. If a new socket comes along that wants to bind to the same endpoint, a new rxrpc_local object will be allocated and rxrpc_lookup_local() will use list_replace() to substitute the new one for the old. Then, when the dying object gets to rxrpc_local_destroyer(), it is removed unconditionally from whatever list it is on by calling list_del_init(). However, list_replace() doesn't reset the pointers in the replaced list_head and so the list_del_init() will likely corrupt the local endpoints list. Fix this by using list_replace_init() instead. Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting") Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-
由 David Howells 提交于
commit 06d9532fa6b34f12a6d75711162d47c17c1add72 upstream. rxrpc_queue_local() attempts to queue the local endpoint it is given and then, if successful, prints a trace line. The trace line includes the current usage count - but we're not allowed to look at the local endpoint at this point as we passed our ref on it to the workqueue. Fix this by reading the usage count before queuing the work item. Also fix the reading of local->debug_id for trace lines, which must be done with the same consideration as reading the usage count. Fixes: 09d2bf59 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting") Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 David Howells 提交于
commit 730c5fd42c1e3652a065448fd235cb9fafb2bd10 upstream. The object lifetime management on the rxrpc_local struct is broken in that the rxrpc_local_processor() function is expected to clean up and remove an object - but it may get requeued by packets coming in on the backing UDP socket once it starts running. This may result in the assertion in rxrpc_local_rcu() firing because the memory has been scheduled for RCU destruction whilst still queued: rxrpc: Assertion failed ------------[ cut here ]------------ kernel BUG at net/rxrpc/local_object.c:468! Note that if the processor comes around before the RCU free function, it will just do nothing because ->dead is true. Fix this by adding a separate refcount to count active users of the endpoint that causes the endpoint to be destroyed when it reaches 0. The original refcount can then be used to refcount objects through the work processor and cause the memory to be rcu freed when that reaches 0. Fixes: 4f95dd78 ("rxrpc: Rework local endpoint management") Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 Ilya Dryomov 提交于
commit a561372405cf6bc6f14239b3a9e57bb39f2788b0 upstream. We can't rely on ->peer_features in calc_target() because it may be called both when the OSD session is established and open and when it's not. ->peer_features is not valid unless the OSD session is open. If this happens on a PG split (pg_num increase), that could mean we don't resend a request that should have been resent, hanging the client indefinitely. In userspace this was fixed by looking at require_osd_release and get_xinfo[osd].features fields of the osdmap. However these fields belong to the OSD section of the osdmap, which the kernel doesn't decode (only the client section is decoded). Instead, let's drop this feature check. It effectively checks for luminous, so only pre-luminous OSDs would be affected in that on a PG split the kernel might resend a request that should not have been resent. Duplicates can occur in other scenarios, so both sides should already be prepared for them: see dup/replay logic on the OSD side and retry_attempt check on the client side. Cc: stable@vger.kernel.org Fixes: 7de030d6 ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT") Link: https://tracker.ceph.com/issues/41162Reported-by: NJerry Lee <leisurelysw24@gmail.com> Signed-off-by: NIlya Dryomov <idryomov@gmail.com> Tested-by: NJerry Lee <leisurelysw24@gmail.com> Reviewed-by: NJeff Layton <jlayton@kernel.org> Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-
由 David Howells 提交于
[ Upstream commit c69565ee6681e151e2bb80502930a16e04b553d1 ] Fix the fact that a notification isn't sent to the recvmsg side to indicate a call failed when sendmsg() fails to transmit a DATA packet with the error ENETUNREACH, EHOSTUNREACH or ECONNREFUSED. Without this notification, the afs client just sits there waiting for the call to complete in some manner (which it's not now going to do), which also pins the rxrpc call in place. This can be seen if the client has a scope-level IPv6 address, but not a global-level IPv6 address, and we try and transmit an operation to a server's IPv6 address. Looking in /proc/net/rxrpc/calls shows completed calls just sat there with an abort code of RX_USER_ABORT and an error code of -ENETUNREACH. Fixes: c54e43d7 ("rxrpc: Fix missing start of call timeout") Signed-off-by: NDavid Howells <dhowells@redhat.com> Reviewed-by: NMarc Dionne <marc.dionne@auristor.com> Reviewed-by: NJeffrey Altman <jaltman@auristor.com> Signed-off-by: NSasha Levin <sashal@kernel.org>
-