Commit 7c802ed4 authored by Zhang Changzhong, committed by Yang Yingliang

tcp: fix memleak when tcp internal pacing is used

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4S8PN
CVE: NA

--------------------------------

The sock_hold() in tcp_internal_pacing() is expected to pair with the
sock_put() in tcp_pace_kick(). But on some paths tcp_internal_pacing()
is called without first checking whether the pacing timer is already
armed, so sock_hold() is called an extra time and the TCP socket can
never be released.

As Neal pointed out, this could happen from some of the retransmission
code paths that don't use tcp_xmit_retransmit_queue(), such as
tcp_retransmit_timer() and tcp_send_loss_probe().

The fix, provided by Eric, extends the timer to cover all the points
that Neal mentioned.

Following is the reproduction procedure provided by Jason:
0) cat /proc/slabinfo | grep TCP
1) switch net.ipv4.tcp_congestion_control to bbr
2) use a tool such as wrk to send packets
3) use tc to increase the delay and loss to simulate the RTO case
4) cat /proc/slabinfo | grep TCP
5) kill the wrk command and observe the number of objects and slabs in
TCP
6) at last, you can observe that the numbers do not decrease

Link: https://lore.kernel.org/all/CANn89i+7-wE4xr5D9DpH+N-xkL1SB8oVghCKgz+CT5eG1ODQhA@mail.gmail.com/
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Parent d2584a20
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1021,6 +1021,8 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer)
 static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 {
+	struct tcp_sock *tp = tcp_sk(sk);
+	ktime_t expire, now;
 	u64 len_ns;
 	u32 rate;
@@ -1032,12 +1034,28 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 	len_ns = (u64)skb->len * NSEC_PER_SEC;
 	do_div(len_ns, rate);
-	hrtimer_start(&tcp_sk(sk)->pacing_timer,
-		      ktime_add_ns(ktime_get(), len_ns),
+	now = ktime_get();
+	/* If hrtimer is already armed, then our caller has not
+	 * used tcp_pacing_check().
+	 */
+	if (unlikely(hrtimer_is_queued(&tp->pacing_timer))) {
+		expire = hrtimer_get_softexpires(&tp->pacing_timer);
+		if (ktime_after(expire, now))
+			now = expire;
+		if (hrtimer_try_to_cancel(&tp->pacing_timer) == 1)
+			__sock_put(sk);
+	}
+	hrtimer_start(&tp->pacing_timer, ktime_add_ns(now, len_ns),
 		      HRTIMER_MODE_ABS_PINNED_SOFT);
 	sock_hold(sk);
 }
 
+static bool tcp_pacing_check(const struct sock *sk)
+{
+	return tcp_needs_internal_pacing(sk) &&
+	       hrtimer_is_queued(&tcp_sk(sk)->pacing_timer);
+}
+
 static void tcp_update_skb_after_send(struct tcp_sock *tp, struct sk_buff *skb)
 {
 	skb->skb_mstamp = tp->tcp_mstamp;
@@ -2174,6 +2192,9 @@ static int tcp_mtu_probe(struct sock *sk)
 	if (!tcp_can_coalesce_send_queue_head(sk, probe_size))
 		return -1;
 
+	if (tcp_pacing_check(sk))
+		return -1;
+
 	/* We're allowed to probe.  Build it now. */
 	nskb = sk_stream_alloc_skb(sk, probe_size, GFP_ATOMIC, false);
 	if (!nskb)
@@ -2247,12 +2268,6 @@ static int tcp_mtu_probe(struct sock *sk)
 	return -1;
 }
 
-static bool tcp_pacing_check(const struct sock *sk)
-{
-	return tcp_needs_internal_pacing(sk) &&
-	       hrtimer_is_queued(&tcp_sk(sk)->pacing_timer);
-}
-
 /* TCP Small Queues :
  * Control number of packets in qdisc/devices to two packets / or ~1 ms.
  * (These limits are doubled for retransmits)