提交 c0d95d33 编写于 作者: J John Fastabend 提交者: Daniel Borkmann

bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap

When a sock is added to a sock map we evaluate what proto op hooks need to
be used. However, when the program is removed from the sock map we have not
been evaluating if that changes the required program layout.

Before the patch listed in the 'fixes' tag this was not causing failures
because the base program set handles all cases. Specifically, the case with
a stream parser and the case with out a stream parser are both handled. With
the fix below we identified a race when running with a proto op that attempts
to read skbs off both the stream parser and the skb->receive_queue. Namely,
that a race existed where when the stream parser is empty checking the
skb->receive_queue from recvmsg at the precies moment when the parser is
paused and the receive_queue is not empty could result in skipping the stream
parser. This may break a RX policy depending on the parser to run.

The fix tag then loads a specific proto ops that resolved this race. But, we
missed removing that proto ops recv hook when the sock is removed from the
sockmap. The result is the stream parser is stopped so no more skbs will be
aggregated there, but the hook and BPF program continues to be attached on
the psock. User space will then get an EBUSY when trying to read the socket
because the recvmsg() handler is now waiting on a stopped stream parser.

To fix we rerun the proto ops init() function which will look at the new set
of progs attached to the psock and rest the proto ops hook to the correct
handlers. And in the above case where we remove the sock from the sock map
the RX prog will no longer be listed so the proto ops is removed.

Fixes: c5d2177a ("bpf, sockmap: Fix race in ingress receive verdict with redirect to self")
Signed-off-by: NJohn Fastabend <john.fastabend@gmail.com>
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211119181418.353932-3-john.fastabend@gmail.com
上级 38207a5e
......@@ -1124,6 +1124,8 @@ void sk_psock_start_strp(struct sock *sk, struct sk_psock *psock)
void sk_psock_stop_strp(struct sock *sk, struct sk_psock *psock)
{
psock_set_prog(&psock->progs.stream_parser, NULL);
if (!psock->saved_data_ready)
return;
......@@ -1212,6 +1214,9 @@ void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
void sk_psock_stop_verdict(struct sock *sk, struct sk_psock *psock)
{
psock_set_prog(&psock->progs.stream_verdict, NULL);
psock_set_prog(&psock->progs.skb_verdict, NULL);
if (!psock->saved_data_ready)
return;
......
......@@ -167,8 +167,11 @@ static void sock_map_del_link(struct sock *sk,
write_lock_bh(&sk->sk_callback_lock);
if (strp_stop)
sk_psock_stop_strp(sk, psock);
else
if (verdict_stop)
sk_psock_stop_verdict(sk, psock);
if (psock->psock_update_sk_prot)
psock->psock_update_sk_prot(sk, psock, false);
write_unlock_bh(&sk->sk_callback_lock);
}
}
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册