Commit 2a028ecb, author: Eric Dumazet, committer: David S. Miller

net: allow BH servicing in sk_busy_loop()

Instead of blocking BH in whole sk_busy_loop(), block them
only around ->ndo_busy_poll() calls.

This has many benefits.

1) allow tunneled traffic to use busy poll as well as native traffic.
   Tunnel handlers usually call netif_rx() and depend on net_rx_action()
   being run (from the softirq handler)

2) allow RFS/RPS to be used (sending IPIs to other CPUs if needed)

3) use the 'lets burn cpu cycles' budget to do useful work
   (like TX completions, timers, RCU callbacks...)

4) reduce BH latencies, making busy poll a better citizen.

Tested:

Tested with SIT tunnel

lpaa5:~# echo 0 >/proc/sys/net/core/busy_read
lpaa5:~# ./netperf -H 2002:af6:786::1 -t TCP_RR
MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:786::1 () port 0 AF_INET6 : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    37373.93
16384  87380

Now enable busy poll on both hosts

lpaa5:~# echo 70 >/proc/sys/net/core/busy_read
lpaa6:~# echo 70 >/proc/sys/net/core/busy_read

lpaa5:~# ./netperf -H 2002:af6:786::1 -t TCP_RR
MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2002:af6:786::1 () port 0 AF_INET6 : first burst 0
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.
Send   Recv   Size     Size    Time     Rate
bytes  Bytes  bytes    bytes   secs.    per sec

16384  87380  1        1       10.00    58314.77
16384  87380

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parent 02d62e86
@@ -4684,11 +4684,7 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
 	struct napi_struct *napi;
 	int rc = false;
 
-	/*
-	 * rcu read lock for napi hash
-	 * bh so we don't race with net_rx_action
-	 */
-	rcu_read_lock_bh();
+	rcu_read_lock();
 
 	napi = napi_by_id(sk->sk_napi_id);
 	if (!napi)
@@ -4699,23 +4695,23 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
 		goto out;
 
 	do {
+		local_bh_disable();
 		rc = ops->ndo_busy_poll(napi);
+		if (rc > 0)
+			NET_ADD_STATS_BH(sock_net(sk),
+					 LINUX_MIB_BUSYPOLLRXPACKETS, rc);
+		local_bh_enable();
 
 		if (rc == LL_FLUSH_FAILED)
 			break; /* permanent failure */
 
-		if (rc > 0)
-			/* local bh are disabled so it is ok to use _BH */
-			NET_ADD_STATS_BH(sock_net(sk),
-					 LINUX_MIB_BUSYPOLLRXPACKETS, rc);
 		cpu_relax();
-
 	} while (!nonblock && skb_queue_empty(&sk->sk_receive_queue) &&
 		 !need_resched() && !busy_loop_timeout(end_time));
 
 	rc = !skb_queue_empty(&sk->sk_receive_queue);
 out:
-	rcu_read_unlock_bh();
+	rcu_read_unlock();
 	return rc;
 }
 EXPORT_SYMBOL(sk_busy_loop);