Commit c05cdb1b, authored by Pablo Neira Ayuso, committed by David S. Miller

netlink: allow large data transfers from user-space

I can hit ENOBUFS in the sendmsg() path with a large batch that is
composed of many netlink messages. Here the limit is 8 MBytes of
skbuff data area, as kmalloc does not manage to allocate more than that.

While discussing atomic rule-set for nftables with Patrick McHardy,
we decided to put all rule-set updates that need to be applied
atomically in one single batch to simplify the existing approach.
However, as explained above, the existing netlink code limits us
to a maximum of ~20000 rules that fit in one single batch without
hitting ENOBUFS. iptables does not have such a limitation as it
uses vmalloc.

This patch adds netlink_alloc_large_skb(), which is only used in
the netlink_sendmsg() path. It uses alloc_skb() if the requested
size is at most NLMSG_GOODSIZE (roughly one memory page), which
should be the common case for most subsystems, and falls back to
vmalloc() for larger allocations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parent 1b5acd29
@@ -750,6 +750,10 @@ static void netlink_skb_destructor(struct sk_buff *skb)
 		skb->head = NULL;
 	}
 #endif
+	if (is_vmalloc_addr(skb->head)) {
+		vfree(skb->head);
+		skb->head = NULL;
+	}
 	if (skb->sk != NULL)
 		sock_rfree(skb);
 }
@@ -1420,6 +1424,35 @@ struct sock *netlink_getsockbyfilp(struct file *filp)
 	return sock;
 }
 
+static struct sk_buff *netlink_alloc_large_skb(unsigned int size)
+{
+	struct sk_buff *skb;
+	void *data;
+
+	if (size <= NLMSG_GOODSIZE)
+		return alloc_skb(size, GFP_KERNEL);
+
+	skb = alloc_skb_head(GFP_KERNEL);
+	if (skb == NULL)
+		return NULL;
+
+	data = vmalloc(size);
+	if (data == NULL)
+		goto err;
+
+	skb->head = data;
+	skb->data = data;
+	skb_reset_tail_pointer(skb);
+	skb->end = skb->tail + size;
+	skb->len = 0;
+	skb->destructor = netlink_skb_destructor;
+
+	return skb;
+err:
+	kfree_skb(skb);
+	return NULL;
+}
+
 /*
  * Attach a skb to a netlink socket.
  * The caller must hold a reference to the destination socket. On error, the
@@ -1510,7 +1543,7 @@ static struct sk_buff *netlink_trim(struct sk_buff *skb, gfp_t allocation)
 		return skb;
 
 	delta = skb->end - skb->tail;
-	if (delta * 2 < skb->truesize)
+	if (is_vmalloc_addr(skb->head) || delta * 2 < skb->truesize)
 		return skb;
 
 	if (skb_shared(skb)) {
@@ -2096,7 +2129,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	if (len > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = alloc_skb(len, GFP_KERNEL);
+	skb = netlink_alloc_large_skb(len);
 	if (skb == NULL)
 		goto out;
 
......
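
The allocation and release logic in the diff above can be tried outside the kernel with a rough user-space analogue. This is only a sketch under stated assumptions, not the kernel code: `struct buf`, `SMALL_LIMIT`, `buf_alloc()`, `buf_put()` and `buf_free()` are hypothetical names, `malloc()` stands in for the `alloc_skb()` path, anonymous `mmap()` stands in for `vmalloc()`, and the `large` flag plays the role of `is_vmalloc_addr(skb->head)`.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Hypothetical threshold standing in for NLMSG_GOODSIZE. */
#define SMALL_LIMIT 4096u

/* Hypothetical user-space stand-in for the sk_buff fields the patch touches. */
struct buf {
	unsigned char *head;	/* start of the data area */
	unsigned char *data;	/* start of the payload */
	unsigned char *tail;	/* end of the payload */
	unsigned char *end;	/* end of the data area */
	size_t len;		/* payload length */
	bool large;		/* analogue of is_vmalloc_addr(head) */
};

/* Small requests use the ordinary allocator, large ones a page mapping. */
static struct buf *buf_alloc(size_t size)
{
	struct buf *b = calloc(1, sizeof(*b));
	void *mem;

	if (b == NULL)
		return NULL;

	if (size <= SMALL_LIMIT) {
		mem = malloc(size ? size : 1);
		if (mem == NULL)
			goto err;
	} else {
		mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (mem == MAP_FAILED)
			goto err;
		b->large = true;
	}

	b->head = mem;
	b->data = mem;
	b->tail = mem;		/* empty buffer: tail starts at data */
	b->end = b->head + size;
	b->len = 0;
	return b;
err:
	free(b);
	return NULL;
}

/* Reserve n bytes at the tail; returns NULL if the data area is full. */
static void *buf_put(struct buf *b, size_t n)
{
	unsigned char *p;

	assert(b != NULL);
	if ((size_t)(b->end - b->tail) < n)
		return NULL;
	p = b->tail;
	b->tail += n;
	b->len += n;
	return p;
}

/* Release path: the data area must go back to the allocator it came from. */
static void buf_free(struct buf *b)
{
	if (b == NULL)
		return;
	if (b->large)
		munmap(b->head, (size_t)(b->end - b->head));
	else
		free(b->head);
	free(b);
}
```

As in the patch, the release path has to know which allocator produced the data area. The kernel can ask `is_vmalloc_addr()`, while this sketch has to record the choice in a flag at allocation time.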