提交 7d8677ff 编写于 作者: J Jesper Dangaard Brouer 提交者: Greg Kroah-Hartman

xdp: fix cpumap redirect SKB creation bug

[ Upstream commit 676e4a6fe703f2dae699ee9d56f14516f9ada4ea ]

We want to avoid leaking pointer info from xdp_frame (that is placed in
top of frame) like commit 6dfb970d ("xdp: avoid leaking info stored in
frame data on page reuse"), and followup commit 97e19cce ("bpf:
reserve xdp_frame size in xdp headroom") that reserve this headroom.

These changes also affected how cpumap constructed SKBs, as xdpf->headroom
size changed, the skb data starting point were in-effect shifted with 32
bytes (sizeof xdp_frame). This was still okay, as the cpumap frame_size
calculation also included xdpf->headroom which were reduced by same amount.

A bug was introduced in commit 77ea5f4cbe20 ("bpf/cpumap: make sure
frame_size for build_skb is aligned if headroom isn't"), where the
xdpf->headroom became part of the SKB_DATA_ALIGN rounding up. This
round-up to find the frame_size is in principle still correct as it does
not exceed the 2048 bytes frame_size (which is max for ixgbe and i40e),
but the 32 bytes offset of pkt_data_start puts this over the 2048 bytes
limit. This cause skb_shared_info to spill into next frame. It is a little
hard to trigger, as the SKB need to use above 15 skb_shinfo->frags[] as
far as I calculate. This does happen in practise for TCP streams when
skb_try_coalesce() kicks in.

KASAN can be used to detect these wrong memory accesses, I've seen:
 BUG: KASAN: use-after-free in skb_try_coalesce+0x3cb/0x760
 BUG: KASAN: wild-memory-access in skb_release_data+0xe2/0x250

Driver veth also construct a SKB from xdp_frame in this way, but is not
affected, as it doesn't reserve/deduct the room (used by xdp_frame) from
the SKB headroom. Instead is clears the pointers via xdp_scrub_frame(),
and allows SKB to use this area.

The fix in this patch is to do like veth and instead allow SKB to (re)use
the area occupied by xdp_frame, by clearing via xdp_scrub_frame().  (This
does kill the idea of the SKB being able to access (mem) info from this
area, but I guess it was a bad idea anyhow, and it was already killed by
the veth changes.)

Fixes: 77ea5f4cbe20 ("bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't")
Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NSasha Levin <sashal@kernel.org>
上级 32a24a39
...@@ -162,10 +162,14 @@ static void cpu_map_kthread_stop(struct work_struct *work) ...@@ -162,10 +162,14 @@ static void cpu_map_kthread_stop(struct work_struct *work)
static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu, static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
struct xdp_frame *xdpf) struct xdp_frame *xdpf)
{ {
unsigned int hard_start_headroom;
unsigned int frame_size; unsigned int frame_size;
void *pkt_data_start; void *pkt_data_start;
struct sk_buff *skb; struct sk_buff *skb;
/* Part of headroom was reserved to xdpf */
hard_start_headroom = sizeof(struct xdp_frame) + xdpf->headroom;
/* build_skb need to place skb_shared_info after SKB end, and /* build_skb need to place skb_shared_info after SKB end, and
* also want to know the memory "truesize". Thus, need to * also want to know the memory "truesize". Thus, need to
* know the memory frame size backing xdp_buff. * know the memory frame size backing xdp_buff.
...@@ -183,15 +187,15 @@ static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu, ...@@ -183,15 +187,15 @@ static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
* is not at a fixed memory location, with mixed length * is not at a fixed memory location, with mixed length
* packets, which is bad for cache-line hotness. * packets, which is bad for cache-line hotness.
*/ */
frame_size = SKB_DATA_ALIGN(xdpf->len + xdpf->headroom) + frame_size = SKB_DATA_ALIGN(xdpf->len + hard_start_headroom) +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
pkt_data_start = xdpf->data - xdpf->headroom; pkt_data_start = xdpf->data - hard_start_headroom;
skb = build_skb(pkt_data_start, frame_size); skb = build_skb(pkt_data_start, frame_size);
if (!skb) if (!skb)
return NULL; return NULL;
skb_reserve(skb, xdpf->headroom); skb_reserve(skb, hard_start_headroom);
__skb_put(skb, xdpf->len); __skb_put(skb, xdpf->len);
if (xdpf->metasize) if (xdpf->metasize)
skb_metadata_set(skb, xdpf->metasize); skb_metadata_set(skb, xdpf->metasize);
...@@ -205,6 +209,9 @@ static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu, ...@@ -205,6 +209,9 @@ static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
* - RX ring dev queue index (skb_record_rx_queue) * - RX ring dev queue index (skb_record_rx_queue)
*/ */
/* Allow SKB to reuse area used by xdp_frame */
xdp_scrub_frame(xdpf);
return skb; return skb;
} }
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册