- 26 2月, 2022 1 次提交
-
-
由 Menglong Dong 提交于
Replace kfree_skb() which is used in the packet egress path of IP layer with kfree_skb_reason(). Functions that are involved include: __ip_queue_xmit() ip_finish_output() ip_mc_finish_output() ip6_output() ip6_finish_output() ip6_finish_output2() Following new drop reasons are introduced: SKB_DROP_REASON_IP_OUTNOROUTES SKB_DROP_REASON_BPF_CGROUP_EGRESS SKB_DROP_REASON_IPV6DISABLED SKB_DROP_REASON_NEIGH_CREATEFAIL Reviewed-by: NMengen Sun <mengensun@tencent.com> Reviewed-by: NHao Peng <flyingpeng@tencent.com> Signed-off-by: NMenglong Dong <imagedong@tencent.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 2月, 2022 1 次提交
-
-
由 Eric Dumazet 提交于
UDP sendmsg() can be lockless, this is causing all kinds of data races. This patch converts sk->sk_tskey to remove one of these races. BUG: KCSAN: data-race in __ip_append_data / __ip_append_data read to 0xffff8881035d4b6c of 4 bytes by task 8877 on cpu 1: __ip_append_data+0x1c1/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff8881035d4b6c of 4 bytes by task 8880 on cpu 0: __ip_append_data+0x1d8/0x1de0 net/ipv4/ip_output.c:994 ip_make_skb+0x13f/0x2d0 net/ipv4/ip_output.c:1636 udp_sendmsg+0x12bd/0x14c0 net/ipv4/udp.c:1249 inet_sendmsg+0x5f/0x80 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg net/socket.c:725 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2413 ___sys_sendmsg net/socket.c:2467 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553 __do_sys_sendmmsg net/socket.c:2582 [inline] __se_sys_sendmmsg net/socket.c:2579 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000054d -> 0x0000054e Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 8880 Comm: syz-executor.5 Not tainted 5.17.0-rc2-syzkaller-00167-gdcb85f85-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 09c2d251 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2022 1 次提交
-
-
由 Jakub Kicinski 提交于
Support setting IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG during sendmsg via SOL_IPV6 cmsgs. tclass and dontfrag are init'ed from struct ipv6_pinfo in ipcm6_init_sk(), while hlimit is inited to -1, so we need to handle it being populated via cmsg explicitly. Leave extension headers and flowlabel unimplemented. Those are slightly more laborious to test and users seem to primarily care about IPV6_TCLASS. Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 1月, 2022 7 次提交
-
-
由 Pavel Begunkov 提交于
udpv6_sendmsg() doesn't need dst after calling ip6_make_skb(), so instead of taking an additional reference inside ip6_setup_cork() and releasing the initial one afterwards, we can hand over a reference into ip6_make_skb() saving two atomics. The only other user of ip6_setup_cork() is ip6_append_data() and it requires an extra dst_hold(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
Another preparation patch. inet_cork_full already contains a field for iflow, so we can avoid passing a separate struct iflow6 into __ip6_append_data() and ip6_make_skb(), and use the flow stored in inet_cork_full. Make sure callers set cork->fl, i.e. we init it in ip6_append_data() and before calling ip6_make_skb(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
Convert a struct inet_cork argument in __ip6_append_data() to struct inet_cork_full. As one struct contains another inet_cork is still can be accessed via ->base field. It's a preparation patch making further changes a bit cleaner. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
It doesn't appear there is any reason for ip6_cork_release() to zero cork->fl, it'll be fully filled on next initialisation. This 88 bytes memset accounts to 0.3-0.5% of total CPU cycles. It's also needed in following patches and allows to remove an extar flow copy in udp_v6_push_pending_frames(). Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
Clean up ip6_setup_cork() and ip6_cork_release() adding a local variable for v6_cork->opt. It's a preparation patch for further changes. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
ipv6_push_nfrag_opts() doesn't change passed daddr, and so __ip6_make_skb() doesn't actually need to keep an on-stack copy of fl6->daddr. Set initially final_dst to fl6->daddr, ipv6_push_nfrag_opts() will override it if needed, and get rid of extra copies. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Pavel Begunkov 提交于
__ip6_make_skb() gets a cork->dst ref, hands it over to skb and shortly after puts cork->dst. Save two atomics by stealing it without extra referencing, ip6_cork_release() handles NULL cork->dst. Signed-off-by: NPavel Begunkov <asml.silence@gmail.com> Reviewed-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 22 11月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
We deal with IPv6 packets, so we need to use IP6CB(skb)->flags and IP6SKB_REROUTED, instead of IPCB(skb)->flags and IPSKB_REROUTED Found by code inspection, please double check that fixing this bug does not surface other bugs. Fixes: 09ee9dba ("ipv6: Reinject IPv6 packets if IPsec policy matches after SNAT") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Tobias Brunner <tobias@strongswan.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Tested-by: NTobias Brunner <tobias@strongswan.org> Acked-by: NTobias Brunner <tobias@strongswan.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 11月, 2021 1 次提交
-
-
由 Eric Dumazet 提交于
Instead of using a full netdev_features_t, we can use a single bit, as sk_route_nocaps is only used to remove NETIF_F_GSO_MASK from sk->sk_route_cap. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 10月, 2021 1 次提交
-
-
由 Stephen Suryaputra 提交于
Commit bdb7cc64 ("ipv6: Count interface receive statistics on the ingress netdev") does not work when ip6_forward() executes on the skbs with vrf-enslaved netdev. Use IP6CB(skb)->iif to get to the right one. Add a selftest script to verify. Fixes: bdb7cc64 ("ipv6: Count interface receive statistics on the ingress netdev") Signed-off-by: NStephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211014130845.410602-1-ssuryaextr@gmail.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 03 8月, 2021 2 次提交
-
-
由 Vasily Averin 提交于
Unlike skb_realloc_headroom, new helper skb_expand_head does not allocate a new skb if possible. Additionally this patch replaces commonly used dereferencing with variables. Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vasily Averin 提交于
Unlike skb_realloc_headroom, new helper skb_expand_head does not allocate a new skb if possible. Additionally this patch replaces commonly used dereferencing with variables. Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 7月, 2021 1 次提交
-
-
由 Kangmin Park 提交于
Decrease hop limit counter when deliver skb to ndp proxy. Signed-off-by: NKangmin Park <l4stpr0gr4m@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 7月, 2021 1 次提交
-
-
由 Vadim Fedorenko 提交于
Replace ip6_dst_mtu_forward with ip6_dst_mtu_maybe_forward and reuse this code in ip6_mtu. Actually these two functions were almost duplicates, this change will simplify the maintaince of mtu calculation code. Signed-off-by: NVadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: NDavid Ahern <dsahern@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 7月, 2021 1 次提交
-
-
由 Vasily Averin 提交于
skb_set_owner_w() should set sk not to old skb but to new nskb. Fixes: 5796015f ("ipv6: allocate enough headroom in ip6_finish_output2()") Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Link: https://lore.kernel.org/r/70c0744f-89ae-1869-7e3e-4fa292158f4b@virtuozzo.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 13 7月, 2021 1 次提交
-
-
由 Vasily Averin 提交于
When TEE target mirrors traffic to another interface, sk_buff may not have enough headroom to be processed correctly. ip_finish_output2() detect this situation for ipv4 and allocates new skb with enogh headroom. However ipv6 lacks this logic in ip_finish_output2 and it leads to skb_under_panic: skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24 head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0 ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:110! invalid opcode: 0000 [#1] SMP PTI CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G OE 5.13.0 #13 Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014 Workqueue: ipv6_addrconf addrconf_dad_work RIP: 0010:skb_panic+0x48/0x4a Call Trace: skb_push.cold.111+0x10/0x10 ipgre_header+0x24/0xf0 [ip_gre] neigh_connected_output+0xae/0xf0 ip6_finish_output2+0x1a8/0x5a0 ip6_output+0x5c/0x110 nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6] tee_tg6+0x2e/0x40 [xt_TEE] ip6t_do_table+0x294/0x470 [ip6_tables] nf_hook_slow+0x44/0xc0 nf_hook.constprop.34+0x72/0xe0 ndisc_send_skb+0x20d/0x2e0 ndisc_send_ns+0xd1/0x210 addrconf_dad_work+0x3c8/0x540 process_one_work+0x1d1/0x370 worker_thread+0x30/0x390 kthread+0x116/0x130 ret_from_fork+0x22/0x30 Signed-off-by: NVasily Averin <vvs@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 7月, 2021 1 次提交
-
-
由 Nicolas Dichtel 提交于
The goal of commit df789fe7 ("ipv6: Provide ipv6 version of "disable_policy" sysctl") was to have the disable_policy from ipv4 available on ipv6. However, it's not exactly the same mechanism. On IPv4, all packets coming from an interface, which has disable_policy set, bypass the policy check. For ipv6, this is done only for local packets, ie for packets destinated to an address configured on the incoming interface. Let's align ipv6 with ipv4 so that the 'disable_policy' sysctl has the same effect for both protocols. My first approach was to create a new kind of route cache entries, to be able to set DST_NOPOLICY without modifying routes. This would have added a lot of code. Because the local delivery path is already handled, I choose to focus on the forwarding path to minimize code churn. Fixes: df789fe7 ("ipv6: Provide ipv6 version of "disable_policy" sysctl") Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 6月, 2021 2 次提交
-
-
由 zhang kai 提交于
parameter dst always points to null. Signed-off-by: Nzhang kai <zhangkaiheb@126.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Dave observed number of machines hitting OOM on the UDP send path. The workload seems to be sending large UDP packets over loopback. Since loopback has MTU of 64k kernel will try to allocate an skb with up to 64k of head space. This has a good chance of failing under memory pressure. What's worse if the message length is <32k the allocation may trigger an OOM killer. This is entirely avoidable, we can use an skb with page frags. af_unix solves a similar problem by limiting the head length to SKB_MAX_ALLOC. This seems like a good and simple approach. It means that UDP messages > 16kB will now use fragments if underlying device supports SG, if extra allocator pressure causes regressions in real workloads we can switch to trying the large allocation first and falling back. v4: pre-calculate all the additions to alloclen so we can be sure it won't go over order-2 Reported-by: NDave Jones <dsj@fb.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 2月, 2021 1 次提交
-
-
由 Brian Vazquez 提交于
This patch avoids the indirect call for the common case: ip6_output and ip_output Signed-off-by: NBrian Vazquez <brianvv@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 10 1月, 2021 1 次提交
-
-
由 Aya Levin 提交于
There are cases where GSO segment's length exceeds the egress MTU: - Forwarding of a TCP GRO skb, when DF flag is not set. - Forwarding of an skb that arrived on a virtualisation interface (virtio-net/vhost/tap) with TSO/GSO size set by other network stack. - Local GSO skb transmitted on an NETIF_F_TSO tunnel stacked over an interface with a smaller MTU. - Arriving GRO skb (or GSO skb in a virtualised environment) that is bridged to a NETIF_F_TSO tunnel stacked over an interface with an insufficient MTU. If so: - Consume the SKB and its segments. - Issue an ICMP packet with 'Packet Too Big' message containing the MTU, allowing the source host to reduce its Path MTU appropriately. Note: These cases are handled in the same manner in IPv4 output finish. This patch aligns the behavior of IPv6 and the one of IPv4. Fixes: 9e508490 ("netfilter: ipv6: move POSTROUTING invocation before fragmentation") Signed-off-by: NAya Levin <ayal@nvidia.com> Reviewed-by: NTariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/1610027418-30438-1-git-send-email-ayal@nvidia.comSigned-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 08 1月, 2021 3 次提交
-
-
由 Jonathan Lemon 提交于
Unlike the rest of the skb_zcopy_ functions, these routines operate on a 'struct ubuf', not a skb. Remove the 'skb_' prefix from the naming to make things clearer. Suggested-by: NWillem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: NJonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jonathan Lemon 提交于
At Willem's suggestion, rename the sock_zerocopy_* functions so that they match the MSG_ZEROCOPY flag, which makes it clear they are specific to this zerocopy implementation. Signed-off-by: NJonathan Lemon <jonathan.lemon@gmail.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
由 Jonathan Lemon 提交于
The sock_zerocopy_put_abort function contains logic which is specific to the current zerocopy implementation. Add a wrapper which checks the callback and dispatches apppropriately. Signed-off-by: NJonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 15 10月, 2020 1 次提交
-
-
由 Mathieu Desnoyers 提交于
As per RFC4443, the destination address field for ICMPv6 error messages is copied from the source address field of the invoking packet. In configurations with Virtual Routing and Forwarding tables, looking up which routing table to use for sending ICMPv6 error messages is currently done by using the destination net_device. If the source and destination interfaces are within separate VRFs, or one in the global routing table and the other in a VRF, looking up the source address of the invoking packet in the destination interface's routing table will fail if the destination interface's routing table contains no route to the invoking packet's source address. One observable effect of this issue is that traceroute6 does not work in the following cases: - Route leaking between global routing table and VRF - Route leaking between VRFs Use the source device routing table when sending ICMPv6 error messages. [ In the context of ipv4, it has been pointed out that a similar issue may exist with ICMP errors triggered when forwarding between network namespaces. It would be worthwhile to investigate whether ipv6 has similar issues, but is outside of the scope of this investigation. ] [ Testing shows that similar issues exist with ipv6 unreachable / fragmentation needed messages. However, investigation of this additional failure mode is beyond this investigation's scope. ] Link: https://tools.ietf.org/html/rfc4443Signed-off-by: NMathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NJakub Kicinski <kuba@kernel.org>
-
- 19 9月, 2020 1 次提交
-
-
由 Randy Dunlap 提交于
Drop repeated words in net/ipv6/. Signed-off-by: NRandy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 8月, 2020 1 次提交
-
-
由 Al Viro 提交于
it's always 0 Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 14 7月, 2020 1 次提交
-
-
由 Andrew Lunn 提交于
Simple fixes which require no deep knowledge of the code. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: NAndrew Lunn <andrew@lunn.ch> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 2月, 2020 1 次提交
-
-
由 Martin Varghese 提交于
The Bareudp tunnel module provides a generic L3 encapsulation tunnelling module for tunnelling different protocols like MPLS, IP,NSH etc inside a UDP tunnel. Signed-off-by: NMartin Varghese <martin.varghese@nokia.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 12月, 2019 1 次提交
-
-
由 Sabrina Dubroca 提交于
This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow, as some modules currently pass a net argument without a socket to ip6_dst_lookup. This is equivalent to commit 343d60aa ("ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument"). Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 11月, 2019 1 次提交
-
-
由 Phil Sutter 提交于
Instead of generally passing NULL to NF_HOOK_COND() for input device, pass skb->dev which contains input device for routed skbs. Note that iptables (both legacy and nft) reject rules with input interface match from being added to POSTROUTING chains, but nftables allows this. Cc: Eric Garver <eric@garver.life> Signed-off-by: NPhil Sutter <phil@nwl.cc> Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
-
- 19 10月, 2019 1 次提交
-
-
由 Eric Dumazet 提交于
Thomas found that some forwarded packets would be stuck in FQ packet scheduler because their skb->tstamp contained timestamps far in the future. We thought we addressed this point in commit 8203e2d8 ("net: clear skb->tstamp in forwarding paths") but there is still an issue when/if a packet needs to be fragmented. In order to meet EDT requirements, we have to make sure all fragments get the original skb->tstamp. Note that this original skb->tstamp should be zero in forwarding path, but might have a non zero value in output path if user decided so. Fixes: fb420d5d ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NThomas Bartschies <Thomas.Bartschies@cvk.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 9月, 2019 1 次提交
-
-
由 Eric Dumazet 提交于
Currently, ip6_xmit() sets skb->priority based on sk->sk_priority This is not desirable for TCP since TCP shares the same ctl socket for a given netns. We want to be able to send RST or ACK packets with a non zero skb->priority. This patch has no functional change. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 9月, 2019 1 次提交
-
-
由 Willem de Bruijn 提交于
Enable setting skb->mark for UDP and RAW sockets using cmsg. This is analogous to existing support for TOS, TTL, txtime, etc. Packet sockets already support this as of commit c7d39e32 ("packet: support per-packet fwmark for af_packet sendmsg"). Similar to other fields, implement by 1. initialize the sockcm_cookie.mark from socket option sk_mark 2. optionally overwrite this in ip_cmsg_send/ip6_datagram_send_ctl 3. initialize inet_cork.mark from sockcm_cookie.mark 4. initialize each (usually just one) skb->mark from inet_cork.mark Step 1 is handled in one location for most protocols by ipcm_init_sk as of commit 35178206 ("ipv4: ipcm_cookie initializers"). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 6月, 2019 1 次提交
-
-
由 Nicolas Dichtel 提交于
There is no functional change in this patch, it only prepares the next one. rt6_nexthop() will be used by ip6_dst_lookup_neigh(), which uses const variables. Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com> Reported-by: Nkbuild test robot <lkp@intel.com> Acked-by: NNick Desaulniers <ndesaulniers@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 6月, 2019 1 次提交
-
-
由 Willem de Bruijn 提交于
The below patch fixes an incorrect zerocopy refcnt increment when appending with MSG_MORE to an existing zerocopy udp skb. send(.., MSG_ZEROCOPY | MSG_MORE); // refcnt 1 send(.., MSG_ZEROCOPY | MSG_MORE); // refcnt still 1 (bar frags) But it missed that zerocopy need not be passed at the first send. The right test whether the uarg is newly allocated and thus has extra refcnt 1 is not !skb, but !skb_zcopy. send(.., MSG_MORE); // <no uarg> send(.., MSG_ZEROCOPY); // refcnt 1 Fixes: 100f6d8e ("net: correct zerocopy refcnt with udp MSG_MORE") Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 6月, 2019 1 次提交
-
-
由 Eric Dumazet 提交于
syzbot reported nasty use-after-free [1] Lets remove frag_list field from structs ip_fraglist_iter and ip6_fraglist_iter. This seens not needed anyway. [1] : BUG: KASAN: use-after-free in kfree_skb_list+0x5d/0x60 net/core/skbuff.c:706 Read of size 8 at addr ffff888085a3cbc0 by task syz-executor303/8947 CPU: 0 PID: 8947 Comm: syz-executor303 Not tainted 5.2.0-rc2+ #12 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 kasan_report+0x12/0x20 mm/kasan/common.c:614 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132 kfree_skb_list+0x5d/0x60 net/core/skbuff.c:706 ip6_fragment+0x1ef4/0x2680 net/ipv6/ip6_output.c:882 __ip6_finish_output+0x577/0xaa0 net/ipv6/ip6_output.c:144 ip6_finish_output+0x38/0x1f0 net/ipv6/ip6_output.c:156 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip6_output+0x235/0x7f0 net/ipv6/ip6_output.c:179 dst_output include/net/dst.h:433 [inline] ip6_local_out+0xbb/0x1b0 net/ipv6/output_core.c:179 ip6_send_skb+0xbb/0x350 net/ipv6/ip6_output.c:1796 ip6_push_pending_frames+0xc8/0xf0 net/ipv6/ip6_output.c:1816 rawv6_push_pending_frames net/ipv6/raw.c:617 [inline] rawv6_sendmsg+0x2993/0x35e0 net/ipv6/raw.c:947 inet_sendmsg+0x141/0x5d0 net/ipv4/af_inet.c:802 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:671 ___sys_sendmsg+0x803/0x920 net/socket.c:2292 __sys_sendmsg+0x105/0x1d0 net/socket.c:2330 __do_sys_sendmsg net/socket.c:2339 [inline] __se_sys_sendmsg net/socket.c:2337 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2337 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x44add9 Code: e8 7c e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 05 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f826f33bce8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000006e7a18 RCX: 000000000044add9 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005 RBP: 00000000006e7a10 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006e7a1c R13: 00007ffcec4f7ebf R14: 00007f826f33c9c0 R15: 20c49ba5e353f7cf Allocated by task 8947: save_stack+0x23/0x90 mm/kasan/common.c:71 set_track mm/kasan/common.c:79 [inline] __kasan_kmalloc mm/kasan/common.c:489 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:462 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:497 slab_post_alloc_hook mm/slab.h:437 [inline] slab_alloc_node mm/slab.c:3269 [inline] kmem_cache_alloc_node+0x131/0x710 mm/slab.c:3579 __alloc_skb+0xd5/0x5e0 net/core/skbuff.c:199 alloc_skb include/linux/skbuff.h:1058 [inline] __ip6_append_data.isra.0+0x2a24/0x3640 net/ipv6/ip6_output.c:1519 ip6_append_data+0x1e5/0x320 net/ipv6/ip6_output.c:1688 rawv6_sendmsg+0x1467/0x35e0 net/ipv6/raw.c:940 inet_sendmsg+0x141/0x5d0 net/ipv4/af_inet.c:802 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:671 ___sys_sendmsg+0x803/0x920 net/socket.c:2292 __sys_sendmsg+0x105/0x1d0 net/socket.c:2330 __do_sys_sendmsg net/socket.c:2339 [inline] __se_sys_sendmsg net/socket.c:2337 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2337 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 8947: save_stack+0x23/0x90 mm/kasan/common.c:71 set_track mm/kasan/common.c:79 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:451 kasan_slab_free+0xe/0x10 mm/kasan/common.c:459 __cache_free mm/slab.c:3432 [inline] kmem_cache_free+0x86/0x260 mm/slab.c:3698 kfree_skbmem net/core/skbuff.c:625 [inline] kfree_skbmem+0xc5/0x150 net/core/skbuff.c:619 __kfree_skb net/core/skbuff.c:682 [inline] kfree_skb net/core/skbuff.c:699 [inline] kfree_skb+0xf0/0x390 net/core/skbuff.c:693 kfree_skb_list+0x44/0x60 net/core/skbuff.c:708 __dev_xmit_skb net/core/dev.c:3551 [inline] __dev_queue_xmit+0x3034/0x36b0 net/core/dev.c:3850 dev_queue_xmit+0x18/0x20 net/core/dev.c:3914 neigh_direct_output+0x16/0x20 net/core/neighbour.c:1532 neigh_output include/net/neighbour.h:511 [inline] ip6_finish_output2+0x1034/0x2550 net/ipv6/ip6_output.c:120 ip6_fragment+0x1ebb/0x2680 net/ipv6/ip6_output.c:863 __ip6_finish_output+0x577/0xaa0 net/ipv6/ip6_output.c:144 ip6_finish_output+0x38/0x1f0 net/ipv6/ip6_output.c:156 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip6_output+0x235/0x7f0 net/ipv6/ip6_output.c:179 dst_output include/net/dst.h:433 [inline] ip6_local_out+0xbb/0x1b0 net/ipv6/output_core.c:179 ip6_send_skb+0xbb/0x350 net/ipv6/ip6_output.c:1796 ip6_push_pending_frames+0xc8/0xf0 net/ipv6/ip6_output.c:1816 rawv6_push_pending_frames net/ipv6/raw.c:617 [inline] rawv6_sendmsg+0x2993/0x35e0 net/ipv6/raw.c:947 inet_sendmsg+0x141/0x5d0 net/ipv4/af_inet.c:802 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:671 ___sys_sendmsg+0x803/0x920 net/socket.c:2292 __sys_sendmsg+0x105/0x1d0 net/socket.c:2330 __do_sys_sendmsg net/socket.c:2339 [inline] __se_sys_sendmsg net/socket.c:2337 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2337 do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888085a3cbc0 which belongs to the cache skbuff_head_cache of size 224 The buggy address is located 0 bytes inside of 224-byte region [ffff888085a3cbc0, ffff888085a3cca0) The buggy address belongs to the page: page:ffffea0002168f00 refcount:1 mapcount:0 mapping:ffff88821b6f63c0 index:0x0 flags: 0x1fffc0000000200(slab) raw: 01fffc0000000200 ffffea00027bbf88 ffffea0002105b88 ffff88821b6f63c0 raw: 0000000000000000 ffff888085a3c080 000000010000000c 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888085a3ca80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888085a3cb00: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc >ffff888085a3cb80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff888085a3cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888085a3cc80: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 0feca619 ("net: ipv6: add skbuff fraglist splitter") Fixes: c8b17be0 ("net: ipv4: add skbuff fraglist splitter") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: NPablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-