Commits · e6ca4f16a6950537132d0752581c967df2d911c4 · openeuler / Kernel

08 Nov, 2016 1 commit

udp: do fwd memory scheduling on dequeue · 7c13f97f

Committed by Paolo Abeni on Nov 04, 2016

A new argument is added to __skb_recv_datagram() to provide
an explicit skb destructor, invoked under the receive queue
lock.
The UDP protocol uses this argument to perform memory
reclaiming on dequeue, so that the UDP protocol no longer
sets skb->destructor.
Instead, explicit memory reclaiming is performed at close() time and
when skbs are removed from the receive queue.
In-kernel UDP protocol users now need to call a
skb_recv_udp() variant instead of skb_recv_datagram() to
properly perform memory accounting on dequeue.

Overall, this allows acquiring the receive queue lock
only once on dequeue.

Tested using pktgen with random src port, 64-byte packets,
wire-speed on a 10G link as sender and udp_sink as the receiver,
using an l4 tuple rxhash to stress the contention, and one or more
udp_sink instances with reuseport.

nr sinks    vanilla    patched
1           440        560
3           2150       2300
6           3650       3800
9           4450       4600
12          6250       6450

v1 -> v2:
 - do rmem and allocated memory scheduling under the receive lock
 - do bulk scheduling in first_packet_length() and in udp_destruct_sock()
 - avoid the typedef for the dequeue callback
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Oct, 2016 1 commit

udp: implement memory accounting helpers · f970bd9e

Committed by Paolo Abeni on Oct 21, 2016

Avoid using the generic helpers.
Use the receive queue spin lock to protect the memory
accounting operation, both on enqueue and on dequeue.

On dequeue perform partial memory reclaiming, trying to
leave a quantum of forward allocated memory.
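
The partial reclaim described above can be pictured with a minimal userspace sketch. This is not the kernel code: the struct, the function name rx_release(), the DEMO_MEM_QUANTUM value and the keep parameter are all assumptions made for illustration, and the caller is assumed to hold whatever lock protects the counters.

```c
#define DEMO_MEM_QUANTUM 4096   /* hypothetical accounting unit, in bytes */

/* Hypothetical per-socket receive accounting state. */
struct rx_account {
    int forward_alloc;      /* bytes reserved ahead of use */
    int allocated_quanta;   /* quanta currently charged to a shared pool */
};

/*
 * Return `size` bytes to the reserve, then hand whole quanta back to the
 * shared pool while keeping at least `keep` bytes reserved.
 * Returns the number of quanta released.
 * Assumption: the caller holds the lock that protects `acc`.
 */
int rx_release(struct rx_account *acc, int size, int keep)
{
    int quanta;

    acc->forward_alloc += size;
    if (acc->forward_alloc <= keep)
        return 0;           /* nothing beyond the kept reserve */

    quanta = (acc->forward_alloc - keep) / DEMO_MEM_QUANTUM;
    acc->forward_alloc -= quanta * DEMO_MEM_QUANTUM;
    acc->allocated_quanta -= quanta;
    return quanta;
}
```

Starting from 1000 reserved bytes, releasing 8000 bytes with keep set to DEMO_MEM_QUANTUM gives one quantum back to the pool and leaves 4904 bytes reserved.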

On enqueue use a custom helper, to allow some optimizations:
- use a plain spin_lock() variant instead of the slightly
  costly spin_lock_irqsave(),
- avoid dst_force check, since the calling code has already
  dropped the skb dst
- avoid orphaning the skb, since skb_steal_sock() already did
  the work for us

The above needs custom memory reclaiming on shutdown, provided
by udp_destruct_sock().

v5 -> v6:
  - don't orphan the skb on enqueue

v4 -> v5:
  - replace the mem_lock with the receive queue spin lock
  - ensure that the bh is always allowed to enqueue at least
    a skb, even if sk_rcvbuf is exceeded

v3 -> v4:
  - reworked memory accounting, simplifying the schema
  - provide a helper for both memory scheduling and enqueuing

v1 -> v2:
  - use a udp specific destructor to perform memory reclaiming
  - remove a couple of helpers, unneeded after the above cleanup
  - do not reclaim memory on dequeue if not under memory
    pressure
  - reworked the fwd accounting schema to avoid potential
    integer overflow
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Oct, 2016 1 commit

udp: must lock the socket in udp_disconnect() · 286c72de

Committed by Eric Dumazet on Oct 20, 2016

Baozeng Ding reported KASAN traces showing uses after free in
udp_lib_get_port() and other related UDP functions.

A CONFIG_DEBUG_PAGEALLOC=y kernel would eventually crash.

I could write a reproducer with two threads doing:

static int sock_fd;
static void *thr1(void *arg)
{
	for (;;) {
		connect(sock_fd, (const struct sockaddr *)arg,
			sizeof(struct sockaddr_in));
	}
}

static void *thr2(void *arg)
{
	struct sockaddr_in unspec;

	for (;;) {
		memset(&unspec, 0, sizeof(unspec));
		connect(sock_fd, (const struct sockaddr *)&unspec,
			sizeof(unspec));
	}
}

The problem is that udp_disconnect() could run without holding the socket
lock, and this was causing list corruption.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Baozeng Ding <sploving1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Aug, 2016 1 commit

net: diag: support SOCK_DESTROY for UDP sockets · 5d77dca8

Committed by David Ahern on Aug 23, 2016

This implements SOCK_DESTROY for UDP sockets similar to what was done
for TCP with commit c1e64e29 ("net: diag: Support destroying TCP
sockets."). A process with a UDP socket targeted for destroy is awakened
and recvmsg fails with ECONNABORTED.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Jun, 2016 1 commit

udp: avoid csum_partial() for validated skb · 595d0b29

Committed by Eric Dumazet on May 31, 2016

In commit e6afc8ac ("udp: remove headers from UDP packets before
queueing"), the udp_csum_pull_header() helper was added but missed the
fact that CHECKSUM_UNNECESSARY packets were now converted to
CHECKSUM_NONE and skb->csum_valid was set to 1 for them.

Since csum_partial() is quite expensive, even for an 8-byte area, it is
worth adding a test.

We can also use skb->data instead of udp_hdr() as we are pulling
UDP headers, as it is slightly faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Apr, 2016 3 commits

net: snmp: kill STATS_BH macros · 13415e46

Committed by Eric Dumazet on Apr 27, 2016

There is nothing related to BH in SNMP counters anymore,
since linux-3.0.

Rename helpers to use __ prefix instead of _BH prefix,
for contexts where preemption is disabled.

This more closely matches convention used to update
percpu variables.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: udp: rename UDP_INC_STATS_BH() · 02c22347

Committed by Eric Dumazet on Apr 27, 2016

Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(),
and UDP6_INC_STATS_BH() to __UDP6_INC_STATS().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: snmp: kill various STATS_USER() helpers · 6aef70a8

Committed by Eric Dumazet on Apr 27, 2016

In the old days (before linux-3.0), SNMP counters were duplicated,
one for user context, and one for BH context.

After commit 8f0ea0fe ("snmp: reduce percpu needs by 50%")
we have a single copy, and what really matters is preemption being
enabled or disabled, since we use this_cpu_inc() or __this_cpu_inc()
respectively.

We therefore kill SNMP_INC_STATS_USER(), SNMP_ADD_STATS_USER(),
NET_INC_STATS_USER(), NET_ADD_STATS_USER(), SCTP_INC_STATS_USER(),
SNMP_INC_STATS64_USER(), SNMP_ADD_STATS64_USER(), TCP_ADD_STATS_USER(),
UDP_INC_STATS_USER(), UDP6_INC_STATS_USER(), and XFRM_INC_STATS_USER()

Following patches will rename __BH helpers to make clear their
usage is not tied to BH being disabled.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Apr, 2016 2 commits

udp: Add GRO functions to UDP socket · a6024562

Committed by Tom Herbert on Apr 05, 2016

This patch adds GRO functions (gro_receive and gro_complete) to UDP
sockets. udp_gro_receive is changed to perform socket lookup on a
packet. If a socket is found the related GRO functions are called.

This feature obsoletes using the UDP offload infrastructure for GRO
(udp_offload). This has the advantage of not being limited to providing
offload on a per-port basis; GRO is now applied to whatever individual
UDP sockets are bound to.  This also allows the possibility of
"application defined GRO", that is, we can attach something like
a BPF program to a UDP socket to perform GRO on an application
layer protocol.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

udp: Add udp6_lib_lookup_skb and udp4_lib_lookup_skb · 63058308

Committed by Tom Herbert on Apr 05, 2016

Add externally visible functions to look up a UDP socket by skb. This
will be used for GRO in UDP sockets. These functions also check
if skb->dst is set, and if it is not, skb->dev is used to get dev_net.
This allows calling lookup functions before dst has been set on the
skbuff.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Apr, 2016 1 commit

udp: remove headers from UDP packets before queueing · e6afc8ac

Committed by samanthakumar on Apr 05, 2016

Remove UDP transport headers before queueing packets for reception.
This change simplifies a follow-up patch to add MSG_PEEK support.
Signed-off-by: Sam Kumar <samanthakumar@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Apr, 2016 1 commit

udp: no longer use SLAB_DESTROY_BY_RCU · ca065d0c

Committed by Eric Dumazet on Apr 01, 2016

Tom Herbert would like to avoid touching the UDP socket refcnt for
encapsulated traffic. For this to happen, we need to use normal RCU rules,
with a grace period before freeing a socket. UDP sockets are not short
lived in the high usage case, so the added cost of call_rcu() should not
be a concern.

This actually removes a lot of complexity in UDP stack.

Multicast receives no longer need to hold a bucket spinlock.

Note that ip early demux still needs to take a reference on the socket.

Same remark for functions used by xt_socket and xt_PROXY netfilter modules,
but this might be changed later.

Performance for a single UDP socket receiving flood traffic from
many RX queues/cpus.

Simple udp_rx using simple recvfrom() loop :
438 kpps instead of 374 kpps : 17 % increase of the peak rate.

v2: Addressed Willem de Bruijn feedback in multicast handling
 - keep early demux break in __udp4_lib_demux_lookup()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Willem de Bruijn <willemb@google.com>
Tested-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Feb, 2016 1 commit

sock: struct proto hash function may error · 086c653f

Committed by Craig Gallek on Feb 10, 2016

In order to support fast reuseport lookups in TCP, the hash function
defined in struct proto must be capable of returning an error code.
This patch changes the function signature of all related hash functions
to return an integer and handles or propagates this return value at
all call sites.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
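
The signature change in the commit above (a hash callback that returns an int, with callers propagating the result) can be sketched in plain C. All names here (demo_sock, demo_proto, demo_hash, demo_bind) are hypothetical and do not correspond to the kernel's structures; the failure condition is simulated with a flag.

```c
#include <errno.h>

/* Hypothetical socket with a flag that forces the hash step to fail. */
struct demo_sock {
    int hashed;
    int fail_hash;
};

/* Hypothetical protocol ops table: hash() now reports errors to its caller. */
struct demo_proto {
    int (*hash)(struct demo_sock *sk);
};

/* Returns 0 on success or a negative errno value on failure. */
int demo_hash(struct demo_sock *sk)
{
    if (sk->fail_hash)
        return -ENOMEM;     /* simulated allocation failure */
    sk->hashed = 1;
    return 0;
}

/* A call site that propagates the hash result instead of ignoring it. */
int demo_bind(const struct demo_proto *prot, struct demo_sock *sk)
{
    int err = prot->hash(sk);

    if (err)
        return err;         /* hand the error to our own caller */
    return 0;
}
```

With a void callback the failing case would go unnoticed; with the int return, demo_bind() hands -ENOMEM back to its own caller.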

05 Jan, 2016 2 commits

soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF · 538950a1

Committed by Craig Gallek on Jan 04, 2016

Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group.  These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.

This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

soreuseport: fast reuseport UDP socket selection · e32ea7e7

Committed by Craig Gallek on Jan 04, 2016

Include a struct sock_reuseport instance when a UDP socket binds to
a specific address for the first time with the reuseport flag set.
When selecting a socket for an incoming UDP packet, use the information
available in sock_reuseport if present.

This required adding an additional field to the UDP source address
equality function to differentiate between exact and wildcard matches.
The original use case allowed wildcard matches when checking for
existing port uses during bind.  The new use case of adding a socket
to a reuseport group requires exact address matching.
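
The difference between the two use cases can be sketched as a small comparison function. This is an illustration only: rcv_addr_equal() and DEMO_ADDR_ANY are hypothetical names, and addresses are reduced to plain 32-bit IPv4 values.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ADDR_ANY 0u    /* hypothetical wildcard (unbound) address */

/*
 * Compare two bound receive addresses.
 * With match_wildcard set, a wildcard on either side counts as a match
 * (the port-conflict check during bind). Without it, only identical
 * addresses match (joining a reuseport group).
 */
bool rcv_addr_equal(uint32_t a, uint32_t b, bool match_wildcard)
{
    if (a == b)
        return true;
    if (a == DEMO_ADDR_ANY || b == DEMO_ADDR_ANY)
        return match_wildcard;
    return false;
}
```

A socket bound to 10.0.0.1 and one bound to the wildcard address therefore conflict for bind purposes but do not land in the same reuseport group.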

Performance test (using a machine with 2 CPU sockets and a total of
48 cores):  Create reuseport groups of varying size.  Use one socket
from this group per user thread (pinning each thread to a different
core) calling recvmmsg in a tight loop.  Record number of messages
received per second while saturating a 10G link.
  10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
  20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
  40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)

This work is based off a similar implementation written by
Ying Cai <ycai@google.com> for implementing policy-based reuseport
selection.
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Mar, 2015 1 commit

net: Remove iocb argument from sendmsg and recvmsg · 1b784140

Committed by Ying Xue on Mar 02, 2015

Now that TIPC no longer depends on the iocb argument in its internal
implementations of the sendmsg() and recvmsg() hooks defined in the proto
structure, nothing uses the iocb argument in them any more.
We can therefore drop the redundant iocb argument completely from all
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig <hch@lst.de>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Feb, 2015 1 commit

udp: In udp_flow_src_port use random hash value if skb_get_hash fails · 723b8e46

Committed by Tom Herbert on Feb 24, 2015

In the unlikely event that skb_get_hash is unable to deduce a hash
in udp_flow_src_port we use a consistent random value instead.
This is specified in GRE/UDP draft section 3.2.1:
https://tools.ietf.org/html/draft-ietf-tsvwg-gre-in-udp-encap-04
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Oct, 2014 1 commit

udp: Generalize skb_udp_segment · 8bce6d7d

Committed by Tom Herbert on Sep 29, 2014

skb_udp_segment is the function called from udp4_ufo_fragment to
segment a UDP tunnel packet. This function currently assumes
segmentation is transparent Ethernet bridging (i.e. VXLAN
encapsulation). This patch generalizes the function to
operate on either Ethertype or IP protocol.

The inner_protocol field must be set to the protocol of the inner
header. This can now be either an Ethertype or an IP protocol
(in a union). A new flag in the skbuff indicates which type is
effective. skb_set_inner_protocol and skb_set_inner_ipproto
helper functions were added to set the inner_protocol. These
functions are called from the point where the tunnel encapsulation
is occurring.

When skb_udp_tunnel_segment is called, the function to segment the
inner packet is selected based on the inner IP or Ethertype. In the
case of an IP protocol encapsulation, the function is derived from
inet[6]_offloads. In the case of Ethertype, skb->protocol is
set to the inner_protocol and skb_mac_gso_segment is called. (GRE
currently does this, but it might be possible to look up the protocol
in offload_base and call the appropriate segmentation function
directly).
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Aug, 2014 1 commit

udp: additional GRO support · 57c67ff4

Committed by Tom Herbert on Aug 22, 2014

Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
for both UDP4 and UDP6 by calling skb_gro_checksum_validate_zero_check.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08 Jul, 2014 1 commit

udp: Add function to make source port for UDP tunnels · b8f1a556

Committed by Tom Herbert on Jul 01, 2014

This patch adds the udp_flow_src_port function, which is intended to be
a common function that UDP tunnel implementations call to set the source
port. The source port is chosen so that a hash over the outer headers
(IP addresses and UDP ports) acts as a suitable hash for the flow of the
encapsulated packet. In this manner, UDP encapsulation works with RSS
and ECMP with respect to the inner flow.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
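
One way to turn a 32-bit flow hash into a source port, sketched here under stated assumptions, is to scale the hash into the allowed port range with a multiply and shift. The function name flow_src_port() and the example port range are hypothetical, the result is in host byte order, and the sketch does not claim to reproduce the kernel's udp_flow_src_port().

```c
#include <stdint.h>

/*
 * Map a 32-bit flow hash onto a port in [min, max).
 * Assumptions: max > min, and the caller converts the result to network
 * byte order before placing it in a UDP header.
 */
uint16_t flow_src_port(uint32_t hash, uint16_t min, uint16_t max)
{
    uint32_t range = (uint32_t)(max - min);

    /* (hash * range) / 2^32 is evenly spread over 0 .. range - 1 */
    return (uint16_t)((((uint64_t)hash * range) >> 32) + min);
}
```

Because the port depends only on the hash, every packet of one inner flow gets the same outer source port, which is what keeps a flow on a single RSS queue or ECMP path.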

15 Jun, 2014 1 commit

udp: call __skb_checksum_complete when doing full checksum · bbdff225

Committed by Tom Herbert on Jun 14, 2014

In __udp_lib_checksum_complete, check if the checksum is being done over
all the data (len is equal to skb->len) and, if it is, call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows the checksum to be saved in checksum complete.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Jun, 2014 1 commit

udp: Generic functions to set checksum · af5fcba7

Committed by Tom Herbert on Jun 04, 2014

Added udp_set_csum and udp6_set_csum functions to set UDP checksums
in packets. These are for simple UDP packets such as those that might
be created in UDP tunnels.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 May, 2014 1 commit

net: Split sk_no_check into sk_no_check_{rx,tx} · 28448b80

Committed by Tom Herbert on May 23, 2014

Define separate fields in the sock structure for disabling checksums
on TX and RX: sk_no_check_tx and sk_no_check_rx.
The SO_NO_CHECK socket option only affects sk_no_check_tx. Also,
removed UDP_CSUM_* defines since they are no longer necessary.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Nov, 2013 1 commit

udp: Remove unnecessary semicolon from do{}while (0) macro · f9bddcdf

Committed by Joe Perches on Nov 05, 2013

Just an unnecessary semicolon that should be removed...
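
As a general C illustration of why such a semicolon is unwanted (the macro below is hypothetical and is not the one touched by this commit): a do { } while (0) macro is meant to be terminated by the caller's own semicolon, so that it behaves like a single statement.

```c
/* Hypothetical multi-statement macro; note there is no trailing semicolon. */
#define BUMP_TWICE(counter)     \
do {                            \
    (counter)++;                \
    (counter)++;                \
} while (0)

/*
 * If the macro ended in "while (0);", the call below would expand to
 * "do { ... } while (0);;" and the extra empty statement would separate
 * the "else" from its "if", which does not compile.
 */
int bump_if_positive(int n)
{
    int counter = 0;

    if (n > 0)
        BUMP_TWICE(counter);    /* the caller supplies the one semicolon */
    else
        counter = -1;
    return counter;
}
```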

Whitespace neatening of macro too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Oct, 2013 1 commit

udp: ipv4: Add udp early demux · 421b3885

Committed by Shawn Bohrer on Oct 07, 2013

The removal of the routing cache introduced a performance regression for
some UDP workloads since a dst lookup must be done for each packet.
This change caches the dst per socket in a similar manner to what we do
for TCP by implementing early_demux.

For UDP multicast we can only cache the dst if there is only one
receiving socket on the host.  Since caching only works when there is
one receiving socket we do the multicast socket lookup using RCU.

For UDP unicast we only demux sockets with an exact match in order to
not break forwarding setups.  Additionally since the hash chains may be
long we only check the first socket to see if it is a match and not
waste extra time searching the whole chain when we might not find an
exact match.

Benchmark results from a netperf UDP_RR test:
Before 87961.22 transactions/s
After  89789.68 transactions/s

Benchmark results from a fio 1 byte UDP multicast pingpong test
(Multicast one way unicast response):
Before 12.97us RTT
After  12.63us RTT
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Sep, 2013 1 commit

udp.h: Remove extern from function prototypes · 1a50bd06

Committed by Joe Perches on Sep 23, 2013

There is a mix of function prototypes with and without extern
in the kernel sources.  Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler.  Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Jul, 2013 1 commit

pktgen: Add UDPCSUM flag to support UDP checksums · c26bf4a5

Committed by Thomas Graf on Jul 25, 2013

UDP checksums are optional, hence pktgen has been omitting them in
favour of performance. The optional flag UDPCSUM enables UDP
checksumming. If the output device supports hardware checksumming
the skb is prepared and marked CHECKSUM_PARTIAL, otherwise the
checksum is generated in software.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

03 Jul, 2013 1 commit

ipv6: call udp_push_pending_frames when uncorking a socket with AF_INET pending data · 8822b64a

Committed by Hannes Frederic Sowa on Jul 01, 2013

We accidentally call down to ip6_push_pending_frames when uncorking
pending AF_INET data on an IPv6 socket. This results in the following
splat (from Dave Jones):

skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:126!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
+netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
RIP: 0010:[<ffffffff816e759c>]  [<ffffffff816e759c>] skb_panic+0x63/0x65
RSP: 0018:ffff8801e6431de8  EFLAGS: 00010282
RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
FS:  00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
Stack:
 ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
Call Trace:
 [<ffffffff8159a9aa>] skb_push+0x3a/0x40
 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0
 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140
 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0
 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20
 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0
 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0
 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40
 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20
 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0
 [<ffffffff816f5d54>] tracesys+0xdd/0xe2
Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
RIP  [<ffffffff816e759c>] skb_panic+0x63/0x65
 RSP <ffff8801e6431de8>

This patch adds a check if the pending data is of address family AF_INET
and directly calls udp_push_pending_frames from udp_v6_push_pending_frames
if that is the case.

This bug was found by Dave Jones with trinity.

(Also move the initialization of fl6 below the AF_INET check, even if
not strictly necessary.)

Cc: Dave Jones <davej@redhat.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jun, 2013 1 commit

net: udp4: move GSO functions to udp_offload · da5bab07

Committed by Daniel Borkmann on Jun 08, 2013

Similarly to TCP offloading and UDPv6 offloading, move all related
UDPv4 functions to udp_offload.c to make things more explicit. Also,
by this, we can make those functions static.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Apr, 2012 1 commit

net/ipv6/udp: UDP encapsulation: introduce encap_rcv hook into IPv6 · d7f3f621

Committed by Benjamin LaHaise on Apr 27, 2012

Now that the semantics of udpv6_queue_rcv_skb() match IPv4's
udp_queue_rcv_skb(), introduce the UDP encap_rcv() hook for IPv6.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Apr, 2012 1 commit

net: cleanup unsigned to unsigned int · 95c96174

Committed by Eric Dumazet on Apr 15, 2012

Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Apr, 2012 1 commit

udp: intoduce udp_encap_needed static_key · 447167bf

Committed by Eric Dumazet on Apr 11, 2012

Most machines don't use UDP encapsulation (L2TP).

Add a static_key so that udp_queue_rcv_skb() doesn't have to perform a
test if L2TP never set up encap_rcv on a socket.

The idea for this patch came after Simon Horman's proposal to add a hook
on TCP as well.

If the static_key is not yet enabled, the fast path does a single JMP.

When the static_key is enabled, the JMP destination is patched to reach
the real encap_type/encap_rcv logic, possibly adding cache misses.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: dev@openvswitch.org
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Mar, 2012 1 commit

BUG: headers with BUG/BUG_ON etc. need linux/bug.h · 187f1882

Committed by Paul Gortmaker on Nov 23, 2011

If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
other BUG variant in a static inline (i.e. not in a #define) then
that header really should be including <linux/bug.h> and not just
expecting it to be implicitly present.

We can make this change risk-free, since if the files using these
headers didn't have exposure to linux/bug.h already, they would have
been causing compile failures/warnings.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

12 Dec, 2011 1 commit

net: use IS_ENABLED(CONFIG_IPV6) · dfd56b8b

Committed by Eric Dumazet on Dec 10, 2011

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
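
To show how a single macro can stand in for the "built in or built as a module" test, here is a self-contained preprocessor sketch. Every DEMO_/demo_ name and both CONFIG_DEMO_* options are invented for this example, and the construction is only one possible way to build such a macro; it is not presented as the definition the kernel used when this commit was made.

```c
/* Expands to "0," only when pasted onto the token 1. */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define demo_take_second(ignored, val, ...) val

/* 1 if `x` is a macro defined to 1, 0 if `x` is not defined at all. */
#define demo_is_defined(x) demo_is_defined_(x)
#define demo_is_defined_(val) demo_is_defined__(DEMO_ARG_PLACEHOLDER_##val)
#define demo_is_defined__(arg1_or_junk) \
    demo_take_second(arg1_or_junk 1, 0, 0)

/* True when the option is built in (OPTION) or modular (OPTION_MODULE). */
#define DEMO_IS_ENABLED(option) \
    (demo_is_defined(option) || demo_is_defined(option##_MODULE))

/* Pretend the hypothetical option CONFIG_DEMO_IPV6 is built as a module. */
#define CONFIG_DEMO_IPV6_MODULE 1

int demo_ipv6_enabled(void)
{
    return DEMO_IS_ENABLED(CONFIG_DEMO_IPV6);
}

/* CONFIG_DEMO_FOO and CONFIG_DEMO_FOO_MODULE are both left undefined. */
int demo_foo_enabled(void)
{
    return DEMO_IS_ENABLED(CONFIG_DEMO_FOO);
}
```

The result is an ordinary constant expression, so it can be used both in #if directives and in regular C conditions.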

10 Dec, 2011 1 commit

udp: Export code sk lookup routines · fce82338

Committed by Pavel Emelyanov on Dec 09, 2011

The UDP diag get_exact handler will require them to find a
socket by provided net, [sd]addr-s, [sd]ports and device.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17 Nov, 2011 1 commit

net: introduce and use netdev_features_t for device features sets · c8f44aff

Committed by Michał Mirosław on Nov 15, 2011

v2:	add couple missing conversions in drivers
	split unexporting netdev_fix_features()
	implemented %pNF
	convert sock::sk_route_(no?)caps
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Nov, 2011 1 commit

net: make the tcp and udp file_operations for the /proc stuff const · 73cb88ec

Committed by Arjan van de Ven on Oct 30, 2011

the tcp and udp code creates a set of struct file_operations at runtime
while it can also be done at compile time, with the added benefit of then
having these file operations be const.

the trickiest part was to get the "THIS_MODULE" reference right; the naive
method of declaring a struct in the place of registration would not work
for this reason.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Mar, 2011 1 commit

udp: Switch to ip_finish_skb · f6b9664f

Committed by Herbert Xu on Mar 01, 2011

This patch converts UDP to use the new ip_finish_skb API.  This
would then allow us to more easily use ip_make_skb, which allows
UDP to run without a socket lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Jan, 2011 1 commit

net: change netdev->features to u32 · 04ed3e74

Committed by Michał Mirosław on Jan 24, 2011

Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurrences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
  struct netdev, as per Eric Dumazet's suggestion -DaveM ]
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

11 Nov, 2010 1 commit

net: avoid limits overflow · 8d987e5c

Committed by Eric Dumazet on Nov 09, 2010

Robin Holt tried to boot a 16TB machine and found some limits were
reached: sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch the infrastructure to use "long" instead of "int", now
that atomic_long_t primitives are available for free.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Robin Holt <holt@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
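
A rough feel for why an int is too small on such a machine can be had from the arithmetic below. The commit does not say which computation overflowed, so this sketch only checks orders of magnitude under the assumption that memory is counted in 4 KiB pages; mem_pages() and fits_in_int() are hypothetical helpers.

```c
#include <limits.h>

/* Number of pages needed to cover `bytes`, for a given page size. */
long long mem_pages(long long bytes, long long page_size)
{
    return bytes / page_size;
}

/* Non-zero when `v` can be stored in a plain int without overflow. */
int fits_in_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

With 4 KiB pages, 16 TiB is 2^32 pages, one bit more than a 32-bit int can hold, while 1 TiB (2^28 pages) still fits.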

openeuler / Kernel synced successfully more than 1 year ago
