Commits · 0cd2950357e31a96be03b531b4b11fe1df812c9f · openanolis / cloud-kernel

18 May, 2017 · 40 commits

net: make struct net_device::tx_queue_len unsigned int · 0cd29503

Authored by Alexey Dobriyan on May 17, 2017

A queue of 4 billion packets is unthinkable, so use a 32-bit value
for now.

Space savings on x86_64:

	add/remove: 0/0 grow/shrink: 3/70 up/down: 16/-131 (-115)
	function                                     old     new   delta
	change_tx_queue_len                           94     108     +14
	qdisc_create                                1176    1177      +1
	alloc_netdev_mqs                            1124    1125      +1
	xenvif_alloc                                 533     532      -1
	x25_asy_setup                                167     166      -1
			...
	tun_queue_resize                             945     940      -5
	pfifo_fast_enqueue                           167     162      -5
	qfq_init_qdisc                               168     158     -10
	tap_queue_resize                             810     799     -11
	transmit                                     719     698     -21
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

udp: make function udp_skb_dtor_locked static · 64f5102d

Authored by Colin Ian King on May 17, 2017

Function udp_skb_dtor_locked does not need to be in global scope,
so make it static to fix this sparse warning:

net/ipv4/udp.c: warning: symbol 'udp_skb_dtor_locked' was not
declared. Should it be static?

Fixes: 6dfb4367 ("udp: keep the sk_receive_queue held when splicing")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'vhost_net-rx-batch-dequeuing' · f646c75b

Authored by David S. Miller on May 18, 2017

Jason Wang says:

====================
vhost_net rx batch dequeuing

This series implements rx batching for vhost-net. It batches the
dequeuing from the skb_array exported by the underlying socket and
passes the skbs back through msg_control to finish the userspace
copy. This is also a prerequisite for further batching on the rx
path.

Tests show at most a 7.56% improvement in rx pps on top of batch
zeroing, and no obvious change in TCP_STREAM/TCP_RR results.

Please review.

Thanks

Changes from V4:
- drop batch zeroing patch
- renew the performance numbers
- move skb pointer array out of vhost_net structure

Changes from V3:
- add batch zeroing patch to fix the build warnings

Changes from V2:
- rebase to net-next HEAD
- use unconsume helpers to put skb back on releasing
- introduce and use vhost_net internal buffer helpers
- renew performance numbers on top of batch zeroing

Changes from V1:
- switch to using for() in __ptr_ring_consume_batched()
- rename peek_head_len_batched() to fetch_skbs()
- use skb_array_consume_batched() instead of
  skb_array_consume_batched_bh() since no consumer runs in bh
- drop the lockless peeking patch: skb_array could be resized, so
  it's not safe to call the lockless variant
====================
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vhost_net: try batch dequing from skb array · c67df11f

Authored by Jason Wang on May 17, 2017

We used to dequeue one skb from the skb_array per recvmsg() call. This
can be inefficient because of poor cache utilization and because the
spinlock is taken for each packet. This patch batches the work by
calling the batch dequeuing helpers explicitly on the exported skb
array and passing each skb back through msg_control so the underlying
socket can finish the userspace copy. Batch dequeuing is also a
prerequisite for further batching improvements on the receive path.

Tests were done by pktgen on tap with XDP1 in guest. Host is Intel(R)
Xeon(R) CPU E5-2650 0 @ 2.00GHz.

rx batch | pps
---------+------------------
0        | 2.25Mpps
1        | 2.33Mpps (+3.56%)
4        | 2.33Mpps (+3.56%)
16       | 2.35Mpps (+4.44%)
64       | 2.42Mpps (+7.56%) <- Default rx batching
128      | 2.40Mpps (+6.67%)
256      | 2.38Mpps (+5.78%)
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
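
The batching idea in this commit message can be illustrated with a small userspace sketch. This is not the vhost_net code: `toy_source`, `toy_rx_buf` and `toy_rx_next()` are hypothetical names, and the source array merely stands in for the exported skb array. The consumer keeps a local buffer, refills it with one batched call when it runs dry, and otherwise hands out packets without touching the shared source.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RX_BATCH 64	/* batch size marked as the default in the commit message */

/* Hypothetical packet source standing in for the exported skb array. */
struct toy_source {
	void **pkts;
	int count;		/* packets available in total */
	int next;		/* next packet to hand out */
	int batch_calls;	/* how many times the source was touched */
};

/* Copy up to n packets into array; returns how many were copied. */
int toy_source_consume_batched(struct toy_source *s, void **array, int n)
{
	int i;

	s->batch_calls++;
	for (i = 0; i < n && s->next < s->count; i++)
		array[i] = s->pkts[s->next++];
	return i;
}

/* Consumer-side buffer: refilled in batches, drained one packet at a time. */
struct toy_rx_buf {
	void *queue[TOY_RX_BATCH];
	int head;	/* next buffered packet to return */
	int tail;	/* one past the last buffered packet */
};

/* Return the next packet, going back to the source only when empty. */
void *toy_rx_next(struct toy_rx_buf *b, struct toy_source *s)
{
	assert(b->head <= b->tail);
	if (b->head == b->tail) {
		b->head = 0;
		b->tail = toy_source_consume_batched(s, b->queue, TOY_RX_BATCH);
		if (b->tail == 0)
			return NULL;
	}
	return b->queue[b->head++];
}
```

With 100 packets queued, draining them through `toy_rx_next()` touches the source twice (64 packets, then 36) instead of 100 times, which is the kind of saving on per-packet locking and cache traffic that the commit message describes.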

tap: support receiving skb from msg_control · 3b4ba04a

Authored by Jason Wang on May 17, 2017

This patch lets tap_recvmsg() receive an skb from its caller
through msg_control. vhost_net will be the first user.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: support receiving skb through msg_control · ac77cfd4

Authored by Jason Wang on May 17, 2017

This patch lets tun_recvmsg() receive an skb from its caller
through msg_control. vhost_net will be the first user.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tap: export skb_array · 49f96fd0

Authored by Jason Wang on May 17, 2017

This patch exports the skb_array through tap_get_skb_array(). The
caller can then manipulate the skb array directly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: export skb_array · 83339c6b

Authored by Jason Wang on May 17, 2017

This patch exports the skb_array through tun_get_skb_array(). The
caller can then manipulate the skb array directly.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

skb_array: introduce batch dequeuing · 3528c1a5

Authored by Jason Wang on May 17, 2017

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptr_ring: introduce batch dequeuing · 728fc8d5

Authored by Jason Wang on May 17, 2017

This patch introduces a batched version of consuming: a consumer can
dequeue more than one pointer from the ring at a time. We don't care
about reordering of the reads here, so no compiler barrier is needed.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
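
As a rough illustration of batched consuming, here is a single-threaded userspace sketch. It is not the kernel's ptr_ring: `toy_ring` and its functions are hypothetical, locking is omitted, and the only property borrowed from the description is that a consumer pulls several pointers in one call. The sketch assumes a NULL slot marks an empty entry.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RING_SIZE 8	/* hypothetical capacity for this sketch */

/* Toy single-threaded ring; a NULL slot is assumed to mean "empty". */
struct toy_ring {
	void *queue[TOY_RING_SIZE];
	int producer;	/* next slot to fill */
	int consumer;	/* next slot to drain */
};

/* Returns 0 on success, -1 if the ring is full. */
int toy_ring_produce(struct toy_ring *r, void *ptr)
{
	assert(ptr != NULL);	/* NULL is reserved to mark empty slots */
	if (r->queue[r->producer])
		return -1;
	r->queue[r->producer] = ptr;
	r->producer = (r->producer + 1) % TOY_RING_SIZE;
	return 0;
}

/* Consume one entry, or return NULL if the ring is empty. */
void *toy_ring_consume(struct toy_ring *r)
{
	void *ptr = r->queue[r->consumer];

	if (ptr) {
		r->queue[r->consumer] = NULL;
		r->consumer = (r->consumer + 1) % TOY_RING_SIZE;
	}
	return ptr;
}

/* Dequeue up to n entries into array; returns how many were dequeued. */
int toy_ring_consume_batched(struct toy_ring *r, void **array, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		void *ptr = toy_ring_consume(r);

		if (!ptr)
			break;
		array[i] = ptr;
	}
	return i;
}
```

In a locked ring, the benefit of this shape is that the lock can be taken once around the whole loop rather than once per entry.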

skb_array: introduce skb_array_unconsume · 3acb6960

Authored by Jason Wang on May 17, 2017

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptr_ring: add ptr_ring_unconsume · 197a5212

Authored by Michael S. Tsirkin on May 17, 2017

Applications that consume a batch of entries in one go
can benefit from the ability to return some of them to
the ring.

Add an API for that, assuming there's space. If there's no space we
naturally can't do this and have to drop entries, but this implies the
ring is full, so we'd likely drop some anyway.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
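
The "return entries if there's space, otherwise drop them" behaviour can be sketched in userspace as follows. This is an illustrative toy, not the kernel function: `toy_back_ring`, `toy_ring_unconsume()` and the counting drop callback are hypothetical, and the sketch assumes a NULL slot means "empty". Entries are put back in front of the consumer index so that they are consumed again in their original order.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BACK_RING_SIZE 4	/* hypothetical capacity for this sketch */

struct toy_back_ring {
	void *queue[TOY_BACK_RING_SIZE];	/* NULL marks an empty slot */
	int consumer;				/* next slot to drain */
};

/* Example drop callback: only counts how many entries were dropped. */
int toy_dropped;

void toy_count_drop(void *ptr)
{
	(void)ptr;
	toy_dropped++;
}

/*
 * Put batch[0..n-1] back in front of the consumer. Entries that do not
 * fit are handed to drop(). Returns how many entries went back in.
 */
int toy_ring_unconsume(struct toy_back_ring *r, void **batch, int n,
		       void (*drop)(void *))
{
	int returned = 0;

	assert(n >= 0);
	/* Walk the batch backwards so the earliest returned entry is at the head. */
	while (n > 0) {
		int prev = (r->consumer + TOY_BACK_RING_SIZE - 1) %
			   TOY_BACK_RING_SIZE;

		if (r->queue[prev])	/* no free slot behind the head */
			break;
		r->queue[prev] = batch[--n];
		r->consumer = prev;
		returned++;
	}
	/* No space left: the remaining entries have to be dropped. */
	while (n > 0)
		drop(batch[--n]);
	return returned;
}
```

Note the design consequence in this sketch: when space runs out, it is the entries at the front of the batch that get dropped, because the batch is walked from its end.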

Merge branch 'phy-marvell-cleanups' · 1fc4d180

Authored by David S. Miller on May 17, 2017

Andrew Lunn says:

====================
net: phy: marvell: Checkpatch cleanup

I will be contributing a few new features to the Marvell PHY driver
soon. Start by making the code mostly checkpatch clean. There should
not be any functional changes: just comments put into the correct
format, missing blank lines added, some comparisons turned around, and
refactoring to reduce indentation depth.

There is still one camel in the code, but it actually makes sense, so
leave it in peace.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell: checkpatch - Fix remaining long lines · 23beb38f

Authored by Andrew Lunn on May 17, 2017

Fold lines longer than 80 characters.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell: Add helpers to get/set page · 6427bb2d

Authored by Andrew Lunn on May 17, 2017

Makes the code a bit more readable, and solves quite a few checkpatch
warnings of lines longer than 80 characters.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell: Refactor some bigger functions · e1dde8dc

Authored by Andrew Lunn on May 17, 2017

Break big functions up by using a number of smaller helper
functions. This resolves some of the "line over 80 characters"
warnings by reducing the indentation level.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell: Checkpatch - assignments and comparisons · 4f48ed32

Authored by Andrew Lunn on May 17, 2017

Avoid multiple assignments.
Comparisons should place the constant on the right side of the test.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: marvell: Checkpatch - Missing or extra blank lines · e69d9ed4

Authored by Andrew Lunn on May 17, 2017

Remove the extra blank lines and add one where recommended.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Marvell: checkpatch - Comments · 0c3439bc

Authored by Andrew Lunn on May 17, 2017

Use net style comment blocks, and wrap one block with long lines.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tcp-TCP-TS-option-use-1-ms-clock' · e26925ec

Authored by David S. Miller on May 17, 2017

Eric Dumazet says:

====================
tcp: TCP TS option use 1 ms clock

The TCP Timestamps option is defined in RFC 7323.

Traditionally on Linux, it has been tied to the internal
'jiffy' variable, because that was a cheap and good enough
generator.

Unfortunately, some distros use HZ=250 or even HZ=100, which gives
a 4 ms or 10 ms tick and makes TCP timestamps not very useful.

For TCP flows in the DC, Google has used usec resolution for more
than two years with great success [1].
RCVBUF autotuning is more precise.

This series converts tp->tcp_mstamp to a plain u64 value storing
a 1 usec TCP clock.

This choice will allow us to upstream the 1 usec TS option as
discussed in IETF 97.

Kathleen Nichols [2] and others advocate for 1ms TS clocks for
network analysis. (1ms being the lowest value supported by RFC 7323.)

[1] https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf
[2] http://netseminar.stanford.edu/seminars/02_02_17.pdf
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
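
The relationship between a microsecond clock and a 1 ms timestamp value can be worked through in a short userspace sketch. The names below (`toy_ts_from_usec()`, `TOY_TS_HZ` and so on) are hypothetical and are not the kernel's helpers. The sketch assumes a 64-bit microsecond counter as described above and a 32-bit timestamp value that is allowed to wrap.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_USEC_PER_SEC 1000000ULL
#define TOY_TS_HZ        1000ULL	/* 1 ms timestamp clock */

/* Map a 64-bit microsecond clock to a 32-bit timestamp in 1 ms units. */
uint32_t toy_ts_from_usec(uint64_t usec)
{
	return (uint32_t)(usec / (TOY_USEC_PER_SEC / TOY_TS_HZ));
}

/* Elapsed ms between two timestamps; unsigned subtraction handles wraparound. */
uint32_t toy_ts_delta_ms(uint32_t now, uint32_t then)
{
	return now - then;
}

/* Granularity in ms of a jiffies-based timestamp for a given HZ. */
unsigned int toy_jiffy_granularity_ms(unsigned int hz)
{
	assert(hz > 0 && hz <= 1000);
	return 1000 / hz;
}
```

With these definitions, a jiffies-based timestamp advances once every 4 ms at HZ=250 and once every 10 ms at HZ=100, whereas the value derived from the microsecond clock advances every 1 ms regardless of HZ.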

tcp: switch TCP TS option (RFC 7323) to 1ms clock · 9a568de4