1. 24 Aug 2014: 2 commits
    • net: use reciprocal_scale() helper · 8fc54f68
      Committed by Daniel Borkmann
      Replace open-coded instances of (((u64) <x> * <y>) >> 32) with reciprocal_scale().
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
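      For reference, the helper maps a full-range 32-bit value into [0, ep_ro)
      with a multiply and shift instead of a division. A minimal sketch of its
      shape, matching the open-coded pattern the commit replaces:

          /* (val * ep_ro) / 2^32: scales val from [0, 2^32) to [0, ep_ro) */
          static inline u32 reciprocal_scale(u32 val, u32 ep_ro)
          {
                  return (u32)(((u64) val * ep_ro) >> 32);
          }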
    • net: Allow raw buffers to be passed into the flow dissector. · 690e36e7
      Committed by David S. Miller
      Drivers, and perhaps other entities we have not yet considered,
      sometimes want to know how deep the protocol headers go before
      deciding how large of an SKB to allocate and how much of the packet to
      place into the linear SKB area.
      
      For example, consider a driver which has a device which DMAs into
      pools of pages and then tells the driver where the data went in the
      DMA descriptor(s).  The driver can then build an SKB and reference
      most of the data via SKB fragments (which are page/offset/length
      triplets).
      
      However at least some of the front of the packet should be placed into
      the linear SKB area, which comes before the fragments, so that packet
      processing can get at the headers efficiently.  The first thing each
      protocol layer is going to do is a "pskb_may_pull()" so we might as
      well aggregate as much of this as possible while we're building the
      SKB in the driver.
      
      Part of supporting this is that we don't have an SKB yet, so we want
      to be able to let the flow dissector operate on a raw buffer in order
      to compute the offset of the end of the headers.
      
      So now we have a __skb_flow_dissect() which takes an explicit data
      pointer and length.
      Signed-off-by: David S. Miller <davem@davemloft.net>
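      Roughly, the refactor makes the skb-based entry point a thin wrapper
      around a core that takes the buffer explicitly. A sketch of the shape
      (the parameter list here is illustrative and may differ from the tree):

          /* Core dissector: works on an explicit data pointer and length. */
          bool __skb_flow_dissect(const struct sk_buff *skb,
                                  struct flow_keys *flow,
                                  void *data, __be16 proto, int nhoff, int hlen);

          static inline bool skb_flow_dissect(const struct sk_buff *skb,
                                              struct flow_keys *flow)
          {
                  /* NULL data asks the core to fall back to skb->data,
                   * skb->protocol and skb_headlen(). */
                  return __skb_flow_dissect(skb, flow, NULL, 0, 0, 0);
          }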
  2. 08 Jul 2014: 4 commits
  3. 24 Jun 2014: 1 commit
  4. 27 Mar 2014: 1 commit
  5. 13 Mar 2014: 1 commit
  6. 17 Feb 2014: 2 commits
  7. 11 Jan 2014: 1 commit
    • net: core: explicitly select a txq before doing l2 forwarding · f663dd9a
      Committed by Jason Wang
      Currently, the tx queue is selected implicitly in ndo_dfwd_start_xmit(). This
      causes several issues:
      
      - NETIF_F_LLTX was removed for macvlan, so the txq lock was taken for macvlan
        instead of the lower device, which misses the necessary txq synchronization
        for the lower device, such as txq stopping or freezing required by the dev
        watchdog or control path.
      - dev_hard_start_xmit() was called with a NULL txq, which bypasses the net
        device watchdog.
      - dev_hard_start_xmit() does not check txq everywhere, which will lead to a
        crash when tso is disabled for the lower device.
      
      Fix this by explicitly introducing a new parameter to .ndo_select_queue() for
      selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was
      also extended to accept this parameter, and dev_queue_xmit_accel() is used to
      do the l2 forwarding transmission.
      
      With these fixes, NETIF_F_LLTX can be preserved for macvlan and there's no
      need to check txq against NULL in dev_hard_start_xmit(). There's also no need
      to keep a dedicated ndo_dfwd_start_xmit(); we can just reuse the code of
      dev_queue_xmit() to do the transmission.
      
      This will also be needed by future macvtap l2 forwarding support, since it
      provides a necessary synchronization method.
      
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: e1000-devel@lists.sourceforge.net
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
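      The resulting hook shape looks roughly like this (a sketch; accel_priv
      carries the l2 forwarding context per the description above, and the
      exact prototypes in the tree may differ):

          /* .ndo_select_queue() gains a parameter for the offload case. */
          u16 (*ndo_select_queue)(struct net_device *dev, struct sk_buff *skb,
                                  void *accel_priv);

          struct netdev_queue *netdev_pick_tx(struct net_device *dev,
                                              struct sk_buff *skb,
                                              void *accel_priv);

          /* Like dev_queue_xmit(), but on behalf of an l2-forwarding slave. */
          int dev_queue_xmit_accel(struct sk_buff *skb, void *accel_priv);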
  8. 18 Dec 2013: 1 commit
  9. 09 Nov 2013: 1 commit
  10. 02 Nov 2013: 1 commit
  11. 26 Oct 2013: 1 commit
  12. 04 Oct 2013: 1 commit
  13. 01 Oct 2013: 1 commit
  14. 12 Sep 2013: 1 commit
    • net: fix multiqueue selection · 50d1784e
      Committed by Eric Dumazet
      commit 416186fb ("net: Split core bits of netdev_pick_tx
      into __netdev_pick_tx") added a bug that disables caching of the
      queue index in the socket.
      
      This is the source of packet reordering for TCP flows, and again
      it happens more often when using FQ pacing.
      
      The old code was doing:
      
      if (queue_index != old_index)
      	sk_tx_queue_set(sk, queue_index);
      
      Alexander renamed the variables but forgot to change the 2nd parameter
      of sk_tx_queue_set():
      
      if (queue_index != new_index)
      	sk_tx_queue_set(sk, queue_index);
      
      This means we store -1 over and over in sk->sk_tx_queue_mapping.
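      The fix is simply to pass the renamed variable:

          if (queue_index != new_index)
                  sk_tx_queue_set(sk, new_index);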
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  15. 31 Aug 2013: 1 commit
    • net: revert 8728c544 ("net: dev_pick_tx() fix") · 702821f4
      Committed by Eric Dumazet
      commit 8728c544 ("net: dev_pick_tx() fix") and commit
      b6fe83e9 ("bonding: refine IFF_XMIT_DST_RELEASE capability")
      are quite incompatible: queue selection is disabled because the skb
      dst was dropped before entering the bonding device.
      
      This causes a major performance regression, mainly because TCP packets
      for a given flow can be sent to multiple queues.
      
      This is particularly visible when using the new FQ packet scheduler
      with MQ + FQ setup on the slaves.
      
      We can safely revert the first commit now that 416186fb
      ("net: Split core bits of netdev_pick_tx into __netdev_pick_tx")
      properly caps the queue_index.
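      The capping means the cached index is only trusted while it is still
      valid for the device; a sketch of that guard in __netdev_pick_tx()
      (trimmed, details may differ from the tree):

          int queue_index = sk_tx_queue_get(sk);

          /* Re-pick whenever the cached index is absent or stale. */
          if (queue_index < 0 || skb->ooo_okay ||
              queue_index >= dev->real_num_tx_queues) {
                  int new_index = get_xps_queue(dev, skb);

                  if (new_index < 0)
                          new_index = skb_tx_hash(dev, skb);
                  queue_index = new_index;
          }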
      Reported-by: Xi Wang <xii@google.com>
      Diagnosed-by: Xi Wang <xii@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Alexander Duyck <alexander.h.duyck@intel.com>
      Cc: Denys Fedorysychenko <nuclearcat@nuclearcat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  16. 10 Aug 2013: 1 commit
  17. 31 Jul 2013: 2 commits
  18. 21 Mar 2013: 2 commits
  19. 12 Mar 2013: 1 commit
  20. 22 Jan 2013: 1 commit
  21. 19 Jul 2012: 1 commit
    • ipv6: add ipv6_addr_hash() helper · ddbe5032
      Committed by Eric Dumazet
      Introduce an ipv6_addr_hash() helper doing an XOR on all bits
      of an IPv6 address, with an optimized x86_64 version.
      
      Use it in the flow dissector, as suggested by Andrew McGregor,
      to reduce hash collision probabilities in fq_codel (and other
      users of the flow dissector).
      
      Use it in ip6_tunnel.c with more bit shuffling, as suggested by
      David Laight, as the existing hash ignored most of the address bits.
      
      Use it in sunrpc, with more bit shuffling via hash_32().
      
      Use it in net/ipv6/addrconf.c, using hash_32() as well.
      
      As a cleanup, use it in net/ipv4/tcp_metrics.c
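      The helper folds the 128-bit address down to 32 bits by XOR. A sketch
      of its shape, including the word-at-a-time 64-bit variant the
      changelog mentions (details may differ from the tree):

          static inline u32 ipv6_addr_hash(const struct in6_addr *a)
          {
          #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && BITS_PER_LONG == 64
                  /* Two 64-bit loads, then fold the high half down. */
                  const unsigned long *ul = (const unsigned long *)a;
                  unsigned long x = ul[0] ^ ul[1];

                  return (u32)(x ^ (x >> 32));
          #else
                  return (__force u32)(a->s6_addr32[0] ^ a->s6_addr32[1] ^
                                       a->s6_addr32[2] ^ a->s6_addr32[3]);
          #endif
          }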
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrew McGregor <andrewmcgr@gmail.com>
      Cc: Dave Taht <dave.taht@gmail.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: David Laight <David.Laight@ACULAB.COM>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  22. 25 Jan 2012: 1 commit
  23. 30 Nov 2011: 1 commit
    • flow_dissector: use a 64bit load/store · 4d77d2b5
      Committed by Eric Dumazet
      On Monday, 28 November 2011 at 19:06 -0500, David Miller wrote:
      > From: Dimitris Michailidis <dm@chelsio.com>
      > Date: Mon, 28 Nov 2011 08:25:39 -0800
      >
      > >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
      > >> *flow)
      > >> +{
      > >> +	int poff, nhoff = skb_network_offset(skb);
      > >> +	u8 ip_proto;
      > >> +	u16 proto = skb->protocol;
      > >
      > > __be16 instead of u16 for proto?
      >
      > I'll take care of this when I apply these patches.
      
      (CC trimmed)
      
      Thanks, David!
      
      Here is a small patch to use one 64bit load/store on x86_64 instead of
      two 32bit load/stores.
      
      [PATCH net-next] flow_dissector: use a 64bit load/store
      
      The gcc compiler is smart enough to use a single load/store if we
      memcpy(dptr, sptr, 8) on x86_64, regardless of
      CONFIG_CC_OPTIMIZE_FOR_SIZE.
      
      In the IP header, daddr immediately follows saddr; this won't change
      in the future. We only need to make sure our flow_keys (src, dst)
      fields won't break that rule.
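      The change itself is essentially one grouped copy; a sketch of the
      IPv4 case:

          /* saddr and daddr are adjacent in struct iphdr, and flow_keys
           * keeps src and dst adjacent too, so this 8-byte memcpy becomes
           * a single 64-bit load/store on x86_64. */
          memcpy(&flow->src, &iph->saddr, 8);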
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  24. 29 Nov 2011: 1 commit