Commits · 594831a8aba3fd045c3212a3e3bb9788c77b989d · openeuler / Kernel

15 Jan, 2018 · 1 commit

net: sch: prio: Add offload ability to PRIO qdisc · 7fdb61b4

Committed by Nogah Frankel on Jan 14, 2018

Add the ability to offload PRIO qdisc by using ndo_setup_tc.
There are three commands for PRIO offloading:
* TC_PRIO_REPLACE: handles set and tune
* TC_PRIO_DESTROY: handles qdisc destroy
* TC_PRIO_STATS: updates the qdisc's counters (passed by reference)

As with the RED qdisc, the indication of whether PRIO is offloaded is
set and updated in the dump function. This is because the driver may
decide whether to offload based on the qdisc parent, which can change
without notifying the qdisc.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
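The three offload commands described above can be pictured as a single driver callback that dispatches on the command. This is a rough user-space sketch only: `PrioCommand`, `ToyDriver` and `setup_tc_prio` are hypothetical names standing in for the TC_PRIO_* commands and a driver's ndo_setup_tc handler, and the counter layout is an assumption.

```python
from enum import Enum, auto


class PrioCommand(Enum):
    REPLACE = auto()  # stands in for TC_PRIO_REPLACE
    DESTROY = auto()  # stands in for TC_PRIO_DESTROY
    STATS = auto()    # stands in for TC_PRIO_STATS


class ToyDriver:
    """Hypothetical driver keeping offloaded PRIO state per qdisc handle."""

    def __init__(self):
        self.offloaded = {}

    def setup_tc_prio(self, command, handle, bands=None, stats=None):
        if command is PrioCommand.REPLACE:
            # Covers both the initial set and later tuning.
            self.offloaded[handle] = {"bands": bands, "tx_packets": 0}
            return True
        if command is PrioCommand.DESTROY:
            return self.offloaded.pop(handle, None) is not None
        if command is PrioCommand.STATS:
            state = self.offloaded.get(handle)
            if state is None:
                return False
            # The counters object is passed by reference and updated in place.
            stats["tx_packets"] += state["tx_packets"]
            return True
        return False
```

A caller would issue REPLACE when the qdisc is created or changed, STATS when counters are dumped, and DESTROY when the qdisc goes away.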

11 Jan, 2018 · 3 commits

net: sch: red: Change the name of the stats struct to be generic · f34b4aac

Committed by Nogah Frankel on Jan 10, 2018

Change the name of the stats struct to be generic, so it can be used for
other qdisc offloads that will be added in the next patches.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Add support for non-equal-cost multipath · 398958ae

Committed by Ido Schimmel on Jan 09, 2018

The use of hash-threshold instead of modulo-N makes it trivial to add
support for non-equal-cost multipath.

Instead of dividing the multipath hash function's output space equally
between the nexthops, each nexthop is assigned a region size which is
proportional to its weight.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
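A minimal user-space sketch of weighted hash-threshold selection as described in this commit message. The 31-bit output space, the function names and the rounding are illustrative assumptions, not the kernel's code.

```python
from bisect import bisect_left

HASH_SPACE = 1 << 31  # assumed size of the hash output space


def upper_bounds(weights):
    """Return each nexthop's inclusive upper bound in the hash space.

    Region sizes are proportional to the (positive) weights; the last
    bound is always HASH_SPACE - 1, so every hash value maps to a nexthop.
    """
    total = sum(weights)
    bounds, running = [], 0
    for weight in weights:
        running += weight
        bounds.append(running * HASH_SPACE // total - 1)
    return bounds


def select_nexthop(bounds, hash_value):
    """Pick the first nexthop whose upper bound is >= hash_value."""
    return bisect_left(bounds, hash_value)
```

With weights 1 and 3, the first nexthop owns the lowest quarter of the hash space and the second owns the rest, so a flow's nexthop only changes if its hash falls in a region whose boundary moved.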

ipv6: Calculate hash thresholds for IPv6 nexthops · d7dedee1

Committed by Ido Schimmel on Jan 09, 2018

Before we convert IPv6 to use hash-threshold instead of modulo-N, we
first need each nexthop to store its region boundary in the hash
function's output space.

The boundary is calculated by dividing the output space equally between
the active nexthops, that is, nexthops that are not dead or linkdown.

The boundaries are rebalanced whenever a nexthop is added to or removed
from a multipath route and whenever a nexthop becomes active or inactive.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
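The rebalancing step described in this commit message can be sketched as follows. The 31-bit output space, the `rebalance` name and the use of -1 as the "never matches" bound for inactive nexthops are assumptions made for illustration.

```python
HASH_SPACE = 1 << 31  # assumed size of the hash output space


def rebalance(active):
    """Compute per-nexthop upper bounds from a list of active flags.

    Inactive nexthops (dead or linkdown) get -1, which no hash value can
    match; active ones share the output space equally.
    """
    total = sum(1 for is_active in active if is_active)
    bounds, seen = [], 0
    for is_active in active:
        if not is_active:
            bounds.append(-1)
            continue
        seen += 1
        bounds.append(seen * HASH_SPACE // total - 1)
    return bounds
```

Calling `rebalance` again after a nexthop is added, removed, or changes state yields the new set of boundaries.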

09 Jan, 2018 · 20 commits

sctp: fix the handling of ICMP Frag Needed for too small MTUs · b6c5734d

Committed by Marcelo Ricardo Leitner on Jan 05, 2018

syzbot reported a hang involving SCTP, in which it kept flooding dmesg
with the message:
[  246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too low, using default minimum of 512

That happened because whenever SCTP hits an ICMP Frag Needed, it tries
to adjust to the new MTU and triggers an immediate retransmission. But
it didn't consider the fact that MTUs smaller than the SCTP minimum MTU
allowed (512) would not cause the PMTU to change, and issued the
retransmission anyway (thus leading to another ICMP Frag Needed, and so
on).

As IPv4 (ip_rt_min_pmtu=556) and IPv6 (IPV6_MIN_MTU=1280) minimum MTU
are higher than that, sctp_transport_update_pmtu() is changed to
re-fetch the PMTU that got set after our request, and with that, detect
if there was an actual change or not.

The fix, thus, skips the immediate retransmission if the received ICMP
resulted in no change, in the hope that SCTP will select another path.

Note: The value being used for the minimum MTU (512,
SCTP_DEFAULT_MINSEGMENT) is not right and instead it should be (576,
SCTP_MIN_PMTU), but such change belongs to another patch.

Changes from v1:
- do not disable PMTU discovery, in the light of commit
  06ad3919 ("[SCTP] Don't disable PMTU discovery when mtu is small")
  and as suggested by Xin Long.
- changed the way to break the rtx loop by detecting if the icmp
  resulted in a change or not
Changes from v2:
none

See-also: https://lkml.org/lkml/2017/12/22/811
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
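The decision described in this commit message (re-read the PMTU after the update and retransmit immediately only if it really changed) can be modelled in a few lines. This is a simplified sketch: `handle_frag_needed` and `route_min_mtu` are hypothetical names, and the clamping is a stand-in for what the IP layer does.

```python
SCTP_MIN_MTU = 512  # minimum quoted in the commit message


def handle_frag_needed(current_pmtu, reported_mtu, route_min_mtu):
    """Return (new_pmtu, retransmit_now) for one ICMP Frag Needed.

    route_min_mtu stands in for the floor enforced by the IP layer. The
    PMTU is re-read after that clamping, and an immediate retransmission
    is requested only when the value actually changed.
    """
    requested = max(reported_mtu, SCTP_MIN_MTU)
    new_pmtu = max(requested, route_min_mtu)
    return new_pmtu, new_pmtu != current_pmtu
```

When the path is already at the floor and an ICMP reports 508, the PMTU stays put and no retransmission is triggered, which is what breaks the loop in this model.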

net: ipv6: Allow connect to linklocal address from socket bound to vrf · 54dc3e33

Committed by David Ahern on Jan 04, 2018

Allow a process bound to a VRF to connect to a linklocal address.
Currently, this fails because of a mismatch between the scope of the
linklocal address and the sk_bound_dev_if inherited by the VRF binding:
    $ ssh -6 fe80::70b8:cff:fedd:ead8%eth1
    ssh: connect to host fe80::70b8:cff:fedd:ead8%eth1 port 22: Invalid argument

Relax the scope check to allow the socket to be bound to the same L3
device as the scope id.

This makes ipv6 linklocal consistent with other relaxed checks enabled
by commits 1ff23bee ("net: l3mdev: Allow send on enslaved interface")
and 7bb387c5 ("net: Allow IP_MULTICAST_IF to set index to L3 slave").
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
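The relaxed scope check can be illustrated with a small model. Everything here is an assumption for the sake of the example: `scope_check_ok` is a hypothetical helper, interface indexes are plain integers, and `l3_master` is a dictionary mapping an interface index to the index of its L3 master (VRF) device.

```python
def scope_check_ok(bound_dev, scope_dev, l3_master):
    """Return True if a socket bound to bound_dev may use scope_dev.

    bound_dev == 0 means the socket is not bound to any device.
    """
    if bound_dev == 0 or bound_dev == scope_dev:
        return True
    # Relaxed check: the socket is bound to the L3 master of the scope device.
    return l3_master.get(scope_dev) == bound_dev
```

With eth1 (index 2) enslaved to a VRF (index 10), a socket bound to the VRF passes the check for a link-local address scoped to eth1, while an unrelated interface still fails.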

netfilter: flow table support for the mixed IPv4/IPv6 family · 7c23b629

Committed by Pablo Neira Ayuso on Jan 07, 2018

This patch adds the IPv6 flow table type, that implements the datapath
flow table to forward IPv6 traffic.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: flow table support for IPv6 · 09952107

Committed by Pablo Neira Ayuso on Jan 07, 2018

This patch adds the IPv6 flow table type, that implements the datapath
flow table to forward IPv6 traffic.

This patch exports ip6_dst_mtu_forward() that is required to check for
mtu to pass up packets that need PMTUD handling to the classic
forwarding path.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: add generic flow table infrastructure · ac2a6666

Committed by Pablo Neira Ayuso on Jan 07, 2018

This patch defines the API to interact with flow tables, which allows
adding, deleting and looking up entries in the flow table. It also adds
the generic garbage collection code that removes entries that have
expired, i.e. no traffic has been seen for a while.

Users of the flow table infrastructure can delete entries via
flow_offload_dead(), which sets the dying bit. This signals the garbage
collector to release the entry from user context.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
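The add/delete/lookup API and the garbage collector with a dying bit can be sketched as a toy table. This is not the nf_flow_table implementation: the class, the timeout value, the injectable clock, and the choice to hide dying entries from lookups are all assumptions of the sketch.

```python
import time


class FlowTable:
    """Toy flow table keyed by an opaque flow key (hypothetical)."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.entries = {}  # key -> {"last_seen": float, "dying": bool}

    def add(self, key):
        self.entries[key] = {"last_seen": self.clock(), "dying": False}

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None or entry["dying"]:
            return None
        entry["last_seen"] = self.clock()  # traffic refreshes the entry
        return entry

    def mark_dead(self, key):
        """Set the dying bit; the entry is released by the next gc() run."""
        if key in self.entries:
            self.entries[key]["dying"] = True

    def gc(self):
        """Drop entries that are dying or idle for longer than the timeout."""
        now = self.clock()
        stale = [key for key, entry in self.entries.items()
                 if entry["dying"] or now - entry["last_seen"] > self.timeout]
        for key in stale:
            del self.entries[key]
        return len(stale)
```

Marking an entry dead does not free it directly; removal is deferred to the collector, which mirrors the division of labour the commit message describes.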

netfilter: nf_tables: add flow table netlink frontend · 3b49e2e9

Committed by Pablo Neira Ayuso on Jan 07, 2018

This patch introduces a netlink control plane to create, delete and dump
flow tables. Flow tables are identified by name, and rules use this name
to refer to a specific flow table. Flow tables use the rhashtable
class and a generic garbage collector to remove expired entries.

This also adds the infrastructure to add different flow table types, so
we can add one for each layer 3 protocol family.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: remove nft_dereference() · 0befd061

Committed by Pablo Neira Ayuso on Jan 02, 2018

This macro is unnecessary, it just hides details for one single caller.
nfnl_dereference() is just enough.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: connlimit: split xt_connlimit into front and backend · 625c5561

Committed by Florian Westphal on Dec 09, 2017

This allows reusing the xt_connlimit infrastructure from nf_tables.
The upcoming nf_tables frontend can just pass in an nftables register
as the input key, which allows limiting by any nft-supported key,
including concatenations.

For xt_connlimit, pass in the zone and the ip/ipv6 address.

With help from Yi-Hung Wei.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
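The front/backend split can be pictured as a counting backend that accepts any opaque key, with each frontend deciding how to build that key. The names below (`ConnCounter`, `xt_style_key`, `nft_style_key`, `over_limit`) are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


class ConnCounter:
    """Backend: tracks live connection ids per opaque key."""

    def __init__(self):
        self.conns = defaultdict(set)

    def add(self, key, conn_id):
        self.conns[key].add(conn_id)
        return len(self.conns[key])

    def remove(self, key, conn_id):
        self.conns[key].discard(conn_id)


def xt_style_key(zone, address):
    """Frontend 1: key built from the zone and the IP address."""
    return (zone, address)


def nft_style_key(*registers):
    """Frontend 2: key built from any concatenation of register values."""
    return tuple(registers)


def over_limit(counter, key, conn_id, limit):
    """Record the connection and report whether the key exceeds the limit."""
    return counter.add(key, conn_id) > limit
```

Because the backend never inspects the key, limiting by address, by address and port, or by any other combination needs no backend changes.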

netfilter: nf_tables: remove hooks from family definition · c2f9eafe

Committed by Pablo Neira Ayuso on Dec 09, 2017

They don't belong to the family definition, move them to the filter
chain type definition instead.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: remove multihook chains and families · c974a3a3

Committed by Pablo Neira Ayuso on Dec 09, 2017

Since NFPROTO_INET is handled from the core, we don't need to maintain
extra infrastructure in nf_tables to handle the double hook
registration, one for IPv4 and another for IPv6.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables_inet: don't use multihook infrastructure anymore · 12355d36

Committed by Pablo Neira Ayuso on Dec 09, 2017

Use new native NFPROTO_INET support in netfilter core, this gets rid of
ad-hoc code in the nf_tables API codebase.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: nf_tables: add nft_set_is_anonymous() helper · 408070d6

Committed by Pablo Neira Ayuso on Nov 24, 2017

Add helper function to test for the NFT_SET_ANONYMOUS flag.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
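A helper of this kind is just a named bit test. The sketch below shows the pattern; the numeric flag values and the function name are assumptions for illustration and are not taken from this commit.

```python
NFT_SET_ANONYMOUS = 0x1  # assumed flag value
NFT_SET_CONSTANT = 0x2   # assumed flag value


def set_is_anonymous(flags):
    """Return True if the anonymous bit is set in a set's flags."""
    return bool(flags & NFT_SET_ANONYMOUS)
```

Wrapping the test in a helper keeps call sites readable and avoids repeating the mask expression.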

netfilter: nf_tables: explicit nft_set_pktinfo() call from hook path · 7a4473a3

Committed by Pablo Neira Ayuso on Dec 10, 2017

Call nft_set_pktinfo() explicitly from the hook path instead of from
the family-specific variant. This reduces the code size in the fast
path for the netdev, bridge and inet families. After this change, we
must call nft_set_pktinfo() upfront from the chain hook indirection.

Before:

   text    data     bss     dec     hex filename
   2145     208       0    2353     931 net/netfilter/nf_tables_netdev.o

After:

   text    data     bss     dec     hex filename
   2125     208       0    2333     91d net/netfilter/nf_tables_netdev.o
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: don't allocate space for arp/bridge hooks unless needed · 2a95183a

Committed by Florian Westphal on Dec 07, 2017

No need to define hook points if the family isn't supported.
Because we need these hooks for either nftables, arp/ebtables
or the 'call-iptables' hack we have in the bridge layer, add two
new dependencies, NETFILTER_FAMILY_{ARP,BRIDGE}, and have the
users select them.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: don't allocate space for decnet hooks unless needed · bb4badf3

Committed by Florian Westphal on Dec 07, 2017

No need to define hook points if the family isn't supported.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: reduce hook array sizes to what is needed · ef57170b

Committed by Florian Westphal on Dec 07, 2017

Not all families share the same hook count, adjust sizes to what is
needed.

struct net before:
/* size: 6592, cachelines: 103, members: 46 */
after:
/* size: 5952, cachelines: 93, members: 46 */
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
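The quoted struct sizes and cacheline counts can be cross-checked with a little arithmetic. The 64-byte cacheline is an assumption; the sizes are the ones given in the commit message.

```python
CACHELINE_BYTES = 64  # assumed cacheline size


def cachelines(size_bytes):
    """Number of cachelines needed to hold size_bytes (rounded up)."""
    return (size_bytes + CACHELINE_BYTES - 1) // CACHELINE_BYTES


before, after = 6592, 5952
saved_bytes = before - after
saved_lines = cachelines(before) - cachelines(after)
```

Under that assumption the change saves 640 bytes, or 10 cachelines, per struct net.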

netfilter: reduce size of hook entry point locations · b0f38338

Committed by Florian Westphal on Dec 03, 2017

struct net contains:

struct nf_hook_entries __rcu *hooks[NFPROTO_NUMPROTO][NF_MAX_HOOKS];

which stores the hook entry point locations for the various protocol
families and the hooks.

Using an array results in compact C code when doing accesses, i.e.
  x = rcu_dereference(net->nf.hooks[pf][hook]);

but it's also wasting a lot of memory, as most families are
not used.

So split the array into those families that are used, which
are only 5 (instead of 13).  In most cases, the 'pf' argument is
constant, i.e. gcc removes switch statement.

struct net before:
 /* size: 5184, cachelines: 81, members: 46 */
after:
 /* size: 4672, cachelines: 73, members: 46 */
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
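The trade-off between one dense two-dimensional array and per-family arrays selected by a switch can be sketched as below. The family names, the choice of which five families get storage, and the value of 8 hook slots per family are assumptions made for the example; only the counts 13 and 5 come from the commit message.

```python
from enum import IntEnum

MAX_HOOKS = 8  # assumed number of hook slots per family


class Family(IntEnum):
    """Hypothetical family identifiers, not the kernel's NFPROTO_* values."""
    IPV4 = 0
    IPV6 = 1
    ARP = 2
    BRIDGE = 3
    DECNET = 4
    OTHER = 5  # stands in for any family without hooks


class DenseHooks:
    """One slot per (family, hook) pair, whether the family is used or not."""

    def __init__(self, num_families):
        self.table = [[None] * MAX_HOOKS for _ in range(num_families)]

    def slots(self):
        return sum(len(row) for row in self.table)


class SplitHooks:
    """A separate array for each family that actually has hooks."""

    def __init__(self):
        self.ipv4 = [None] * MAX_HOOKS
        self.ipv6 = [None] * MAX_HOOKS
        self.arp = [None] * MAX_HOOKS
        self.bridge = [None] * MAX_HOOKS
        self.decnet = [None] * MAX_HOOKS

    def row(self, family):
        """Map a family to its array, playing the role of the switch statement."""
        if family == Family.IPV4:
            return self.ipv4
        if family == Family.IPV6:
            return self.ipv6
        if family == Family.ARP:
            return self.arp
        if family == Family.BRIDGE:
            return self.bridge
        if family == Family.DECNET:
            return self.decnet
        return None  # unsupported family: no storage at all

    def slots(self):
        return sum(len(r) for r in
                   (self.ipv4, self.ipv6, self.arp, self.bridge, self.decnet))
```

Under these assumptions the split layout needs 40 slots instead of 104. At 8 bytes per pointer that is 512 bytes, which is consistent with the struct net sizes quoted in the commit message (5184 versus 4672).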

netfilter: core: remove synchronize_net call if nfqueue is used · 26888dfd

Committed by Florian Westphal on Dec 01, 2017

Since commit 960632ec ("netfilter: convert hook list to an array")
nfqueue no longer stores a pointer to the hook that caused the packet
to be queued.  Therefore no extra synchronize_net() call is needed after
dropping the packets enqueued by the old rule blob.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: ipvs: Remove useless ipvsh param of frag_safe_skb_hp · 6b3d9330

Committed by Gao Feng on Nov 13, 2017

The ipvsh parameter of frag_safe_skb_hp is no longer used, so remove it
and update the callers too.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack: l4 protocol trackers can be const · 9dae47ab

Committed by Florian Westphal on Nov 07, 2017

Previous patches removed all writes to these structs so we can
now mark them as const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

08 Jan, 2018 · 7 commits

netfilter: conntrack: constify list of builtin trackers · cd9ceafc

Committed by Florian Westphal on Nov 07, 2017

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: conntrack: remove nlattr_size pointer from l4proto trackers · 39215846

Committed by Florian Westphal on Nov 07, 2017

Similar to the previous commit, but instead compute this at compile time
and turn nlattr_size into a u16.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ipv6: Export sernum update function · 4a8e56ee