提交 · 713bafea92920103cd3d361657406cf04d0e22dd · openanolis / cloud-kernel

11 11月, 2017 5 次提交

tcp: retire FACK loss detection · 713bafea

由 Yuchung Cheng 提交于 11月 08, 2017

FACK loss detection has been disabled by default and the
successor RACK subsumed FACK and can handle reordering better.
This patch removes FACK to simplify TCP loss recovery.
Signed-off-by: NYuchung Cheng <ycheng@google.com>
Reviewed-by: NEric Dumazet <edumazet@google.com>
Reviewed-by: NNeal Cardwell <ncardwell@google.com>
Reviewed-by: NSoheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: NPriyaranjan Jha <priyarjha@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

713bafea

tipc: improve link resiliency when rps is activated · 8d6e79d3

由 Jon Maloy 提交于 11月 08, 2017

Currently, the TIPC RPS dissector is based only on the incoming packets'
source node address, hence steering all traffic from a node to the same
core. We have seen that this makes the links vulnerable to starvation
and unnecessary resets when we turn down the link tolerance to very low
values.

To reduce the risk of this happening, we exempt probe and probe replies
packets from the convergence to one core per source node. Instead, we do
the opposite, - we try to diverge those packets across as many cores as
possible, by randomizing the flow selector key.

To make such packets identifiable to the dissector, we add a new
'is_keepalive' bit to word 0 of the LINK_PROTOCOL header. This bit is
set both for PROBE and PROBE_REPLY messages, and only for those.

It should be noted that these packets are not part of any flow anyway,
and only constitute a minuscule fraction of all packets sent across a
link. Hence, there is no risk that this will affect overall performance.
Acked-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d6e79d3

net: ipv6: sysctl to specify IPv6 ND traffic class · 2210d6b2

由 Maciej Żenczykowski 提交于 11月 07, 2017

Add a per-device sysctl to specify the default traffic class to use for
kernel originated IPv6 Neighbour Discovery packets.

Currently this includes:

  - Router Solicitation (ICMPv6 type 133)
    ndisc_send_rs() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Solicitation (ICMPv6 type 135)
    ndisc_send_ns() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Neighbour Advertisement (ICMPv6 type 136)
    ndisc_send_na() -> ndisc_send_skb() -> ip6_nd_hdr()

  - Redirect (ICMPv6 type 137)
    ndisc_send_redirect() -> ndisc_send_skb() -> ip6_nd_hdr()

and if the kernel ever gets around to generating RA's,
it would presumably also include:

  - Router Advertisement (ICMPv6 type 134)
    (radvd daemon could pick up on the kernel setting and use it)

Interface drivers may examine the Traffic Class value and translate
the DiffServ Code Point into a link-layer appropriate traffic
prioritization scheme.  An example of mapping IETF DSCP values to
IEEE 802.11 User Priority values can be found here:

    https://tools.ietf.org/html/draft-ietf-tsvwg-ieee-802-11

The expected primary use case is to properly prioritize ND over wifi.

Testing:
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  0
  jzem22:~# echo -1 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 256 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  -bash: echo: write error: Invalid argument
  jzem22:~# echo 0 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# echo 255 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  255
  jzem22:~# echo 34 > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# cat /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  34

  jzem22:~# echo $[0xDC] > /proc/sys/net/ipv6/conf/eth0/ndisc_tclass
  jzem22:~# tcpdump -v -i eth0 icmp6 and src host jzem22.pgc and dst host fe80::1
  tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
  IP6 (class 0xdc, hlim 255, next-header ICMPv6 (58) payload length: 24)
  jzem22.pgc > fe80::1: [icmp6 sum ok] ICMP6, neighbor advertisement,
  length 24, tgt is jzem22.pgc, Flags [solicited]

(based on original change written by Erik Kline, with minor changes)

v2: fix 'suspicious rcu_dereference_check() usage'
    by explicitly grabbing the rcu_read_lock.

Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: NErik Kline <ek@google.com>
Signed-off-by: NMaciej Żenczykowski <maze@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2210d6b2

net/ncsi: Don't return error on normal response · 04bad8bd

由 Samuel Mendoza-Jonas 提交于 11月 08, 2017

Several response handlers return EBUSY if the data corresponding to the
command/response pair is already set. There is no reason to return an
error here; the channel is advertising something as enabled because we
told it to enable it, and it's possible that the feature has been
enabled previously.
Signed-off-by: NSamuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04bad8bd

net/ncsi: Improve general state logging · 9ef8690b

由 Samuel Mendoza-Jonas 提交于 11月 08, 2017

The NCSI driver is mostly silent which becomes a headache when trying to
determine what has occurred on the NCSI connection. This adds additional
logging in a few key areas such as state transitions and calling out
certain errors more visibly.
Signed-off-by: NSamuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9ef8690b

10 11月, 2017 13 次提交

act_vlan: VLAN action rewrite to use RCU lock/unlock and update · 4c5b9d96

由 Manish Kurup 提交于 11月 07, 2017

Using a spinlock in the VLAN action causes performance issues when the VLAN
action is used on multiple cores. Rewrote the VLAN action to use RCU read
locking for reads and updates instead.
All functions now use an RCU dereferenced pointer to access the VLAN action
context. Modified helper functions used by other modules, to use the RCU as
opposed to directly accessing the structure.
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NManish Kurup <manish.kurup@verizon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4c5b9d96

act_vlan: Change stats update to use per-core stats · e0496cbb

由 Manish Kurup 提交于 11月 07, 2017

The VLAN action maintains one set of stats across all cores, and uses a
spinlock to synchronize updates to it from the same. Changed this to use a
per-CPU stats context instead.
This change will result in better performance.
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NManish Kurup <manish.kurup@verizon.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e0496cbb

ip_gre: add the support for i/o_flags update via ioctl · a0efab67

由 Xin Long 提交于 11月 07, 2017

As patch 'ip_gre: add the support for i/o_flags update via netlink'
did for netlink, we also need to do the same job for these update
via ioctl.

This patch is to update i/o_flags and call ipgre_link_update to
recalculate these gre properties after ip_tunnel_ioctl does the
common update.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NWilliam Tu <u9012063@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a0efab67

ip_gre: add the support for i/o_flags update via netlink · dd9d598c

由 Xin Long 提交于 11月 07, 2017

Now ip_gre is using ip_tunnel_changelink to update it's properties, but
ip_tunnel_changelink in ip_tunnel doesn't update i/o_flags as a common
function.

o_flags updates would cause that tunnel->tun_hlen / hlen and dev->mtu /
needed_headroom need to be recalculated, and dev->(hw_)features need to
be updated as well.

Therefore, we can't just add the update into ip_tunnel_update called
in ip_tunnel_changelink, and it's also better not to touch ip_tunnel
codes.

This patch updates i/o_flags and calls ipgre_link_update to recalculate
these gre properties after ip_tunnel_changelink does the common update.

Note that since ipgre_link_update doesn't know the lower dev, it will
update gre->hlen, dev->mtu and dev->needed_headroom with the value of
'new tun_hlen - old tun_hlen'. In this way, we can avoid many redundant
codes, unlike ip6_gre.
Reported-by: NJianlin Shi <jishi@redhat.com>
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Acked-by: NWilliam Tu <u9012063@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd9d598c

tcp: Namespace-ify sysctl_tcp_rmem and sysctl_tcp_wmem · 356d1833

由 Eric Dumazet 提交于 11月 07, 2017

Note that when a new netns is created, it inherits its
sysctl_tcp_rmem and sysctl_tcp_wmem from initial netns.

This change is needed so that we can refine TCP rcvbuf autotuning,
to take RTT into consideration.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

356d1833

net: allow per netns sysctl_rmem and sysctl_wmem for protos · a3dcaf17

由 Eric Dumazet 提交于 11月 07, 2017

As we want to gradually implement per netns sysctl_rmem and sysctl_wmem
on per protocol basis, add two new fields in struct proto,
and two new helpers : sk_get_wmem0() and sk_get_rmem0()

First user will be TCP. Then UDP and SCTP can be easily converted,
while DECNET probably wont get this support.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a3dcaf17

net: dsa: Don't add vlans when vlan filtering is disabled · 2ea7a679

由 Andrew Lunn 提交于 11月 07, 2017

The software bridge can be build with vlan filtering support
included. However, by default it is turned off. In its turned off
state, it still passes VLANs via switchev, even though they are not to
be used. Don't pass these VLANs to the hardware. Only do so when vlan
filtering is enabled.

This fixes at least one corner case. There are still issues in other
corners, such as when vlan_filtering is later enabled.
Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2ea7a679

net: dsa: switch: Don't add CPU port to an mdb by default · ae45102c