Commits · e5276937ae6e654a811345f0716266f12e77bede · openanolis / cloud-kernel

02 Sep, 2015 2 commits

flow_dissector: Move skb related functions to skbuff.h · e5276937

Authored by Tom Herbert on Sep 01, 2015

Move the flow dissector functions that are specific to skbuffs into
skbuff.h out of flow_dissector.h. This makes flow_dissector.h have
no dependencies on skbuff.h.

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Make table id type u32 · 9b8ff518

Authored by David Ahern on Sep 01, 2015

A number of VRF patches used 'int' for table id. It should be u32 to be
consistent with the rest of the stack.

Fixes:
4e3c8992 ("net: Introduce VRF related flags and helpers")
15be405e ("net: Add inet_addr lookup by table")
30bbaa19 ("net: Fix up inet_addr_type checks")
021dd3b8 ("net: Add routes to the table associated with the device")
dc028da5 ("inet: Move VRF table lookup to inlined function")
f6d3c192 ("net: FIB tracepoints")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Sep, 2015 8 commits

tun_dst: Remove opts_size · 63b6c13d

Authored by Pravin B Shelar on Aug 31, 2015

opts_size is only written and never read. The following patch
removes this unused variable.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gro_cells: remove spinlock protecting receive queues · c42858ea

Authored by Eric Dumazet on Aug 31, 2015

As David pointed out, spinlocks are no longer needed
to protect the per cpu queues used in gro cells infrastructure.

Also use new napi_complete_done() API so that gro_flush_timeout
tweaks have an effect.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

phy: fixed_phy: Add gpio to determine link up/down. · a5597008

Authored by Andrew Lunn on Aug 31, 2015

An SFP module may have a link up/down status pin which can be
connected to a GPIO line of the host. Add support for reading such a
GPIO in the fixed_phy driver.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: Allow PHY devices to identify themselves as Ethernet switches, etc. · 5a11dd7d

Authored by Florian Fainelli on Aug 31, 2015

Some Ethernet MAC drivers using the PHY library require the hardcoding
of link parameters when interfaced to a switch device, SFP module,
switch to switch port, etc. This has typically led to various ad-hoc
implementations looking like this:

- using a "fixed PHY" emulated device, which will provide link
  indication towards the Ethernet MAC driver and hardware

- pretending there is no PHY and hardcoding link parameters, ala mv643x_eth

Based on that, it is desirable to have the PHY drivers advertise the
correct link parameters, just like regular Ethernet PHYs towards their
CPU Ethernet MAC drivers. However, Ethernet MAC drivers should be able
to tell whether this link should be monitored or not. In the context
of an Ethernet switch, SFP module, switch to switch link, we do not
need to monitor this link since it should be always up.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add tos to validate source tracepoint · f0fa6e52

Authored by David Ahern on Aug 31, 2015

TOS is another key aspect of the lookup passed to fib_validate_source.
Add it to the tracepoint.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use dctcp if enabled on the route to the initiator · c3a8d947

Authored by Daniel Borkmann on Aug 31, 2015

Currently, the following case doesn't use DCTCP, even if it should:
A responder has, for example, Cubic as its system-wide default, but for a specific
route to the initiating host, DCTCP is being set in RTAX_CC_ALGO. The
initiating host then uses DCTCP as congestion control, but since the
initiator sets ECT(0), tcp_ecn_create_request() doesn't set ecn_ok,
and we have to fall back to Reno after 3WHS completes.

We were thinking about how to solve this in a minimal, non-intrusive
way without bloating tcp_ecn_create_request() needlessly: let's cache
the CA ecn option flag in RTAX_FEATURES. In other words, when ECT(0)
is set on the SYN packet, set ecn_ok=1 iff route RTAX_FEATURES
contains the unexposed (internal-only) DST_FEATURE_ECN_CA. This allows
doing only a single metric feature lookup inside tcp_ecn_create_request().
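
The decision rule in the paragraph above can be illustrated with a small userspace sketch. It is not the kernel's tcp_ecn_create_request(): the function name, the parameters, the bit value chosen for the route feature, and the fallback for a SYN without ECT are all assumptions made for illustration, and the listener's own congestion control is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit standing in for the internal-only DST_FEATURE_ECN_CA
 * route feature; the value used by the kernel may differ. */
#define SKETCH_DST_FEATURE_ECN_CA (1u << 31)

/*
 * Decide whether a connection request should be marked ECN-capable.
 *
 * syn_has_ect:    the SYN carried an ECT codepoint (as a DCTCP initiator sends)
 * ecn_enabled:    ECN negotiation is enabled on the responder (assumed input)
 * route_features: cached feature bits of the route back to the initiator
 */
static bool sketch_ecn_ok(bool syn_has_ect, bool ecn_enabled,
                          uint32_t route_features)
{
    /* Assumed baseline: a SYN without ECT follows plain ECN negotiation. */
    if (!syn_has_ect)
        return ecn_enabled;

    /* ECT on the SYN: accept only if the route's congestion control
     * algorithm needs ECN. This is the single feature lookup. */
    return (route_features & SKETCH_DST_FEATURE_ECN_CA) != 0;
}
```

With this rule a responder whose route to the initiator selects an ECN-dependent algorithm keeps ECN enabled for that connection, while other routes behave as before.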

Joint work with Florian Westphal.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

fib, fib6: reject invalid feature bits · b8d3e416

Authored by Daniel Borkmann on Aug 31, 2015

Feature bits that are invalid should not be accepted by the kernel.
Only the lower 4 bits may be configured, not the remaining ones.
Even of these 4, 2 are unused.
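
A mask check of the kind described here might look like the following userspace sketch. The macro and function names are hypothetical, and returning -EINVAL for a rejected value is an assumption; only the "lower 4 bits" rule comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical mask covering the four low feature bits that may be set. */
#define SKETCH_FEATURE_MASK 0xfu

/* Return 0 if only permitted feature bits are set, -EINVAL otherwise. */
static int sketch_validate_features(uint32_t features)
{
    if (features & ~SKETCH_FEATURE_MASK)
        return -EINVAL;
    return 0;
}
```

Rejecting unknown bits at configuration time keeps the upper bits free for internal use without userspace being able to set them by accident.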

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip-tunnel: Use API to access tunnel metadata options. · 4c222798

Authored by Pravin B Shelar on Aug 30, 2015

Currently the tun-info options pointer is used in a few cases to
pass options around. But tunnel options can be accessed using the
ip_tunnel_info_opts() API without using the pointer. The following
patch removes the redundant pointer and consistently makes use
of the API.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 Aug, 2015 1 commit

net: Introduce helper functions to get the per cpu data · c4c6bc31

Authored by Raghavendra K T on Aug 30, 2015

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Aug, 2015 4 commits

vxlan: do not receive IPv4 packets on IPv6 socket · a43a9ef6

Authored by Jiri Benc on Aug 28, 2015

By default (subject to the sysctl settings), IPv6 sockets listen also for
IPv4 traffic. Vxlan is not prepared for that and expects IPv6 header in
packets received through an IPv6 socket.

In addition, it's currently not possible to have both IPv4 and IPv6 vxlan
tunnel on the same port (unless bindv6only sysctl is enabled), as it's not
possible to create and bind both IPv4 and IPv6 vxlan interfaces and there's
no way to specify both IPv4 and IPv6 remote/group IP addresses.

Set IPV6_V6ONLY on vxlan sockets to fix both of these issues. This is not
done globally in udp_tunnel, as l2tp and tipc seem to work okay when
receiving IPv4 packets on IPv6 socket and people may rely on this behavior.
The other tunnels (geneve and fou) do not support IPv6.
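
For readers unfamiliar with IPV6_V6ONLY, the userspace equivalent of the option is shown below. This is only an analogy: the commit sets the option on in-kernel vxlan sockets, whereas this sketch uses the ordinary socket API, and the helper name is hypothetical.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a UDP socket that accepts IPv6 traffic only, so that an IPv4
 * socket can be bound to the same port separately.
 * Returns the descriptor, or -1 on failure.
 */
static int sketch_open_v6only_udp(void)
{
    int on = 1;
    int fd = socket(AF_INET6, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    /* Without this, the socket may also receive IPv4 traffic, depending
     * on the bindv6only sysctl. */
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```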

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnels: record IP version in tunnel info · 7f9562a1

Authored by Jiri Benc on Aug 28, 2015

There's currently nothing preventing directing packets with IPv6
encapsulation data to IPv4 tunnels (and vice versa). If this happens,
IPv6 addresses are incorrectly interpreted as IPv4 ones.

Track whether the given ip_tunnel_key contains IPv4 or IPv6 data. Store this
in ip_tunnel_info. Reject packets at appropriate places if they are supposed
to be encapsulated into an incompatible protocol.
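
One way to picture the check is the sketch below. The struct, the flag names, and the choice to store the address family as a bit in a mode field are assumptions for illustration and do not reproduce the kernel's ip_tunnel_info layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a tunnel-info mode field. */
#define SKETCH_TUN_INFO_TX   0x01u /* metadata describes the transmit side */
#define SKETCH_TUN_INFO_IPV6 0x02u /* key holds IPv6 addresses */

struct sketch_tunnel_info {
    uint8_t mode; /* combination of the SKETCH_TUN_INFO_* flags */
};

/* True if the tunnel key in this metadata holds IPv6 addresses. */
static bool sketch_info_is_ipv6(const struct sketch_tunnel_info *info)
{
    return (info->mode & SKETCH_TUN_INFO_IPV6) != 0;
}

/*
 * True if a packet carrying this metadata may be encapsulated by a
 * tunnel of the given family; a mismatch should be rejected.
 */
static bool sketch_tunnel_accepts(const struct sketch_tunnel_info *info,
                                  bool tunnel_is_ipv6)
{
    return sketch_info_is_ipv6(info) == tunnel_is_ipv6;
}
```

A transmit path would call such a check before reading the key, so that IPv6 address bytes are never interpreted as an IPv4 address.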

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnels: convert the mode field of ip_tunnel_info to flags · 46fa062a

Authored by Jiri Benc on Aug 28, 2015

The mode field holds a single bit of information only (whether the
ip_tunnel_info struct is for rx or tx). Change the mode field to bit flags.
This allows more mode flags to be added.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: FIB tracepoints · f6d3c192

Authored by David Ahern on Aug 28, 2015

A few tracepoints that are useful when developing the VRF driver.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Aug, 2015 8 commits

netlink: add NETLINK_CAP_ACK socket option · 0a6a3a23

Authored by Christophe Ricard on Aug 28, 2015

Since commit c05cdb1b ("netlink: allow large data transfers from
user-space"), the kernel may fail to allocate the necessary room for the
acknowledgment message back to userspace. This patch introduces a new
socket option that trims off the payload of the original netlink message.

The netlink message header is still included, so the user can tell from
the sequence number which message triggered the
acknowledgment.
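
From userspace, enabling the option could look like the sketch below. The helper name is hypothetical, and the fallback definitions are only there for older headers; the sketch assumes a Linux system whose kernel supports NETLINK_CAP_ACK.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Fallbacks for older userspace headers (values as in the Linux UAPI). */
#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif
#ifndef NETLINK_CAP_ACK
#define NETLINK_CAP_ACK 10
#endif

/*
 * Open a rtnetlink socket whose acknowledgments carry only the header
 * of the original request instead of its full payload.
 * Returns the descriptor, or -1 on failure.
 */
static int sketch_open_capped_ack_socket(void)
{
    int on = 1;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_NETLINK, NETLINK_CAP_ACK, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller that sends large requests would then match each acknowledgment to its request by the nlmsg_seq field of the echoed header.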

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lib: introduce strncpy_from_unsafe() · 1a6877b9

Authored by Alexei Starovoitov on Aug 28, 2015

Generalize FETCH_FUNC_NAME(memory, string) into
strncpy_from_unsafe() and fix sparse warnings that were
present in the original implementation.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add support for VRFs to inetpeer cache · 192132b9

Authored by David Ahern on Aug 27, 2015

inetpeer caches based on address only, so duplicate IP addresses within
a namespace return the same cached entry. Enhance the ipv4 address key
to contain both the IPv4 address and VRF device index.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Refactor inetpeer address struct · 5345c2e1

Authored by David Ahern on Aug 27, 2015

Move the inetpeer_addr_base union to inetpeer_addr and drop
inetpeer_addr_base.

The a6 and in6_addr overlays are not both needed; drop the __be32 version
and rename in6 to a6 for consistency with ipv4. Add a new u32 array to
the union, which removes the need for the typecast in the compare function
and gives it a consistent argument for both ipv4 and ipv6 addresses,
making the compare function more readable.
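
The idea of comparing through a u32 array overlay can be sketched in userspace as follows. Type and function names are hypothetical and the layout is simplified; it is meant to show why no typecast is needed, not to mirror the kernel's struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

#define SKETCH_MAXKEYSZ 4 /* a 16-byte IPv6 address as four 32-bit words */

struct sketch_peer_addr {
    union {
        uint32_t a4;                   /* IPv4 address */
        uint8_t  a6[16];               /* IPv6 address */
        uint32_t key[SKETCH_MAXKEYSZ]; /* word view used for comparison */
    };
    uint16_t family;
};

/* Compare two addresses of the same family word by word: -1, 0 or 1. */
static int sketch_peer_addr_cmp(const struct sketch_peer_addr *a,
                                const struct sketch_peer_addr *b)
{
    size_t n = (a->family == AF_INET) ? sizeof(a->a4) / sizeof(uint32_t)
                                      : sizeof(a->a6) / sizeof(uint32_t);

    for (size_t i = 0; i < n; i++) {
        if (a->key[i] == b->key[i])
            continue;
        return a->key[i] < b->key[i] ? -1 : 1;
    }
    return 0;
}
```

Because both families are read through the same key array, the loop body is identical for IPv4 and IPv6 and only the word count differs.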

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add helper function to compare inetpeer addresses · d39d14ff

Authored by David Ahern on Aug 27, 2015

tcp_metrics and inetpeer both have functions to compare inetpeer
addresses. Consolidate into 1 version.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add set,get helpers for inetpeer addresses · 3abef286

Authored by David Ahern on Aug 27, 2015

Use inetpeer set,get helpers in tcp_metrics rather than peeking into
the inetpeer_addr struct.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Introduce ipv4_addr_hash and use it for tcp metrics · 72afa352

Authored by David Ahern on Aug 27, 2015

Refactor a common line into a helper function.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

IGMP: Inhibit reports for local multicast groups · df2cf4a7

Authored by Philip Downey on Aug 27, 2015

The range of addresses between 224.0.0.0 and 224.0.0.255, inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting.  Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.

Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose.  However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.

Although the RFCs for the various versions of IGMP (e.g. RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, excluding them should do no harm.  In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:

    2) Packets with a destination IP (DIP) address in the 224.0.0.X
       range which are not IGMP must be forwarded on all ports.

IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
    echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
    echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
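
The rule described in this commit message can be sketched in userspace C. The function names are hypothetical, and the sketch models only the reserved-range test and the zero / non-zero meaning of the control variable; any other conditions the kernel applies before sending a report are left out.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the address (network byte order) lies in 224.0.0.0 to 224.0.0.255. */
static bool sketch_is_local_multicast(uint32_t addr_be)
{
    return (ntohl(addr_be) & 0xffffff00u) == 0xe0000000u;
}

/*
 * Decide whether a membership report should be generated for a group.
 * report_link_local mirrors the control variable: zero inhibits reports
 * for the reserved range, non-zero keeps the previous behaviour.
 */
static bool sketch_should_report(uint32_t group_be, int report_link_local)
{
    if (sketch_is_local_multicast(group_be) && !report_link_local)
        return false;
    return true;
}
```

Groups outside the reserved range are reported regardless of the setting.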

Signed-off-by: Philip Downey <pdowney@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Aug, 2015 17 commits

net: sched: register noqueue qdisc · d66d6c31

Authored by Phil Sutter on Aug 27, 2015

This way users can attach noqueue just like any other qdisc using tc
without having to mess with tx_queue_len first.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: Define v6ops in !CONFIG_NETFILTER case. · 2e4cfae2

Authored by Joe Stringer on Aug 27, 2015

When CONFIG_OPENVSWITCH is set, and CONFIG_NETFILTER is not set, the
openvswitch IPv6 fragmentation handling cannot refer to ipv6_ops because
it isn't defined. Add a dummy version to avoid #ifdefs in source files.

Fixes: 7f8a436e "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: kill long time unused bonding private flags · 0dc1549b

Authored by Jiri Pirko on Aug 27, 2015

We haven't used them for years, so just kill them now.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add netif_is_ovs_master helper with IFF_OPENVSWITCH private flag · 35d4e172

Authored by Jiri Pirko on Aug 27, 2015

Add this helper so code can easily figure out if netdev is openvswitch.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add netif_is_bridge_master helper · 0894ae3f

Authored by Jiri Pirko on Aug 27, 2015

Add this helper so code can easily figure out if netdev is a bridge.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: introduce change upper device notifier change info · 0e4ead9d

Authored by Jiri Pirko on Aug 27, 2015

Add info that is passed along with NETDEV_CHANGEUPPER event.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

geneve: Consolidate Geneve functionality in single module. · 371bd106

Authored by Pravin B Shelar on Aug 26, 2015

The geneve_core module handles send and receive functionality.
This way OVS could use the Geneve API. Now, with the use of
tunnel metadata mode, OVS can directly use the Geneve netdevice,
so there is no need for a separate module for Geneve. The following
patch consolidates Geneve protocol processing in a single module.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

geneve: Add support to collect tunnel metadata. · e305ac6c

Authored by Pravin B Shelar on Aug 26, 2015

The following patch creates a new tunnel flag which enables
tunnel metadata collection on a given device. These devices
can be used by tunnel metadata based routing or by OVS.
The Geneve consolidation patch gets rid of collect_md_tun to
simplify tunnel lookup further.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

geneve: Make dst-port configurable. · cd7918b3

Authored by Pravin B Shelar on Aug 26, 2015

Add a netlink interface to configure the Geneve UDP port number,
so that the user can configure it for a Geneve device.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tunnel: introduce udp_tun_rx_dst() · c29a70d2

Authored by Pravin B Shelar on Aug 26, 2015

Introduce function udp_tun_rx_dst() to initialize tunnel dst on
receive path.

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: Add netlink support for vlan_protocol attribute · d2d427b3

Authored by Toshiaki Makita on Aug 27, 2015

This enables bridge vlan_protocol to be configured through netlink.

When CONFIG_BRIDGE_VLAN_FILTERING is disabled, the kernel behaves as
if this feature were not implemented.

Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: consolidate tc_classify{,_compat} · 3b3ae880

Authored by Daniel Borkmann on Aug 26, 2015

For classifiers getting invoked via tc_classify(), we always need an
extra function call into tc_classify_compat(), as both are being
exported as symbols and tc_classify() itself doesn't do much except
handling of reclassifications when tp->classify() returned with
TC_ACT_RECLASSIFY.

CBQ and ATM are the only qdiscs that directly call into tc_classify_compat(),
all others use tc_classify(). When tc actions are being configured
out in the kernel, tc_classify() effectively does nothing besides
delegating.

We could spare this layer and consolidate both functions. pktgen on
single CPU constantly pushing skbs directly into the netif_receive_skb()
path with a dummy classifier on ingress qdisc attached, improves
slightly from 22.3Mpps to 23.1Mpps.
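
As a rough picture of what a single consolidated entry point does, the following userspace sketch walks a filter chain and restarts it when a filter asks for reclassification. All names, the verdict values, the restart limit, and the verdict returned when the limit is exceeded are assumptions for illustration; the kernel function differs in detail.

```c
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on the TC_ACT_* idea. */
enum {
    SKETCH_ACT_UNSPEC     = -1, /* no filter matched */
    SKETCH_ACT_OK         = 0,
    SKETCH_ACT_RECLASSIFY = 1,  /* start again from the first filter */
    SKETCH_ACT_SHOT       = 2   /* drop */
};

#define SKETCH_MAX_RECLASSIFY 4 /* assumed limit on restarts */

struct sketch_filter {
    int (*classify)(const void *pkt);
    struct sketch_filter *next;
};

/* Two trivial classifiers used to exercise the loop. */
static int sketch_always_ok(const void *pkt)
{
    (void)pkt;
    return SKETCH_ACT_OK;
}

static int sketch_always_reclassify(const void *pkt)
{
    (void)pkt;
    return SKETCH_ACT_RECLASSIFY;
}

/* One function handles both the chain walk and reclassification. */
static int sketch_classify(const void *pkt, struct sketch_filter *head)
{
    int restarts = 0;

restart:
    for (struct sketch_filter *tp = head; tp != NULL; tp = tp->next) {
        int verdict = tp->classify(pkt);

        if (verdict == SKETCH_ACT_RECLASSIFY) {
            if (++restarts > SKETCH_MAX_RECLASSIFY)
                return SKETCH_ACT_SHOT; /* give up on a looping chain */
            goto restart;
        }
        if (verdict >= 0)
            return verdict;
    }
    return SKETCH_ACT_UNSPEC;
}
```

Folding the restart handling into the walk removes one function call per packet, which is the kind of saving the benchmark figures above refer to.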

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: Allow attaching helpers to ct action · cae3a262