提交 · d4f4b915829bb5b51b42ee909e6ea3302a715550 · openeuler / Kernel

03 4月, 2017 7 次提交

Merge branch 'rds-minor-bug-fixes' · d4f4b915

由 David S. Miller 提交于 4月 02, 2017

Sowmini Varadhan says:

====================
rds: tcp: couple of minor bug fixes

A couple of minor bugfixes that showed up during testing
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d4f4b915

rds: tcp: canonical connection order for all paths with index > 0 · 087d9753

由 Sowmini Varadhan 提交于 3月 31, 2017

The rds_connect_worker() has a bug in the check that enforces the
canonical connection order described in the comments of
rds_tcp_state_change(). The intention is to make sure that all
the multipath connections are always initiated by the smaller IP
address via rds_start_mprds. To achieve this, rds_connection_worker
should check that cp_index > 0.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

087d9753

rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is marked... · e97656d0

由 Sowmini Varadhan 提交于 3月 31, 2017

rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is marked ERROR by an intervening FIN

rds_conn_shutdown() runs in workq context, and marks the rds_connection
as DISCONNECTING before quiescing Tx/Rx paths. However, after all I/O
has quiesced, we may still find the rds_connection state to be
RDS_CONN_ERROR if an intervening FIN was processed in softirq context.

This is not a fatal error: rds_conn_shutdown() should continue the
shutdown, and there is no need to log noisy messages about this event.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e97656d0

sock: correctly test SOCK_TIMESTAMP in sock_recv_ts_and_drops() · d3fbff30

由 Eric Dumazet 提交于 3月 31, 2017

It seems the code does not match the intent.

This broke packetdrill, and probably other programs.

Fixes: 6c7c98ba ("sock: avoid dirtying sk_stamp, if possible")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d3fbff30

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: fix build with gcc-4.4.4 · e270e966

由 Andrew Morton 提交于 3月 31, 2017

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: In function 'mlx5e_set_rxfh':
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: (near initialization for 'rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1068: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: excess elements in struct initializer
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: (near initialization for 'rrp')

gcc-4.4.4 has issues with anonymous union initializers. Work around this.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e270e966

drivers/net/ethernet/mellanox/mlx5/core/en_main.c: fix build with gcc-4.4.4 · 95632791

由 Andrew Morton 提交于 3月 31, 2017

drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2210: error: unknown field 'rqn' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: (near initialization for 'direct_rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_channels':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: (near initialization for 'rrp.<anonymous>')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: initialization makes integer from pointer without a cast
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2228: error: unknown field 'rss' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: excess elements in struct initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: (near initialization for 'rrp')
drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_drop':
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2238: error: unknown field 'rqn' specified in initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: missing braces around initializer
drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: (near initialization for 'drop_rrp.<anonymous>')

gcc-4.4.4 has issues with anonymous union initializers. Work around this.

Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95632791

net: stmmac: fix cbs configuration · 44781fef

由 Joao Pinto 提交于 3月 31, 2017

Sending again, because forgot to include net-dev.

The QoS IP does not accept AVB capabilities to default/queue 0, this way we
guarantee 75% bandwidth for AVB. This patch assures that only queues >= 1
gets CBS confgured. Additional info was also added to stmmac.txt.
Reported-by: NNiklas Cassel <niklas.cassel@axis.com>
Signed-off-by: NJoao Pinto <jpinto@synopsys.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

44781fef

02 4月, 2017 33 次提交

Merge branch 'mpls-more-labels' · a6fc09df

由 David S. Miller 提交于 4月 01, 2017

David Ahern says:

====================
net: mpls: Allow users to configure more labels per route

Increase the maximum number of new labels for MPLS routes from 2 to 30.

To keep memory consumption in check, the labels array is moved to the end
of mpls_nh and mpls_iptunnel_encap structs as a 0-sized array. Allocations
use the maximum number of labels across all nexthops in a route for LSR
and the number of labels configured for LWT.

The mpls_route layout is changed to:

   +----------------------+
   | mpls_route           |
   +----------------------+
   | mpls_nh 0            |
   +----------------------+
   | alignment padding    |   4 bytes for odd number of labels; 0 for even
   +----------------------+
   | via[rt_max_alen] 0   |
   +----------------------+
   | alignment padding    |   via's aligned on sizeof(unsigned long)
   +----------------------+
   | ...                  |

Meaning the via follows its mpls_nh providing better locality as the
number of labels increases. UDP_RR tests with namespaces shows no impact
to a modest performance increase with this layout for 1 or 2 labels and
1 or 2 nexthops.

mpls_route allocation size is limited to 4096 bytes allowing on the
order of 30 nexthops with 30 labels (or more nexthops with fewer
labels). LWT encap shares same maximum number of labels as mpls routing.

v3
- initialize n_labels to 0 in case RTA_NEWDST is not defined; detected
  by the kbuild test robot

v2
- updates per Eric's comments
  + added patch to ensure all reads of rt_nhn_alive and nh_flags in
    the packet path use READ_ONCE and all writes via event handlers
    use WRITE_ONCE

  + limit mpls_route size to 4096 (PAGE_SIZE for most arch)

  + mostly killed use of MAX_NEW_LABELS; it exists only for common
    limit between lwt and routing paths
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a6fc09df

net: mpls: Increase max number of labels for lwt encap · 1511009c

由 David Ahern 提交于 3月 31, 2017

Alow users to push down more labels per MPLS encap. Similar to LSR case,
move label array to the end of mpls_iptunnel_encap and allocate based on
the number of labels for the route.

For consistency with the LSR case, re-use the same maximum number of
labels.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1511009c

net: mpls: bump maximum number of labels · a4ac8c98

由 David Ahern 提交于 3月 31, 2017

Allow users to push down more labels per MPLS route. With the previous
patches, no memory allocations are based on MAX_NEW_LABELS; the limit
is only used to keep userspace in check.

At this point MAX_NEW_LABELS is only used for mpls_route_config (copying
route data from userspace) and processing nexthops looking for the max
number of labels across the route spec.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a4ac8c98

net: mpls: Limit memory allocation for mpls_route · df1c6316

由 David Ahern 提交于 3月 31, 2017

Limit memory allocation size for mpls_route to 4096.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

df1c6316

net: mpls: change mpls_route layout · 59b20966

由 David Ahern 提交于 3月 31, 2017

Move labels to the end of mpls_nh as a 0-sized array and within mpls_route
move the via for a nexthop after the mpls_nh. The new layout becomes:

   +----------------------+
   | mpls_route           |
   +----------------------+
   | mpls_nh 0            |
   +----------------------+
   | alignment padding    |   4 bytes for odd number of labels; 0 for even
   +----------------------+
   | via[rt_max_alen] 0   |
   +----------------------+
   | alignment padding    |   via's aligned on sizeof(unsigned long)
   +----------------------+
   | ...                  |
   +----------------------+
   | mpls_nh n-1          |
   +----------------------+
   | via[rt_max_alen] n-1 |
   +----------------------+

Memory allocated for nexthop + via is constant across all nexthops and
their via. It is based on the maximum number of labels across all nexthops
and the maximum via length. The size is saved in the mpls_route as
rt_nh_size. Accessing a nexthop becomes rt->rt_nh + index * rt->rt_nh_size.

The offset of the via address from a nexthop is saved as rt_via_offset
so that given an mpls_nh pointer the via for that hop is simply
nh + rt->rt_via_offset.

With prior code, memory allocated per mpls_route with 1 nexthop:
     via is an ethernet address - 64 bytes
     via is an ipv4 address     - 64
     via is an ipv6 address     - 72

With this patch set, memory allocated per mpls_route with 1 nexthop and
1 or 2 labels:
     via is an ethernet address - 56 bytes
     via is an ipv4 address     - 56
     via is an ipv6 address     - 64

The 8-byte reduction is due to the previous patch; the change introduced
by this patch has no impact on the size of allocations for 1 or 2 labels.

Performance impact of this change was examined using network namespaces
with veth pairs connecting namespaces. ns0 inserts the packet to the
label-switched path using an lwt route with encap mpls. ns1 adds 1 or 2
labels depending on test, ns2 (and ns3 for 2-label test) pops the label
and forwards. ns3 (or ns4) for a 2-label is the destination. Similar
series of namespaces used for 2-nexthop test.

Intent is to measure changes to latency (overhead in manipulating the
packet) in the forwarding path. Tests used netperf with UDP_RR.

IPv4:                     current   patches
   1 label, 1 nexthop      29908     30115
   2 label, 1 nexthop      29071     29612
   1 label, 2 nexthop      29582     29776
   2 label, 2 nexthop      29086     29149

IPv6:                     current   patches
   1 label, 1 nexthop      24502     24960
   2 label, 1 nexthop      24041     24407
   1 label, 2 nexthop      23795     23899
   2 label, 2 nexthop      23074     22959

In short, the change has no effect to a modest increase in performance.
This is expected since this patch does not really have an impact on routes
with 1 or 2 labels (the current limit) and 1 or 2 nexthops.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

59b20966

net: mpls: Convert number of nexthops to u8 · 77ef013a

由 David Ahern 提交于 3月 31, 2017

Number of nexthops and number of alive nexthops are tracked using an
unsigned int. A route should never have more than 255 nexthops so
convert both to u8. Update all references and intermediate variables
to consistently use u8 as well.

Shrinks the size of mpls_route from 32 bytes to 24 bytes with a 2-byte
hole before the nexthops.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

77ef013a

net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE · 39eb8cd1

由 David Ahern 提交于 3月 31, 2017

The number of alive nexthops for a route (rt->rt_nhn_alive) and the
flags for a next hop (nh->nh_flags) are modified by netdev event
handlers. The event handlers run with rtnl_lock held so updates are
always done with the lock held. The packet path accesses the fields
under the rcu lock. Since those fields can change at any moment in
the packet path, both fields should be accessed using READ_ONCE. Updates
to both fields should use WRITE_ONCE.

Update mpls_select_multipath (packet path) and mpls_ifdown and mpls_ifup
(event handlers) accordingly.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

39eb8cd1

udp: use sk_protocol instead of pcflag to detect udplite sockets · 3d8417d7

由 Paolo Abeni 提交于 3月 31, 2017

In the udp_sock struct, the 'forward_deficit' and 'pcflag' fields
share the same cacheline. While the first is dirtied by
udp_recvmsg, the latter is read, possibly several times, by the
bottom half processing to discriminate between udp and udplite
sockets.

With this patch, sk->sk_protocol is used to check is the socket is
really an udplite one, avoiding some cache misses per
packet and improving the performance under udp_flood with
small packet up to 10%.
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d8417d7

net: dsa: fix build error with devlink build as module · 768bfa2a

由 Tobias Regnery 提交于 3月 31, 2017

After commit 96567d5d ("net: dsa: dsa2: Add basic support of devlink")
I see the following link error with CONFIG_NET_DSA=y and CONFIG_NET_DEVLINK=m:

net/built-in.o: In function 'dsa_register_switch':
(.text+0xe226b): undefined reference to `devlink_alloc'
net/built-in.o: In function 'dsa_register_switch':
(.text+0xe2284): undefined reference to `devlink_register'
net/built-in.o: In function 'dsa_register_switch':
(.text+0xe243e): undefined reference to `devlink_port_register'
net/built-in.o: In function 'dsa_register_switch':
(.text+0xe24e1): undefined reference to `devlink_port_register'
net/built-in.o: In function 'dsa_register_switch':
(.text+0xe24fa): undefined reference to `devlink_port_type_eth_set'
net/built-in.o: In function 'dsa_dst_unapply.part.8':
dsa2.c:(.text.unlikely+0x345): undefined reference to 'devlink_port_unregister'
dsa2.c:(.text.unlikely+0x36c): undefined reference to 'devlink_port_unregister'
dsa2.c:(.text.unlikely+0x38e): undefined reference to 'devlink_port_unregister'
dsa2.c:(.text.unlikely+0x3f2): undefined reference to 'devlink_unregister'
dsa2.c:(.text.unlikely+0x3fb): undefined reference to 'devlink_free'

Fix this by adding a dependency on MAY_USE_DEVLINK so that CONFIG_NET_DSA
get switched to be build as module when CONFIG_NET_DEVLINK=m.

Fixes: 96567d5d ("net: dsa: dsa2: Add basic support of devlink")
Signed-off-by: NTobias Regnery <tobias.regnery@gmail.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

768bfa2a

Merge branch 'phylib-EEE-updates' · 88f913f5

由 David S. Miller 提交于 4月 01, 2017

Russell King says:

====================
phylib EEE updates

This series of patches depends on the previous set of changes, and is
therefore net-next material.

While testing the EEE code, I discovered a number of issues:

1. It is possible to enable advertisment of EEE modes which are not
   supported by the hardware.  We omit to check the supported modes
   and mask off those modes that are not supported before writing the
   EEE advertisment register.

2. We need to restart autonegotiation after a change of the EEE
   advertisment, otherwise the link partner does not see the updated
   EEE modes.

3. SGMII connected PHYs are also capable of supporting EEE.

Through discussion with Florian, it has been decided to remove the check
for the PHY interface mode in patch (3).
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

88f913f5

net: phy: allow EEE with any interface mode · 32d75141

由 Russell King 提交于 3月 31, 2017

EEE is able to work in any PHY interface mode, there is nothing which
fundamentally restricts it to only a few modes.  For example, EEE works
in SGMII mode with the Marvell 88E1512.

Rather than just adding SGMII mode to the list, Florian suggests
removing the list of interface modes entirely:

  It actually sounds like we should just kill the check entirely,
  it does not appear that any of the interface mode would not
  fundamentally be able to support EEE, because the "lowest" mode
  we support is MII, and even there it's quite possible to support
  EEE.
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

32d75141

net: phy: restart phy autonegotiation after EEE advertisment change · f75abeb8

由 Russell King 提交于 3月 31, 2017

When the EEE advertisment is changed, we should restart autonegotiation
to update the link partner with the new EEE settings.  Add this trigger
but only if the advertisment has changed.
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f75abeb8

net: phy: avoid setting unsupported EEE advertisments · 83ea067f

由 Russell King 提交于 3月 31, 2017

We currently allow userspace to set any EEE advertisments it desires,
whether or not the PHY supports them.  For example:

 # ethtool --set-eee eth1 advertise 0xffffffff
 # ethtool --show-eee eth1
 EEE Settings for eth1:
        EEE status: disabled
        Tx LPI: disabled
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
                                   10000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
                                    1000baseKX/Full
                                    10000baseT/Full
                                    10000baseKX4/Full
                                    10000baseKR/Full

Clearly, this is not sane, we should only allow link modes that are
supported to be advertised (as we do elsewhere.)  Ensure that we mask
the MDIO_AN_EEE_ADV value with the capabilities retrieved from the
MDIO_PCS_EEE_ABLE register.
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

83ea067f

Merge branch 'bpf-prog-testing-framework' · eefe06e8

由 David S. Miller 提交于 4月 01, 2017

Alexei Starovoitov says:

====================
bpf: program testing framework

Development and testing of networking bpf programs is quite cumbersome.
Especially tricky are XDP programs that attach to real netdevices and
program development feels like working on the car engine while
the car is in motion.
Another problem is ongoing changes to upstream llvm core
that can introduce an optimization that verifier will not
recognize. llvm bpf backend tests have no ability to run the programs.
To improve this situation introduce BPF_PROG_TEST_RUN command
to test and performance benchmark bpf programs.
It achieves several goals:
- development of xdp and skb based bpf programs can be done
in a canned environment with unit tests
- program performance optimizations can be benchmarked outside of
networking core (without driver and skb costs)
- continuous testing of upstream changes is finally practical

Patches 4,5,6 add C based test cases of various complexity
to cover some sched_cls and xdp features. More tests will
be added in the future. The tests were run on centos7 only.

For now the framework supports only skb and xdp programs. In the future
it can be extended to socket_filter and tracing program types.

More details are in individual patches.

v1->v2:
- rename bpf_program_test_run->bpf_prog_test_run
- add missing #include <linux/bpf.h> since libbpf.h shouldn't depend
on prior includes
- reordered patches 3 and 4 to keep bisect clean
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eefe06e8

selftests/bpf: add l4 load balancer test based on sched_cls · 37821613

由 Alexei Starovoitov 提交于 3月 30, 2017

this l4lb demo is a comprehensive test case for LLVM codegen and
kernel verifier. It's using fully inlined jhash(), complex packet
parsing and multiple map lookups of different types to stress
llvm and verifier.
The map sizes, map population and test vectors are artificial to
exercise different paths through the bpf program.
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37821613

selftests/bpf: add a test for basic XDP functionality · 8d48f5e4

由 Alexei Starovoitov 提交于 3月 30, 2017

add C test for xdp_adjust_head(), packet rewrite and map lookups
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d48f5e4

selftests/bpf: add a test for overlapping packet range checks · 6882804c

由 Alexei Starovoitov 提交于 3月 30, 2017

add simple C test case for llvm and verifier range check fix from
commit b1977682 ("bpf: improve verifier packet range checks")
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6882804c

tools/lib/bpf: expose bpf_program__set_type() · dd26b7f5

由 Alexei Starovoitov 提交于 3月 30, 2017

expose bpf_program__set_type() to set program type
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dd26b7f5

tools/lib/bpf: add support for BPF_PROG_TEST_RUN command · 30848873

由 Alexei Starovoitov 提交于 3月 30, 2017

add support for BPF_PROG_TEST_RUN command to libbpf.a
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Acked-by: NWang Nan <wangnan0@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30848873

bpf: introduce BPF_PROG_TEST_RUN command · 1cf1cae9

由 Alexei Starovoitov 提交于 3月 30, 2017

development and testing of networking bpf programs is quite cumbersome.
Despite availability of user space bpf interpreters the kernel is
the ultimate authority and execution environment.
Current test frameworks for TC include creation of netns, veth,
qdiscs and use of various packet generators just to test functionality
of a bpf program. XDP testing is even more complicated, since
qemu needs to be started with gro/gso disabled and precise queue
configuration, transferring of xdp program from host into guest,
attaching to virtio/eth0 and generating traffic from the host
while capturing the results from the guest.

Moreover analyzing performance bottlenecks in XDP program is
impossible in virtio environment, since cost of running the program
is tiny comparing to the overhead of virtio packet processing,
so performance testing can only be done on physical nic
with another server generating traffic.

Furthermore ongoing changes to user space control plane of production
applications cannot be run on the test servers leaving bpf programs
stubbed out for testing.

Last but not least, the upstream llvm changes are validated by the bpf
backend testsuite which has no ability to test the code generated.

To improve this situation introduce BPF_PROG_TEST_RUN command
to test and performance benchmark bpf programs.

Joint work with Daniel Borkmann.
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NMartin KaFai Lau <kafai@fb.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1cf1cae9

net: dsa: Mock-up driver · 98cd1552

由 Florian Fainelli 提交于 3月 30, 2017

This patch adds support for a DSA mock-up driver which essentially does
the following:

- registers/unregisters 4 fixed PHYs to the slave network devices
- uses eth0 (configurable) as the master netdev
- registers the switch as a fixed MDIO device against the fixed MDIO bus
  at address 31
- includes dynamic debug prints for dsa_switch_ops functions that can be
  enabled to get call traces

This is a good way to test modular builds as well as exercise the DSA
APIs without requiring access to real hardware. This does not test the
data-path, although this could be added later on.
Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

98cd1552

Merge branch 'mv88e6xxx-cross-chip-bridging' · 772c3bda

由 David S. Miller 提交于 4月 01, 2017

Vivien Didelot says:

====================
net: dsa: mv88e6xxx: program cross-chip bridging

The purpose of this patch series is to bring hardware cross-chip
bridging configuration to the DSA layer and the mv88e6xxx DSA driver.

Most recent Marvell switch chips have a Cross-chip Port Based VLAN Table
(PVT) used to restrict to which internal destination port an arbitrary
external source port is allowed to egress frames to.

The current behavior of the mv88e6xxx driver is to program this table
table with all ones, allowing any external ports to egress frames on any
internal ports. This means that carefully crafted Ethernet frames can
potentially bypass the user bridging configuration.

Patches 1 to 7 prepare the setup of this table and factorize the common
bits of both in-chip and cross-chip Marvell bridging code.

Patch 8 adds new optional cross-chip bridging operations to DSA switch.

Patch 9 switches the current behavior to program the table according to
the user bridging configuration when (cross-chip) ports get (un)bridged.

On a ZII Rev B board, bridging together the 3 user ports of both 88E6352
will result in the following PVTs on respectively switch 0 and switch 1:

    External   Internal Ports
    Dev Port   0  1  2  3  4  5  6

     1    0    *  *  *  -  -  *  *
     1    1    *  *  *  -  -  *  *
     1    2    *  *  *  -  -  *  *
     1    3    -  -  -  -  -  *  *
     1    4    -  -  -  -  -  *  *
     1    5    *  *  *  *  *  *  *
     1    6    *  *  *  *  *  *  *

     0    0    *  *  *  -  -  *  *
     0    1    *  *  *  -  -  *  *
     0    2    *  *  *  -  -  *  *
     0    3    -  -  -  -  -  *  *
     0    4    -  -  -  -  -  *  *
     0    5    *  *  *  *  *  *  *
     0    6    *  *  *  *  *  *  *

Changes since v2:
  - Define MV88E6XXX_MAX_PVT_SWITCHES and MV88E6XXX_MAX_PVT_PORTS
  - use mv88e6xxx_g2_misc_4_bit_port instead of the 5-bit variant
  - add Andrew's tags and reword commit 6/9
====================
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

772c3bda

net: dsa: mv88e6xxx: add cross-chip bridging · aec5ac88

由 Vivien Didelot 提交于 3月 30, 2017

Implement the DSA cross-chip bridging operations by remapping the local
ports an external source port can egress frames to, when this cross-chip
port joins or leaves a bridge.

The PVT is no longer configured with all ones allowing any external
frame to egress any local port. Only DSA and CPU ports, as well as
bridge group members, can egress frames on local ports.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aec5ac88

net: dsa: add cross-chip bridging operations · 40ef2c93

由 Vivien Didelot 提交于 3月 30, 2017

Introduce crosschip_bridge_{join,leave} operations in the dsa_switch_ops
structure, which can be used by switches supporting interconnection.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

40ef2c93

net: dsa: mv88e6xxx: remap existing bridge members · e96a6e02

由 Vivien Didelot 提交于 3月 30, 2017

When a local port of a switch chip becomes a member of a bridge group,
we need to reprogram the Cross-chip Port Based VLAN Table (PVT) to allow
existing cross-chip bridge members to egress frames on the new ports.

There is no functional changes yet, since the PVT is still programmed
with all ones, allowing any external port to egress frames locally.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e96a6e02

net: dsa: mv88e6xxx: factorize in-chip bridge map · 240ea3ef

由 Vivien Didelot 提交于 3月 30, 2017

Factorize the code in the DSA port_bridge_{join,leave} routines used to
program the port VLAN map of all local ports of a given bridge group.

At the same time shorten the _mv88e6xxx_port_based_vlan_map to get rid
of the old underscore prefix naming convention.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

240ea3ef

net: dsa: mv88e6xxx: rework in-chip bridging · e5887a2a

由 Vivien Didelot 提交于 3月 30, 2017

All ports -- internal and external, for chips featuring a PVT -- have a
mask restricting to which internal ports a frame is allowed to egress.

Now that DSA exposes the number of ports and their bridge devices, it is
possible to extract the code generating the VLAN map and make it generic
so that it can be shared later with the cross-chip bridging code.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5887a2a

net: dsa: mv88e6xxx: allocate the number of ports · 73b1204d

由 Vivien Didelot 提交于 3月 30, 2017

The current code allocates DSA_MAX_PORTS ports for a Marvell dsa_switch
structure. Provide the exact number of ports so the corresponding
ds->num_ports is accurate.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73b1204d

net: dsa: mv88e6xxx: program the PVT with all ones · 17a1594e

由 Vivien Didelot 提交于 3月 30, 2017

The Cross-chip Port Based VLAN Table (PVT) is currently initialized with
all ones, allowing any external ports to egress frames on local ports.

This commit implements the PVT access functions and programs the PVT
with all ones for the local switch ports only, instead of using the Init
operation. The current behavior is unchanged for the moment.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

17a1594e

net: dsa: mv88e6xxx: use 4-bit port for PVT data · 81228996

由 Vivien Didelot 提交于 3月 30, 2017

The Cross-chip Port Based VLAN Table (PVT) supports two indexing modes,
one using 5-bit for device and 4-bit for port, the other using 4-bit for
device and 5-bit for port, configured via the Global 2 Misc register.

Only 4 bits for the source port are needed when interconnecting 88E6xxx
switch devices since they all support less than 16 physical ports. The
full 5 bits are needed when interconnecting a device with 98DXxxx switch
devices since they support more than 16 physical ports.

Add a mv88e6xxx_pvt_setup helper to set the 4-bit port PVT mode, which
will be extended later to also initialize the PVT content.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

81228996

net: dsa: mv88e6xxx: move PVT description in info · f3645652

由 Vivien Didelot 提交于 3月 30, 2017

Not all Marvell switch chips feature a Cross-chip Port VLAN Table (PVT).

Chips with a PVT use the same implementation, so a new mv88e6xxx_ops
member won't be necessary yet. Add a "pvt" boolean member to the
mv88e6xxx_info structure and kill the obsolete MV88E6XXX_FLAGS_PVT flag.

Add a mv88e6xxx_has_pvt helper to wrap future checks of that condition.
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f3645652

dpaa_eth: use AVOIDBLOCK for Tx confirmation queues · 58b7bd0f

由 Madalin Bucur 提交于 3月 30, 2017

The AVOIDBLOCK flag determines the Tx confirmation queues processing
to be redirected to any available CPU when the current one is slow
in processing them. This may result in a higher Tx confirmation
interrupt count but may reduce pressure on a certain CPU that with
the previous setting would process all Tx confirmation frames.
Signed-off-by: NMadalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

58b7bd0f

fsl/fman: take into account all RGMII modes · b07e675b

由 Madalin Bucur 提交于 3月 30, 2017

Accept the internal delay RGMII variants.
Signed-off-by: NMadalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b07e675b

openeuler / Kernel 大约 1 年 前同步成功

openeuler / Kernel
大约 1 年前同步成功