1. 04 Apr, 2017 18 commits
  2. 03 Apr, 2017 7 commits
    • D
      Merge branch 'rds-minor-bug-fixes' · d4f4b915
      David S. Miller committed
      Sowmini Varadhan says:
      
      ====================
      rds: tcp: couple of minor bug fixes
      
      A couple of minor bugfixes that showed up during testing
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d4f4b915
    • S
      rds: tcp: canonical connection order for all paths with index > 0 · 087d9753
      Sowmini Varadhan committed
      The rds_connect_worker() has a bug in the check that enforces the
      canonical connection order described in the comments of
      rds_tcp_state_change(). The intention is to make sure that all
      the multipath connections are always initiated by the smaller IP
      address via rds_start_mprds. To achieve this, rds_connect_worker()
      should check that cp_index > 0.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      087d9753
    • S
      rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is marked... · e97656d0
      Sowmini Varadhan committed
      rds: tcp: allow progress of rds_conn_shutdown if the rds_connection is marked ERROR by an intervening FIN
      
      rds_conn_shutdown() runs in workq context, and marks the rds_connection
      as DISCONNECTING before quiescing Tx/Rx paths. However, after all I/O
      has quiesced, we may still find the rds_connection state to be
      RDS_CONN_ERROR if an intervening FIN was processed in softirq context.
      
      This is not a fatal error: rds_conn_shutdown() should continue the
      shutdown, and there is no need to log noisy messages about this event.
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e97656d0
    • E
      sock: correctly test SOCK_TIMESTAMP in sock_recv_ts_and_drops() · d3fbff30
      Eric Dumazet committed
      It seems the code does not match the intent.
      
      This broke packetdrill, and probably other programs.
      
      Fixes: 6c7c98ba ("sock: avoid dirtying sk_stamp, if possible")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3fbff30
    • A
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: fix build with gcc-4.4.4 · e270e966
      Andrew Morton committed
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: In function 'mlx5e_set_rxfh':
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: error: unknown field 'rss' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: missing braces around initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1067: warning: (near initialization for 'rrp.<anonymous>')
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1068: error: unknown field 'rss' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: excess elements in struct initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c:1069: warning: (near initialization for 'rrp')
      
      gcc-4.4.4 has issues with anonymous union initializers.  Work around this.
      
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e270e966
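The "unknown field ... specified in initializer" errors above come from designated initializers that name a member of an anonymous union. The following is a minimal sketch of two ways to sidestep that pattern. The struct `redirect_param` and its fields are hypothetical stand-ins, not the mlx5 definitions, and the sketch does not claim to reproduce what the patch itself does.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parameter struct with an anonymous union (illustrative only). */
struct redirect_param {
    int is_rss;
    union {
        unsigned int rqn;       /* direct case */
        struct {
            int hfunc;
            int channels;
        } rss;                  /* RSS case */
    };
};

/*
 * The problematic form names the union member directly:
 *     struct redirect_param p = { .is_rss = 1, .rss = { ... } };
 * Option 1: wrap the anonymous union in its own braces, so the member
 * designator appears inside the union's initializer.
 */
static struct redirect_param make_rss_param_braced(int hfunc, int channels)
{
    struct redirect_param p = {
        .is_rss = 1,
        {
            .rss = {
                .hfunc = hfunc,
                .channels = channels,
            },
        },
    };
    return p;
}

/* Option 2: skip the initializer and assign the fields afterwards. */
static struct redirect_param make_rss_param(int hfunc, int channels)
{
    struct redirect_param p;

    memset(&p, 0, sizeof(p));
    p.is_rss = 1;
    p.rss.hfunc = hfunc;
    p.rss.channels = channels;
    return p;
}
```

Both helpers build the same value; which form an old compiler accepts is an assumption that would need checking against that compiler.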
    • A
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c: fix build with gcc-4.4.4 · 95632791
      Andrew Morton committed
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts':
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2210: error: unknown field 'rqn' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: missing braces around initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2211: warning: (near initialization for 'direct_rrp.<anonymous>')
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_channels':
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: error: unknown field 'rss' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: missing braces around initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: (near initialization for 'rrp.<anonymous>')
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2227: warning: initialization makes integer from pointer without a cast
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2228: error: unknown field 'rss' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: excess elements in struct initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2229: warning: (near initialization for 'rrp')
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c: In function 'mlx5e_redirect_rqts_to_drop':
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2238: error: unknown field 'rqn' specified in initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: missing braces around initializer
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c:2239: warning: (near initialization for 'drop_rrp.<anonymous>')
      
      gcc-4.4.4 has issues with anonymous union initializers.  Work around this.
      
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      95632791
    • J
      net: stmmac: fix cbs configuration · 44781fef
      Joao Pinto committed
      Sending again, because forgot to include net-dev.
      
      The QoS IP does not accept AVB capabilities on the default queue (queue 0);
      this way we guarantee 75% bandwidth for AVB. This patch ensures that only
      queues >= 1 get CBS configured. Additional info was also added to stmmac.txt.
      Reported-by: Niklas Cassel <niklas.cassel@axis.com>
      Signed-off-by: Joao Pinto <jpinto@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      44781fef
  3. 02 Apr, 2017 15 commits
    • D
      Merge branch 'mpls-more-labels' · a6fc09df
      David S. Miller committed
      David Ahern says:
      
      ====================
      net: mpls: Allow users to configure more labels per route
      
      Increase the maximum number of new labels for MPLS routes from 2 to 30.
      
      To keep memory consumption in check, the labels array is moved to the end
      of mpls_nh and mpls_iptunnel_encap structs as a 0-sized array. Allocations
      use the maximum number of labels across all nexthops in a route for LSR
      and the number of labels configured for LWT.
      
      The mpls_route layout is changed to:
      
         +----------------------+
         | mpls_route           |
         +----------------------+
         | mpls_nh 0            |
         +----------------------+
         | alignment padding    |   4 bytes for odd number of labels; 0 for even
         +----------------------+
         | via[rt_max_alen] 0   |
         +----------------------+
         | alignment padding    |   via's aligned on sizeof(unsigned long)
         +----------------------+
         | ...                  |
      
      Meaning the via follows its mpls_nh providing better locality as the
      number of labels increases. UDP_RR tests with namespaces show results
      ranging from no impact to a modest performance increase with this
      layout for 1 or 2 labels and 1 or 2 nexthops.
      
      mpls_route allocation size is limited to 4096 bytes allowing on the
      order of 30 nexthops with 30 labels (or more nexthops with fewer
      labels). LWT encap shares same maximum number of labels as mpls routing.
      
      v3
      - initialize n_labels to 0 in case RTA_NEWDST is not defined; detected
        by the kbuild test robot
      
      v2
      - updates per Eric's comments
        + added patch to ensure all reads of rt_nhn_alive and nh_flags in
          the packet path use READ_ONCE and all writes via event handlers
          use WRITE_ONCE
      
        + limit mpls_route size to 4096 (PAGE_SIZE for most arch)
      
        + mostly killed use of MAX_NEW_LABELS; it exists only for common
          limit between lwt and routing paths
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a6fc09df
    • D
      net: mpls: Increase max number of labels for lwt encap · 1511009c
      David Ahern committed
      Allow users to push down more labels per MPLS encap. Similar to LSR case,
      move label array to the end of mpls_iptunnel_encap and allocate based on
      the number of labels for the route.
      
      For consistency with the LSR case, re-use the same maximum number of
      labels.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1511009c
    • D
      net: mpls: bump maximum number of labels · a4ac8c98
      David Ahern committed
      Allow users to push down more labels per MPLS route. With the previous
      patches, no memory allocations are based on MAX_NEW_LABELS; the limit
      is only used to keep userspace in check.
      
      At this point MAX_NEW_LABELS is only used for mpls_route_config (copying
      route data from userspace) and processing nexthops looking for the max
      number of labels across the route spec.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a4ac8c98
    • D
      net: mpls: Limit memory allocation for mpls_route · df1c6316
      David Ahern committed
      Limit memory allocation size for mpls_route to 4096.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      df1c6316
    • D
      net: mpls: change mpls_route layout · 59b20966
      David Ahern committed
      Move labels to the end of mpls_nh as a 0-sized array and within mpls_route
      move the via for a nexthop after the mpls_nh. The new layout becomes:
      
         +----------------------+
         | mpls_route           |
         +----------------------+
         | mpls_nh 0            |
         +----------------------+
         | alignment padding    |   4 bytes for odd number of labels; 0 for even
         +----------------------+
         | via[rt_max_alen] 0   |
         +----------------------+
         | alignment padding    |   via's aligned on sizeof(unsigned long)
         +----------------------+
         | ...                  |
         +----------------------+
         | mpls_nh n-1          |
         +----------------------+
         | via[rt_max_alen] n-1 |
         +----------------------+
      
      Memory allocated for nexthop + via is constant across all nexthops and
      their via. It is based on the maximum number of labels across all nexthops
      and the maximum via length. The size is saved in the mpls_route as
      rt_nh_size. Accessing a nexthop becomes rt->rt_nh + index * rt->rt_nh_size.
      
      The offset of the via address from a nexthop is saved as rt_via_offset
      so that given an mpls_nh pointer the via for that hop is simply
      nh + rt->rt_via_offset.
      
      With prior code, memory allocated per mpls_route with 1 nexthop:
           via is an ethernet address - 64 bytes
           via is an ipv4 address     - 64
           via is an ipv6 address     - 72
      
      With this patch set, memory allocated per mpls_route with 1 nexthop and
      1 or 2 labels:
           via is an ethernet address - 56 bytes
           via is an ipv4 address     - 56
           via is an ipv6 address     - 64
      
      The 8-byte reduction is due to the previous patch; the change introduced
      by this patch has no impact on the size of allocations for 1 or 2 labels.
      
      Performance impact of this change was examined using network namespaces
      with veth pairs connecting namespaces. ns0 inserts the packet to the
      label-switched path using an lwt route with encap mpls. ns1 adds 1 or 2
      labels depending on test, ns2 (and ns3 for 2-label test) pops the label
      and forwards. ns3 (or ns4) for a 2-label is the destination. Similar
      series of namespaces used for 2-nexthop test.
      
      Intent is to measure changes to latency (overhead in manipulating the
      packet) in the forwarding path. Tests used netperf with UDP_RR.
      
      IPv4:                     current   patches
         1 label, 1 nexthop      29908     30115
         2 label, 1 nexthop      29071     29612
         1 label, 2 nexthop      29582     29776
         2 label, 2 nexthop      29086     29149
      
      IPv6:                     current   patches
         1 label, 1 nexthop      24502     24960
         2 label, 1 nexthop      24041     24407
         1 label, 2 nexthop      23795     23899
         2 label, 2 nexthop      23074     22959
      
      In short, the change has no effect to a modest increase in performance.
      This is expected since this patch does not really have an impact on routes
      with 1 or 2 labels (the current limit) and 1 or 2 nexthops.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      59b20966
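The layout described in the commit above (fixed per-nexthop stride, via placed at a saved offset from its nexthop) can be sketched in user space as follows. This is an illustration under stated assumptions: `nh_sketch`, `via_offset()`, `nh_size()`, `nh_at()` and `nh_via()` are hypothetical names, and the struct fields are placeholders rather than the kernel's `mpls_nh`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))
/* Per the commit message, vias are aligned on sizeof(unsigned long). */
#define VIA_ALIGN sizeof(unsigned long)

/* Hypothetical fixed part of a nexthop; labels trail it as a flexible array. */
struct nh_sketch {
    void    *dev;
    uint32_t flags;
    uint8_t  n_labels;
    uint8_t  via_alen;
    uint8_t  via_table;
    uint8_t  reserved;
    uint32_t labels[];
};

/* Offset of the via from the start of its nexthop (plays the role of rt_via_offset). */
static size_t via_offset(unsigned int max_labels)
{
    return ALIGN_UP(sizeof(struct nh_sketch) + max_labels * sizeof(uint32_t),
                    VIA_ALIGN);
}

/* Constant stride of one nexthop plus its via (plays the role of rt_nh_size). */
static size_t nh_size(unsigned int max_labels, unsigned int max_via_alen)
{
    return via_offset(max_labels) + ALIGN_UP(max_via_alen, VIA_ALIGN);
}

/* Nexthop lookup: first nexthop + index * stride. */
static struct nh_sketch *nh_at(void *first_nh, size_t stride, unsigned int index)
{
    return (struct nh_sketch *)((uint8_t *)first_nh + index * stride);
}

/* Via lookup: nexthop + saved via offset. */
static uint8_t *nh_via(struct nh_sketch *nh, size_t via_off)
{
    return (uint8_t *)nh + via_off;
}
```

Because the stride is computed from the maximum label count and maximum via length in the route, every nexthop can be reached with one multiplication, at the cost of some padding for nexthops that use fewer labels.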
    • D
      net: mpls: Convert number of nexthops to u8 · 77ef013a
      David Ahern committed
      Number of nexthops and number of alive nexthops are tracked using an
      unsigned int. A route should never have more than 255 nexthops so
      convert both to u8. Update all references and intermediate variables
      to consistently use u8 as well.
      
      Shrinks the size of mpls_route from 32 bytes to 24 bytes with a 2-byte
      hole before the nexthops.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      77ef013a
    • D
      net: mpls: rt_nhn_alive and nh_flags should be accessed using READ_ONCE · 39eb8cd1
      David Ahern committed
      The number of alive nexthops for a route (rt->rt_nhn_alive) and the
      flags for a next hop (nh->nh_flags) are modified by netdev event
      handlers. The event handlers run with rtnl_lock held so updates are
      always done with the lock held. The packet path accesses the fields
      under the rcu lock. Since those fields can change at any moment in
      the packet path, both fields should be accessed using READ_ONCE. Updates
      to both fields should use WRITE_ONCE.
      
      Update mpls_select_multipath (packet path) and mpls_ifdown and mpls_ifup
      (event handlers) accordingly.
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      39eb8cd1
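The access pattern described in the commit above can be sketched in user space as follows: the reader takes one snapshot of a field that may change underneath it and works only from the local copy. The macros are simplified stand-ins (an assumption; the kernel's own `READ_ONCE`/`WRITE_ONCE` are more elaborate), and `route_sketch`, `select_index()` and `mark_one_down()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins: force exactly one load or store through a volatile access. */
#define READ_ONCE_SKETCH(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical route with a count of usable nexthops. */
struct route_sketch {
    uint8_t nhn;
    uint8_t nhn_alive;
};

/*
 * Reader side (packet path in the commit message): snapshot the count once.
 * If the field were re-read after the zero check, a concurrent update could
 * turn the modulo into a division by zero.
 */
static int select_index(struct route_sketch *rt, unsigned int hash)
{
    uint8_t alive = READ_ONCE_SKETCH(rt->nhn_alive);

    if (alive == 0)
        return -1;
    return (int)(hash % alive);
}

/* Writer side (event handler in the commit message): publish with one store. */
static void mark_one_down(struct route_sketch *rt)
{
    uint8_t alive = READ_ONCE_SKETCH(rt->nhn_alive);

    if (alive > 0)
        WRITE_ONCE_SKETCH(rt->nhn_alive, alive - 1);
}
```

The sketch is single-threaded and only shows the shape of the accesses; it does not model the rtnl_lock or RCU protection mentioned in the commit message.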
    • P
      udp: use sk_protocol instead of pcflag to detect udplite sockets · 3d8417d7
      Paolo Abeni committed
      In the udp_sock struct, the 'forward_deficit' and 'pcflag' fields
      share the same cacheline. While the first is dirtied by
      udp_recvmsg, the latter is read, possibly several times, by the
      bottom half processing to discriminate between udp and udplite
      sockets.
      
      With this patch, sk->sk_protocol is used to check whether the socket
      is really a udplite one, avoiding some cache misses per packet and
      improving performance under udp_flood with small packets by up to 10%.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3d8417d7
    • T
      net: dsa: fix build error with devlink build as module · 768bfa2a
      Tobias Regnery committed
      After commit 96567d5d ("net: dsa: dsa2: Add basic support of devlink")
      I see the following link error with CONFIG_NET_DSA=y and CONFIG_NET_DEVLINK=m:
      
      net/built-in.o: In function 'dsa_register_switch':
      (.text+0xe226b): undefined reference to `devlink_alloc'
      net/built-in.o: In function 'dsa_register_switch':
      (.text+0xe2284): undefined reference to `devlink_register'
      net/built-in.o: In function 'dsa_register_switch':
      (.text+0xe243e): undefined reference to `devlink_port_register'
      net/built-in.o: In function 'dsa_register_switch':
      (.text+0xe24e1): undefined reference to `devlink_port_register'
      net/built-in.o: In function 'dsa_register_switch':
      (.text+0xe24fa): undefined reference to `devlink_port_type_eth_set'
      net/built-in.o: In function 'dsa_dst_unapply.part.8':
      dsa2.c:(.text.unlikely+0x345): undefined reference to 'devlink_port_unregister'
      dsa2.c:(.text.unlikely+0x36c): undefined reference to 'devlink_port_unregister'
      dsa2.c:(.text.unlikely+0x38e): undefined reference to 'devlink_port_unregister'
      dsa2.c:(.text.unlikely+0x3f2): undefined reference to 'devlink_unregister'
      dsa2.c:(.text.unlikely+0x3fb): undefined reference to 'devlink_free'
      
      Fix this by adding a dependency on MAY_USE_DEVLINK so that CONFIG_NET_DSA
      gets switched to being built as a module when CONFIG_NET_DEVLINK=m.
      
      Fixes: 96567d5d ("net: dsa: dsa2: Add basic support of devlink")
      Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      768bfa2a
    • D
      Merge branch 'phylib-EEE-updates' · 88f913f5
      David S. Miller committed
      Russell King says:
      
      ====================
      phylib EEE updates
      
      This series of patches depends on the previous set of changes, and is
      therefore net-next material.
      
      While testing the EEE code, I discovered a number of issues:
      
      1. It is possible to enable advertisement of EEE modes which are not
         supported by the hardware.  We omit to check the supported modes
         and mask off those modes that are not supported before writing the
         EEE advertisement register.
      
      2. We need to restart autonegotiation after a change of the EEE
         advertisement, otherwise the link partner does not see the updated
         EEE modes.
      
      3. SGMII connected PHYs are also capable of supporting EEE.
      
      Through discussion with Florian, it has been decided to remove the check
      for the PHY interface mode in patch (3).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88f913f5
    • R
      net: phy: allow EEE with any interface mode · 32d75141
      Russell King committed
      EEE is able to work in any PHY interface mode, there is nothing which
      fundamentally restricts it to only a few modes.  For example, EEE works
      in SGMII mode with the Marvell 88E1512.
      
      Rather than just adding SGMII mode to the list, Florian suggests
      removing the list of interface modes entirely:
      
        It actually sounds like we should just kill the check entirely,
        it does not appear that any of the interface mode would not
        fundamentally be able to support EEE, because the "lowest" mode
        we support is MII, and even there it's quite possible to support
        EEE.
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      32d75141
    • R
      net: phy: restart phy autonegotiation after EEE advertisement change · f75abeb8
      Russell King committed
      When the EEE advertisement is changed, we should restart autonegotiation
      to update the link partner with the new EEE settings.  Add this trigger
      but only if the advertisement has changed.
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f75abeb8
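The "restart only if changed" rule from the commit above can be illustrated with a small sketch. Everything here is hypothetical: `phy_sketch` and `set_eee_adv()` are made-up names, and a counter stands in for actually restarting autonegotiation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PHY state: current EEE advertisement and a restart counter. */
struct phy_sketch {
    uint16_t eee_adv;
    int      aneg_restarts;
};

/*
 * Write the new advertisement and trigger a renegotiation only when the
 * value actually differs from what is already programmed.
 * Returns true if a restart was triggered.
 */
static bool set_eee_adv(struct phy_sketch *phy, uint16_t new_adv)
{
    if (phy->eee_adv == new_adv)
        return false;           /* nothing changed: leave the link alone */

    phy->eee_adv = new_adv;
    phy->aneg_restarts++;       /* stands in for restarting autonegotiation */
    return true;
}
```

Skipping the restart for an unchanged value avoids an unnecessary link renegotiation each time the same settings are applied again.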
    • R
      net: phy: avoid setting unsupported EEE advertisements · 83ea067f
      Russell King committed
      We currently allow userspace to set any EEE advertisements it desires,
      whether or not the PHY supports them.  For example:
      
       # ethtool --set-eee eth1 advertise 0xffffffff
       # ethtool --show-eee eth1
       EEE Settings for eth1:
              EEE status: disabled
              Tx LPI: disabled
              Supported EEE link modes:  100baseT/Full
                                         1000baseT/Full
                                         10000baseT/Full
              Advertised EEE link modes:  100baseT/Full
                                          1000baseT/Full
                                          1000baseKX/Full
                                          10000baseT/Full
                                          10000baseKX4/Full
                                          10000baseKR/Full
      
      Clearly, this is not sane, we should only allow link modes that are
      supported to be advertised (as we do elsewhere.)  Ensure that we mask
      the MDIO_AN_EEE_ADV value with the capabilities retrieved from the
      MDIO_PCS_EEE_ABLE register.
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      83ea067f
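The masking step described in the commit above amounts to a bitwise AND of the requested advertisement with the PHY's capabilities. A minimal sketch, assuming illustrative bit values and the hypothetical helper name `mask_eee_adv()`; the constants below are local definitions for this example, not taken from a kernel header:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative EEE mode bits (assumed values for this sketch only). */
#define EEE_100TX   0x0002u
#define EEE_1000T   0x0004u
#define EEE_10GT    0x0008u
#define EEE_1000KX  0x0010u
#define EEE_10GKX4  0x0020u
#define EEE_10GKR   0x0040u

/*
 * Keep only the requested modes that the PHY reports as supported.
 * 'supported' plays the role of the capability register value and the
 * result is what would be written to the advertisement register.
 */
static uint16_t mask_eee_adv(uint16_t requested, uint16_t supported)
{
    return (uint16_t)(requested & supported);
}
```

With a PHY that supports only the 100baseT, 1000baseT and 10000baseT modes, a request for every mode is reduced to those three, so the KX, KX4 and KR modes from the ethtool output above would no longer be advertised.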
    • D
      Merge branch 'bpf-prog-testing-framework' · eefe06e8
      David S. Miller committed
      Alexei Starovoitov says:
      
      ====================
      bpf: program testing framework
      
      Development and testing of networking bpf programs is quite cumbersome.
      Especially tricky are XDP programs that attach to real netdevices and
      program development feels like working on the car engine while
      the car is in motion.
      Another problem is ongoing changes to upstream llvm core
      that can introduce an optimization that verifier will not
      recognize. llvm bpf backend tests have no ability to run the programs.
      To improve this situation introduce BPF_PROG_TEST_RUN command
      to test and performance benchmark bpf programs.
      It achieves several goals:
      - development of xdp and skb based bpf programs can be done
      in a canned environment with unit tests
      - program performance optimizations can be benchmarked outside of
      networking core (without driver and skb costs)
      - continuous testing of upstream changes is finally practical
      
      Patches 4,5,6 add C based test cases of various complexity
      to cover some sched_cls and xdp features. More tests will
      be added in the future. The tests were run on centos7 only.
      
      For now the framework supports only skb and xdp programs. In the future
      it can be extended to socket_filter and tracing program types.
      
      More details are in individual patches.
      
      v1->v2:
      - rename bpf_program_test_run->bpf_prog_test_run
      - add missing #include <linux/bpf.h> since libbpf.h shouldn't depend
      on prior includes
      - reordered patches 3 and 4 to keep bisect clean
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eefe06e8
    • A
      selftests/bpf: add l4 load balancer test based on sched_cls · 37821613
      Alexei Starovoitov committed
      this l4lb demo is a comprehensive test case for LLVM codegen and
      kernel verifier. It's using fully inlined jhash(), complex packet
      parsing and multiple map lookups of different types to stress
      llvm and verifier.
      The map sizes, map population and test vectors are artificial to
      exercise different paths through the bpf program.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      37821613