1. 29 Aug, 2016 5 commits
    • net/mlx5e: Fix ethtool -g/G rx ring parameter report with striding RQ · cc8e9ebf
      Eran Ben Elisha committed
      The driver RQ has two possible configurations: striding RQ and
      non-striding RQ.  Until this patch, the driver always reported the
      number of hardware WQEs (ring descriptors). For the non-striding RQ
      configuration this was OK, since there is one WQE per pending packet.
      For striding RQ, multiple packets can fit into one WQE. For a better
      user experience, in the striding RQ case we normalize the rx_pending
      parameter to the average ring size in packets, based on the number of
      packets that fit in one WQE (size of WQE / MTU).
      
      Fixes: 461017cb ('net/mlx5e: Support RX multi-packet WQE ...')
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
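
The normalization described in the commit above can be illustrated with a little arithmetic. This is a minimal sketch, not the driver's code: the function names, the stride size, the number of strides per WQE, and the assumption that each packet occupies a whole number of strides are all hypothetical values chosen for illustration.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def wqes_to_packets(num_wqes, striding, stride_size=64, num_strides=2048, mtu=1500):
    """Convert a hardware WQE count into an approximate packet count.

    stride_size, num_strides and mtu are illustrative assumptions,
    not values taken from the commit.
    """
    if not striding:
        # Non-striding RQ: one WQE per pending packet.
        return num_wqes
    wqe_size = stride_size * num_strides
    # Assumption: an MTU-sized packet consumes whole strides.
    packets_per_wqe = wqe_size // align_up(mtu, stride_size)
    return num_wqes * packets_per_wqe


print(wqes_to_packets(1024, striding=False))
print(wqes_to_packets(16, striding=True))
```

With these assumed sizes, a striding ring of 16 WQEs is reported as room for well over a thousand packets, which is closer to what a user expects from the rx ring parameter than the raw WQE count.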
    • net/mlx5e: Don't wait for SQ completions on close · 6e8dd6d6
      Saeed Mahameed committed
      Instead of asking the firmware to flush the SQ (Send Queue) via
      asynchronous completions when it is moved to the error state, we handle
      the SQ flush manually (mlx5e_free_tx_descs), the same as we did when the
      SQ flush timed out or on tx_timeout.

      This reduces SQ flush time and speeds up the interface down procedure.

      Moved mlx5e_free_tx_descs to the end of en_tx.c for better locality of
      the tx critical code.
      
      Fixes: 29429f33 ('net/mlx5e: Timeout if SQ doesn't flush during close')
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Don't post fragmented MPWQE when RQ is disabled · 8484f9ed
      Saeed Mahameed committed
      The ICO (Internal Control Operations) SQ (Send Queue) is closed/disabled
      after the RQ (Receive Queue).  After the RQ is closed, an ICO SQ
      completion might post a fragmented MPWQE (Multi Packet Work Queue
      Element) into that RQ.

      As with a regular RQ post, check whether we are allowed to post to that
      RQ (RQ is enabled). Clean up the in-progress UMR MPWQE in
      mlx5e_free_rx_descs if needed.
      
      Fixes: bc77b240 ('net/mlx5e: Add fragmented memory support for RX multi packet WQE')
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Don't wait for RQ completions on close · f2fde18c
      Saeed Mahameed committed
      This will significantly reduce receive queue flush time on interface
      down.

      Instead of asking the firmware to flush the RQ (Receive Queue) via
      asynchronous completions when it is moved to the error state, we handle
      the RQ flush manually (mlx5e_free_rx_descs), the same as we did when the
      RQ flush timed out.

      This reduces RQ flush time and speeds up the interface down procedure
      (ifconfig down) from 6 sec to 0.3 sec on a 48-core system.

      Moved mlx5e_free_rx_descs to en_main.c, where it is needed, to keep
      en_rx.c free from non-critical data path code for better code locality.
      
      Fixes: 6cd392a0 ('net/mlx5e: Handle RQ flush in error cases')
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Limit UMR length to the device's limitation · fe4c988b
      Saeed Mahameed committed
      On ConnectX-4, the UMR (User Memory Region) MTT translation table offset
      in a WQE is limited to U16_MAX. Before this patch we ignored that
      limitation and requested the maximum possible UMR translation length that
      the netdev might need (MAX channels * MAX pages per channel).
      On a system with more than 32 cores, when linear WQE allocation fails,
      falling back to UMR WQEs causes the RQ (Receive Queue) to get stuck.

      Here we limit the UMR length to min(U16_MAX, max required pages), while
      taking the required alignments into account. On driver load, U16_MAX is
      sufficient by default, since the default RX rings value guarantees that
      we are in range. Dynamically (on set_ringparam/set_channels) we check
      whether the new required UMR length (num mtts) is still in range and,
      if not, fail the request.
      
      Fixes: bc77b240 ('net/mlx5e: Add fragmented memory support for RX multi packet WQE')
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
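
The clamp and the range check described in the commit above can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the alignment value, and the way the required number of MTTs is computed from channels, WQEs and pages are hypothetical and not taken from the driver.

```python
U16_MAX = 0xFFFF  # device limit on the UMR MTT offset, per the commit message

# Assumption for illustration only: MTT counts are aligned to this many entries.
MTT_ALIGNMENT = 8


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def align_down(value, alignment):
    """Round value down to the previous multiple of alignment."""
    return value // alignment * alignment


def required_mtts(num_channels, wqes_per_channel, pages_per_wqe):
    """Hypothetical estimate of the translation entries a configuration needs."""
    return align_up(num_channels * wqes_per_channel * pages_per_wqe, MTT_ALIGNMENT)


def umr_length(max_required_mtts):
    """Clamp the requested UMR length to the (aligned) device limit."""
    return min(align_down(U16_MAX, MTT_ALIGNMENT), max_required_mtts)


def ring_request_in_range(num_channels, wqes_per_channel, pages_per_wqe):
    """Return True if a new ring/channel configuration still fits the limit."""
    needed = required_mtts(num_channels, wqes_per_channel, pages_per_wqe)
    return needed <= align_down(U16_MAX, MTT_ALIGNMENT)


print(umr_length(required_mtts(32, 16, 32)))
print(ring_request_in_range(64, 32, 32))
```

In this model, a request whose required length exceeds the aligned limit is rejected up front, rather than being accepted and left to stall the receive queue later.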
  2. 27 Aug, 2016 5 commits
  3. 26 Aug, 2016 8 commits
  4. 25 Aug, 2016 2 commits
  5. 24 Aug, 2016 14 commits
  6. 23 Aug, 2016 6 commits
    • net sched: fix encoding to use real length · 28a10c42
      Jamal Hadi Salim committed
      Encoding of the metadata was using the padded length as opposed to
      the real length of the data, which is a bug per the specification.
      This has not been an issue to date because every metadatum specified
      so far has been 32 bits wide, where the aligned and real data lengths
      are the same.
      This also includes a bug fix for validating the length of a u16 field.
      Since there is no metadata of size u16 yet, it is fine to include it
      here.

      While at it, get rid of magic numbers.
      
      Fixes: ef6980b6 ("net sched: introduce IFE action")
      Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
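
The difference between the padded and the real length in the commit above can be seen in a generic TLV encoder. This is a minimal sketch and not the kernel's encoder: the function name, the 4-byte header layout (16-bit type, 16-bit length) and the 4-byte alignment are assumptions made for illustration.

```python
import struct

TLV_HDR_LEN = 4  # assumed header: 16-bit type followed by 16-bit length
TLV_ALIGN = 4    # assumed alignment of each TLV on the wire


def encode_tlv(attr_type, value):
    """Encode one TLV whose length field carries the real (unpadded) length."""
    real_len = TLV_HDR_LEN + len(value)
    padded_len = (real_len + TLV_ALIGN - 1) // TLV_ALIGN * TLV_ALIGN
    header = struct.pack("!HH", attr_type, real_len)
    # The padding bytes occupy space on the wire but are not counted in the length field.
    return (header + value).ljust(padded_len, b"\x00")


u16_tlv = encode_tlv(1, b"\x12\x34")
u32_tlv = encode_tlv(2, b"\x00\x00\x00\x07")
print(u16_tlv.hex(), u32_tlv.hex())
```

For a 32-bit value the real and padded lengths coincide, which is why the mistake stayed invisible; for a 16-bit value the length field reads 6 while the TLV still occupies 8 bytes.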
    • qed: FLR of active VFs might lead to FW assert · 4870e704
      Yuval Mintz committed
      Driver never bothered marking the VF's vport with the VF's sw_fid.
      As a result, FLR flows are not going to clean those vports.
      
      If the vport was active when FLRed, re-activating it would lead
      to a FW assertion.
      
      Fixes: dacd88d6 ("qed: IOV l2 functionality")
      Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ip_finish_output_gso: Allow fragmenting segments of tunneled skbs if their DF is unset · c0451fe1
      Shmulik Ladkani committed
      In b8247f09,
      
         "net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs"
      
      gso skbs arriving from an ingress interface that go through UDP
      tunneling are allowed to be fragmented if the resulting encapsulated
      segments exceed the dst mtu of the egress interface.
      
      This aligned the behavior of gso skbs to non-gso skbs going through udp
      encapsulation path.
      
      However the non-gso vs gso anomaly is present also in the following
      cases of a GRE tunnel:
       - ip_gre in collect_md mode, where TUNNEL_DONT_FRAGMENT is not set
         (e.g. OvS vport-gre with df_default=false)
       - ip_gre in nopmtudisc mode, where IFLA_GRE_IGNORE_DF is set
      
      In both of the above cases, the non-gso skbs get fragmented, whereas the
      gso skbs (having skb_gso_network_seglen that exceeds dst mtu) get dropped,
      as they don't go through the segment+fragment code path.
      
      Fix: Set IPSKB_FRAG_SEGS if the tunnel-specified IP_DF bit is NOT set.

      Tunnels that do set IP_DF will not go through fragmentation of segments.
      This preserves the behavior of ip_gre in (the default) pmtudisc mode.
      
      Fixes: b8247f09 ("net: ip_finish_output_gso: If skb_gso_network_seglen exceeds MTU, allow segmentation for local udp tunneled skbs")
      Reported-by: wenxu <wenxu@ucloud.cn>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
      Tested-by: wenxu <wenxu@ucloud.cn>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
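
The rule introduced by the commit above can be modeled as a small decision function. This is a simplified sketch of the described behavior, not the kernel code path: the function name, its parameters and the returned labels are hypothetical, and what happens to an oversized skb that is not fragmented is deliberately left unspecified.

```python
def gso_tunnel_output_action(seglen, mtu, tunnel_df_set):
    """Model how a tunneled gso skb is handled on output.

    seglen stands in for skb_gso_network_seglen and mtu for the dst mtu
    of the egress interface; tunnel_df_set says whether the tunnel
    specified the IP_DF bit.
    """
    if seglen <= mtu:
        # Segments fit the egress MTU: nothing special to do.
        return "send"
    if not tunnel_df_set:
        # DF unset: take the segment+fragment path, like non-gso skbs.
        return "segment_and_fragment"
    # DF set (e.g. ip_gre in the default pmtudisc mode): no fragmentation.
    return "not_fragmented"


print(gso_tunnel_output_action(1600, 1500, tunnel_df_set=False))
print(gso_tunnel_output_action(1600, 1500, tunnel_df_set=True))
```

The two GRE cases listed in the commit message (collect_md without TUNNEL_DONT_FRAGMENT, and nopmtudisc mode) both correspond to tunnel_df_set being false in this model.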
    • net: ipv6: Remove addresses for failures with strict DAD · 85b51b12
      Mike Manning committed
      If DAD fails with accept_dad set to 2, global addresses and host routes
      are incorrectly left in place. Even though disable_ipv6 is set,
      contrary to documentation, the addresses are not dynamically deleted
      from the interface. It is only on a subsequent link down/up that these
      are removed. The fix is not only to set the disable_ipv6 flag, but
      also to call addrconf_ifdown(), which is the action to carry out when
      disabling IPv6. This results in the addresses and routes being deleted
      immediately. The DAD failure for the LL addr is determined as before
      via netlink, or by the absence of the LL addr (which also previously
      would have had to be checked for in case of an intervening link down
      and up). As the call to addrconf_ifdown() requires an rtnl lock, the
      logic to disable IPv6 when DAD fails is moved to addrconf_dad_work().
      
      Previous behavior:
      
      root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
      net.ipv6.conf.eth3.accept_dad = 2
      root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
          inet6 2000::10/64 scope global
             valid_lft forever preferred_lft forever
          inet6 fe80::5054:ff:fe43:dd5a/64 scope link tentative dadfailed
             valid_lft forever preferred_lft forever
      root@vm1:/# ip -6 route show dev eth3
      2000::/64  proto kernel  metric 256
      fe80::/64  proto kernel  metric 256
      root@vm1:/# ip link set down eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      root@vm1:/# ip -6 route show dev eth3
      root@vm1:/#
      
      New behavior:
      
      root@vm1:/# sysctl net.ipv6.conf.eth3.accept_dad=2
      net.ipv6.conf.eth3.accept_dad = 2
      root@vm1:/# ip -6 addr add 2000::10/64 dev eth3
      root@vm1:/# ip link set up eth3
      root@vm1:/# ip -6 addr show dev eth3
      root@vm1:/# ip -6 route show dev eth3
      root@vm1:/#
      Signed-off-by: Mike Manning <mmanning@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • include/uapi/linux/ipx.h: fix conflicting defitions with glibc netipx/ipx.h · 53dc65d4
      Mikko Rapeli committed
      Fixes these compiler errors, via libc-compat.h, when glibc netipx/ipx.h
      is included before linux/ipx.h:
      
      ./linux/ipx.h:9:8: error: redefinition of ‘struct sockaddr_ipx’
      ./linux/ipx.h:26:8: error: redefinition of ‘struct ipx_route_definition’
      ./linux/ipx.h:32:8: error: redefinition of ‘struct ipx_interface_definition’
      ./linux/ipx.h:49:8: error: redefinition of ‘struct ipx_config_data’
      ./linux/ipx.h:58:8: error: redefinition of ‘struct ipx_route_def’
      Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • include/uapi/linux/openvswitch.h: use __u32 from linux/types.h · a1d1f65f
      Mikko Rapeli committed
      Kernel uapi headers are supposed to use the types from linux/types.h,
      such as __u32. Fixes this userspace compile error:
      
      linux/openvswitch.h:583:2: error: unknown type name ‘uint32_t’
      Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>