1. 21 Dec, 2017 11 commits
  2. 20 Dec, 2017 26 commits
    • net/mlx5: Stay in polling mode when command EQ destroy fails · a2fba188
      Moshe Shemesh committed
      During unload, mlx5_stop_eqs moves the command interface from events
      mode to polling mode, but if destroying the command interface EQ fails
      we move back to events mode.
      That's wrong: even if we fail to destroy the command interface EQ, we
      do release its irq, so no interrupts will be received.
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Cleanup IRQs in case of unload failure · d6b2785c
      Moshe Shemesh committed
      When mlx5_stop_eqs fails to destroy any of the EQs, it returns with an error.
      In that failure flow the function returns without
      releasing all of the EQs' irqs, and then pci_free_irq_vectors fails.
      Fix by only warning on EQ destroy failure and continuing to release the
      other EQs and their irqs.
      
      It fixes the following kernel trace:
      kernel: kernel BUG at drivers/pci/msi.c:352!
      ...
      ...
      kernel: Call Trace:
      kernel: pci_disable_msix+0xd3/0x100
      kernel: pci_free_irq_vectors+0xe/0x20
      kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
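The "warn and continue" teardown described in this commit can be illustrated with a small userspace sketch. This is not the mlx5 code: `struct eq`, `destroy_eq()`, `stop_eqs()` and `irqs_still_held()` are hypothetical names, and the failure is simulated with a flag. It only shows why returning on the first error would leave later IRQs allocated.

```c
#include <stdio.h>

/* Hypothetical stand-in for an event queue; not the mlx5 structure. */
struct eq {
    int id;
    int destroy_fails; /* simulate a failing destroy command */
    int irq_held;      /* 1 while the IRQ is still allocated */
};

/* Destroy one EQ. The IRQ is released even when the destroy fails. */
static int destroy_eq(struct eq *eq)
{
    eq->irq_held = 0;
    return eq->destroy_fails ? -1 : 0;
}

/* Tear down every EQ; warn on failure instead of returning early,
 * so that no IRQ is left behind for the later vector release. */
static int stop_eqs(struct eq *eqs, int n)
{
    int failures = 0;

    for (int i = 0; i < n; i++) {
        if (destroy_eq(&eqs[i])) {
            fprintf(stderr, "warning: failed to destroy EQ %d\n", eqs[i].id);
            failures++;
        }
    }
    return failures;
}

/* Count the IRQs that are still allocated after teardown. */
static int irqs_still_held(const struct eq *eqs, int n)
{
    int held = 0;

    for (int i = 0; i < n; i++)
        held += eqs[i].irq_held;
    return held;
}
```

With one failing EQ in the middle of the array, `stop_eqs()` reports one failure but every IRQ is still released.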
    • net/mlx5: Fix steering memory leak · 139ed6c6
      Maor Gottlieb committed
      Flow steering priority and namespace are software only objects that
      didn't have the proper destructors and were not freed during steering
      cleanup.
      
      Fix it by adding destructor functions for these objects.
      
      Fixes: bd71b08e ("net/mlx5: Support multiple updates of steering rules in parallel")
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Prevent possible races in VXLAN control flow · 0c1cc8b2
      Gal Pressman committed
      When calling add/remove VXLAN port, a lock must be held in order to
      prevent race scenarios when more than one add/remove happens at the
      same time.
      Fix by holding our state_lock (mutex) as done by all other parts of the
      driver.
      Note that the spinlock protecting the radix-tree is still needed in
      order to synchronize radix-tree access from softirq context.
      
      Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Add refcount to VXLAN structure · 23f4cc2c
      Gal Pressman committed
      A refcount mechanism must be implemented in order to prevent unwanted
      scenarios such as:
      - Open an IPv4 VXLAN interface
      - Open an IPv6 VXLAN interface (different socket)
      - Remove one of the interfaces
      
      With the current implementation, the UDP port is removed from our VXLAN
      database and the offloads are turned off for the other interface, which
      is still active.
      The reference count mechanism only allows a UDP port to be removed once
      all of its consumers are gone.
      
      Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
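The IPv4/IPv6 scenario above can be sketched with a reference-counted port table. This is a userspace illustration only: `struct vxlan_db`, `vxlan_lookup()`, `vxlan_add_port()` and `vxlan_del_port()` are hypothetical names, a fixed array replaces the driver's radix tree, and locking is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 8 /* hypothetical table size for the sketch */

struct vxlan_port {
    uint16_t udp_port;
    unsigned int refcount; /* 0 means the slot is free */
};

struct vxlan_db {
    struct vxlan_port ports[MAX_PORTS];
};

static struct vxlan_port *vxlan_lookup(struct vxlan_db *db, uint16_t udp_port)
{
    for (size_t i = 0; i < MAX_PORTS; i++)
        if (db->ports[i].refcount && db->ports[i].udp_port == udp_port)
            return &db->ports[i];
    return NULL;
}

/* Returns true only when the port is new and offload must be enabled.
 * Also returns false when the table is full. */
static bool vxlan_add_port(struct vxlan_db *db, uint16_t udp_port)
{
    struct vxlan_port *p = vxlan_lookup(db, udp_port);

    if (p) {
        p->refcount++; /* second consumer, e.g. the IPv6 socket */
        return false;
    }
    for (size_t i = 0; i < MAX_PORTS; i++) {
        if (!db->ports[i].refcount) {
            db->ports[i].udp_port = udp_port;
            db->ports[i].refcount = 1;
            return true;
        }
    }
    return false;
}

/* Returns true only when the last consumer is gone and the port
 * was really removed from the table. */
static bool vxlan_del_port(struct vxlan_db *db, uint16_t udp_port)
{
    struct vxlan_port *p = vxlan_lookup(db, udp_port);

    if (!p)
        return false;
    return --p->refcount == 0;
}
```

Adding port 4789 twice and removing it once leaves the entry in place; only the second removal drops it.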
    • net/mlx5e: Fix possible deadlock of VXLAN lock · 63235141
      Gal Pressman committed
      mlx5e_vxlan_lookup_port is called both from mlx5e_add_vxlan_port (user
      context) and mlx5e_features_check (softirq), but the lock acquired does
      not disable bottom halves and might result in a deadlock. Fix it by
      simply replacing spin_lock() with spin_lock_bh().
      While at it, replace all unnecessary spin_lock_irq() calls with spin_lock_bh().
      
      lockdep's WARNING: inconsistent lock state
      [  654.028136] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      [  654.028229] swapper/5/0 [HC0[0]:SC1[9]:HE1:SE0] takes:
      [  654.028321]  (&(&vxlan_db->lock)->rlock){+.?.}, at: [<ffffffffa06e7f0e>] mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
      [  654.028528] {SOFTIRQ-ON-W} state was registered at:
      [  654.028607]   _raw_spin_lock+0x3c/0x70
      [  654.028689]   mlx5e_vxlan_lookup_port+0x1e/0x50 [mlx5_core]
      [  654.028794]   mlx5e_vxlan_add_port+0x2e/0x120 [mlx5_core]
      [  654.028878]   process_one_work+0x1e9/0x640
      [  654.028942]   worker_thread+0x4a/0x3f0
      [  654.029002]   kthread+0x141/0x180
      [  654.029056]   ret_from_fork+0x24/0x30
      [  654.029114] irq event stamp: 579088
      [  654.029174] hardirqs last  enabled at (579088): [<ffffffff818f475a>] ip6_finish_output2+0x49a/0x8c0
      [  654.029309] hardirqs last disabled at (579087): [<ffffffff818f470e>] ip6_finish_output2+0x44e/0x8c0
      [  654.029446] softirqs last  enabled at (579030): [<ffffffff810b3b3d>] irq_enter+0x6d/0x80
      [  654.029567] softirqs last disabled at (579031): [<ffffffff810b3c05>] irq_exit+0xb5/0xc0
      [  654.029684] other info that might help us debug this:
      [  654.029781]  Possible unsafe locking scenario:
      
      [  654.029868]        CPU0
      [  654.029908]        ----
      [  654.029947]   lock(&(&vxlan_db->lock)->rlock);
      [  654.030045]   <Interrupt>
      [  654.030090]     lock(&(&vxlan_db->lock)->rlock);
      [  654.030162]
       *** DEADLOCK ***
      
      Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Fix error flow in CREATE_QP command · dbff26e4
      Moni Shoua committed
      In the error flow, when the DESTROY_QP command should be executed, the
      wrong mailbox was filled with data, not the one that is written to
      hardware. Fix that.
      
      Fixes: 09a7d9ec '{net,IB}/mlx5: QP/XRCD commands via mlx5 ifc'
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: Fix misspelling in the error message and comment · 777ec2b2
      Eugenia Emantayev committed
      Fix the misspelling of the word "syndrome".
      
      Fixes: e126ba97 ("mlx5: Add driver for Mellanox Connect-IB adapters")
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix defaulting RX ring size when not needed · 696a97cf
      Eugenia Emantayev committed
      Fix a bug where turning the CQE compression mechanism on or off
      resets the RX ring size to its default value when that is not
      needed.
      
      Fixes: 2fc4bfb7 ("net/mlx5e: Dynamic RQ type infrastructure")
      Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix features check of IPv6 traffic · 2989ad1e
      Gal Pressman committed
      The assumption that the next header field contains the transport
      protocol is wrong for IPv6 packets with extension headers.
      Instead, we should look at the innermost next header field in the buffer.
      This fixes TSO offload for tunnels over IPv6 with extension headers.
      
      Performance testing: 19.25x improvement, cool!
      Measuring bandwidth of 16 threads TCP traffic over IPv6 GRE tap.
      CPU: Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
      NIC: Mellanox Technologies MT28800 Family [ConnectX-5 Ex]
      TSO: Enabled
      Before: 4,926.24  Mbps
      Now   : 94,827.91 Mbps
      
      Fixes: b3f63c3d ("net/mlx5e: Add netdev support for VXLAN tunneling")
      Signed-off-by: Gal Pressman <galp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
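Finding the innermost next header means walking the extension header chain. The sketch below is a simplified userspace illustration, not the driver change: `ipv6_transport_proto()` is a hypothetical helper, it only knows the Hop-by-Hop, Routing, Fragment and Destination Options headers, and it ignores the special handling that non-first fragments would need.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers for the extension headers handled here. */
enum {
    NH_HOPOPTS  = 0,
    NH_ROUTING  = 43,
    NH_FRAGMENT = 44,
    NH_DSTOPTS  = 60,
};

static int is_ext_hdr(uint8_t nh)
{
    return nh == NH_HOPOPTS || nh == NH_ROUTING ||
           nh == NH_FRAGMENT || nh == NH_DSTOPTS;
}

/*
 * first_nh: Next Header value from the fixed 40-byte IPv6 header.
 * ext/len:  the bytes that follow the fixed header.
 * Returns the transport protocol number, or -1 if the chain is truncated.
 */
static int ipv6_transport_proto(uint8_t first_nh, const uint8_t *ext, size_t len)
{
    uint8_t nh = first_nh;
    size_t off = 0;

    while (is_ext_hdr(nh)) {
        size_t hdr_len;

        if (len - off < 8)
            return -1;
        /* Fragment header is always 8 bytes; the others store their
         * length in 8-byte units, not counting the first 8 bytes. */
        hdr_len = (nh == NH_FRAGMENT) ? 8 : ((size_t)ext[off + 1] + 1) * 8;
        if (hdr_len > len - off)
            return -1;
        nh = ext[off]; /* Next Header field of this extension header */
        off += hdr_len;
    }
    return nh;
}
```

For a packet with Hop-by-Hop followed by Destination Options followed by TCP, the fixed header's next header is 0, but the walk ends at 6 (TCP).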
    • net/mlx5e: Fix ETS BW check · ff089191
      Huy Nguyen committed
      Fix a bug that allows the ETS bandwidth sum to be 0% when a TC of type ETS exists.
      
      Fixes: 08fb1dac ('net/mlx5e: Support DCBNL IEEE ETS')
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
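One way to express the corrected rule is: if any traffic class uses ETS, the ETS bandwidth shares must add up to 100%; if none does, the sum is not checked. The sketch below assumes that reading of the commit message. `ets_bw_valid()`, `enum tsa_type` and `MAX_TCS` are hypothetical names, not the kernel's DCBNL definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_TCS 8 /* number of traffic classes assumed for the sketch */

/* Hypothetical selection-algorithm values, not the kernel constants. */
enum tsa_type { TSA_STRICT, TSA_ETS };

/* A configuration is valid when no TC uses ETS, or when the ETS TCs'
 * bandwidth shares sum to exactly 100. A sum of 0 with ETS TCs present
 * is rejected. */
static bool ets_bw_valid(const enum tsa_type tsa[MAX_TCS],
                         const uint8_t bw[MAX_TCS])
{
    bool have_ets_tc = false;
    unsigned int bw_sum = 0;

    for (int i = 0; i < MAX_TCS; i++) {
        if (tsa[i] == TSA_ETS) {
            have_ets_tc = true;
            bw_sum += bw[i];
        }
    }
    return !have_ets_tc || bw_sum == 100;
}
```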
    • net/mlx5: Fix rate limit packet pacing naming and struct · 37e92a9d
      Eran Ben Elisha committed
      In mlx5_ifc, the struct size was incomplete, so the driver was sending
      garbage after the last defined field. Fix it by adding a reserved field
      to complete the struct size.
      
      In addition, rename all set_rate_limit to set_pp_rate_limit to be
      compliant with the Firmware <-> Driver definition.
      
      Fixes: 7486216b ("{net,IB}/mlx5: mlx5_ifc updates")
      Fixes: 1466cc5b ("net/mlx5: Rate limit tables support")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Revert "mlx5: move affinity hints assignments to generic code" · 231243c8
      Saeed Mahameed committed
      Before the offending commit, mlx5 core did the IRQ affinity itself.
      The new generic code seems to have some drawbacks, one of which is
      that the user can no longer modify the irq affinity after the
      initial affinity values have been assigned.
      
      The issue is still being discussed and a solution in the new generic code
      is required. Until then, we need to revert this patch.
      
      This fixes the following issue:
      echo <new affinity> > /proc/irq/<x>/smp_affinity
      fails with  -EIO
      
      This reverts commit a435393a.
      Note: kept mlx5_get_vector_affinity in include/linux/mlx5/driver.h since
      it is used in mlx5_ib driver.
      
      Fixes: a435393a ("mlx5: move affinity hints assignments to generic code")
      Cc: Sagi Grimberg <sagi@grimberg.me>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Jes Sorensen <jsorensen@fb.com>
      Reported-by: Jes Sorensen <jsorensen@fb.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: FPGA, return -EINVAL if size is zero · bae115a2
      Kamal Heib committed
      Currently, if a size of zero is passed to
      mlx5_fpga_mem_{read|write}_i2c()
      the "err" return value will not be initialized, which triggers gcc
      warnings:
      
      [..]/mlx5/core/fpga/sdk.c:87 mlx5_fpga_mem_read_i2c() error:
      uninitialized symbol 'err'.
      [..]/mlx5/core/fpga/sdk.c:115 mlx5_fpga_mem_write_i2c() error:
      uninitialized symbol 'err'.
      
      Fix that by returning -EINVAL when the size is zero.
      
      Fixes: a9956d35 ('net/mlx5: FPGA, Add SBU infrastructure')
      Signed-off-by: Kamal Heib <kamalh@mellanox.com>
      Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
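The pattern behind this fix is a chunked transfer loop whose result variable is only assigned inside the loop, so a zero-length request never sets it. A minimal userspace sketch, assuming a hypothetical `mem_read_chunked()` with a made-up `MAX_CHUNK` limit and a `memcpy`-based `read_chunk()` in place of the real I2C access:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_CHUNK 16 /* hypothetical per-transaction limit */

/* Stand-in for one bus transaction; always succeeds in this sketch. */
static int read_chunk(const uint8_t *src, uint8_t *dst, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Read `size` bytes in chunks. A zero size is rejected up front, so
 * `err` is always assigned by the time it is returned. */
static int mem_read_chunked(const uint8_t *src, size_t size, uint8_t *dst)
{
    size_t done = 0;
    int err;

    if (!size)
        return -EINVAL;

    do {
        size_t n = size - done < MAX_CHUNK ? size - done : MAX_CHUNK;

        err = read_chunk(src + done, dst + done, n);
        if (err)
            return err;
        done += n;
    } while (done < size);

    return err;
}
```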
    • ipv4: fib: Fix metrics match when deleting a route · d03a4557
      Phil Sutter committed
      The recently added fib_metrics_match() causes a regression for routes
      with both RTAX_FEATURES and RTAX_CC_ALGO if the latter has
      TCP_CONG_NEEDS_ECN flag set:
      
      | # ip link add d0 type dummy
      | # ip link set d0 up
      | # ip route add 172.29.29.0/24 dev d0 features ecn congctl dctcp
      | # ip route del 172.29.29.0/24 dev d0 features ecn congctl dctcp
      | RTNETLINK answers: No such process
      
      During route insertion, fib_convert_metrics() detects that the given CC
      algo requires ECN and hence sets DST_FEATURE_ECN_CA bit in
      RTAX_FEATURES.
      
      During route deletion though, fib_metrics_match() compares stored
      RTAX_FEATURES value with that from userspace (which obviously has no
      knowledge about DST_FEATURE_ECN_CA) and fails.
      
      Fixes: 5f9ae3d9 ("ipv4: do metrics match when looking up and deleting a route")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: David S. Miller <davem@davemloft.net>
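The mismatch described above comes from comparing a stored value that carries a kernel-internal bit against a userspace value that never has it. A minimal sketch of the idea, assuming the comparison simply ignores that bit: `store_features()` and `features_match()` are hypothetical helpers, and the two flag values are illustrative stand-ins for RTAX_FEATURE_ECN and DST_FEATURE_ECN_CA.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values for the sketch. */
#define FEATURE_ECN    (1u << 0)  /* requested by userspace */
#define FEATURE_ECN_CA (1u << 31) /* set internally, unknown to userspace */

/* On insertion: add the internal bit when the chosen congestion
 * control algorithm requires ECN. */
static uint32_t store_features(uint32_t user_features, bool cc_needs_ecn)
{
    return cc_needs_ecn ? (user_features | FEATURE_ECN_CA) : user_features;
}

/* On deletion: compare without the internal bit, since the value
 * from userspace can never contain it. */
static bool features_match(uint32_t stored, uint32_t user_features)
{
    return (stored & ~FEATURE_ECN_CA) == user_features;
}
```

A plain `stored == user_features` comparison fails for a route added with `features ecn congctl dctcp`, which is the regression shown in the reproducer.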
    • net: stmmac: Fix bad RX timestamp extraction · a1762456
      Fredrik Hallenberg committed
      As noted in dwmac4_wrback_get_rx_timestamp_status, the timestamp is found
      in the context descriptor following the current descriptor. However, the
      current code looks for the context descriptor in the current
      descriptor, which will always fail.
      Signed-off-by: Fredrik Hallenberg <megahallon@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Fix TX timestamp calculation · 200922c9
      Fredrik Hallenberg committed
      When using GMAC4, the value written to PTP_SSIR should be shifted. However,
      the shifted value is also used in subsequent calculations, which results
      in a bad timestamp value.
      Signed-off-by: Fredrik Hallenberg <megahallon@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix list sorting bug in function tipc_group_update_member() · 3db09601
      Jon Maloy committed
      When, during a join operation, or during message transmission, a group
      member needs to be added to the group's 'congested' list, we sort it
      into the list in ascending order, according to its current advertised
      window size. However, we miss the case when the member is already on
      that list. As a result, the member might be at the wrong position in
      the list after its window size has been decremented.
      This in turn may mean that, during broadcast and multicast
      transmissions, we miss the fact that a destination is not yet ready for
      reception, and we end up sending anyway. From this point on, the
      behavior during the remaining session is unpredictable, e.g., with
      underflowing window sizes.
      
      We now correct this bug by unconditionally removing the member from
      the list before (re-)sorting it in.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
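The "remove unconditionally, then sort back in" approach can be shown with a small singly linked list. This is an illustration only: `struct member`, `list_remove()`, `list_insert_sorted()` and `update_member()` are hypothetical names, and the real TIPC code uses the kernel's list primitives rather than this hand-rolled list.

```c
#include <stddef.h>

/* Hypothetical group member, kept on a list sorted by ascending window. */
struct member {
    int window;
    struct member *next;
};

/* Unlink m if it is on the list; a no-op otherwise. */
static void list_remove(struct member **head, struct member *m)
{
    for (struct member **pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == m) {
            *pp = m->next;
            m->next = NULL;
            return;
        }
    }
}

/* Insert m before the first entry with a larger window. */
static void list_insert_sorted(struct member **head, struct member *m)
{
    struct member **pp = head;

    while (*pp && (*pp)->window <= m->window)
        pp = &(*pp)->next;
    m->next = *pp;
    *pp = m;
}

/* Update the window and re-sort: always remove first, so a member that
 * was already listed cannot stay at a stale position. */
static void update_member(struct member **head, struct member *m, int window)
{
    m->window = window;
    list_remove(head, m);
    list_insert_sorted(head, m);
}
```

If a member with the largest window drops to the smallest, it moves from the tail to the head instead of staying where it was first inserted.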
    • ip6_tunnel: get the min mtu properly in ip6_tnl_xmit · c9fefa08
      Xin Long committed
      Now it's using IPV6_MIN_MTU as the min mtu in ip6_tnl_xmit, but
      IPV6_MIN_MTU actually only works when the inner packet is ipv6.
      
      With IPV6_MIN_MTU for ipv4 packets, the new pmtu for inner dst
      couldn't be set less than 1280. It would cause tx_err and the
      packet to be dropped when the outer dst pmtu is close to 1280.
      
      Jianlin found it by running ipv4 traffic with the topo:
      
        (client) gre6 <---> eth1 (route) eth2 <---> gre6 (server)
      
      After changing eth2 mtu to 1300, the performance became very
      low, or the connection was even broken. The issue also affects
      ip4ip6 and ip6ip6 tunnels.
      
      So if the inner packet is ipv4, 576 should be considered as the
      min mtu.
      
      Note that for ip4ip6 and ip6ip6 tunnels, the inner packet can
      only be ipv4 or ipv6, but for gre6 tunnel, it may also be ARP.
      Using 576 as the min mtu for non-ipv6 packets, as this patch does,
      works for all those cases.
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
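The choice of the lower bound can be sketched as a function of the inner protocol. This is a simplified illustration, not the ip6_tnl_xmit code: `tunnel_min_mtu()` and `clamp_pmtu()` are hypothetical helpers, and the 44-byte overhead used in the example is an assumed figure for the sketch.

```c
#include <stdint.h>

/* EtherType values of the inner packet. */
#define INNER_PROTO_IPV4 0x0800
#define INNER_PROTO_ARP  0x0806
#define INNER_PROTO_IPV6 0x86DD

#define MIN_MTU_IPV6  1280 /* floor for inner IPv6 */
#define MIN_MTU_OTHER 576  /* floor used for non-IPv6 inner packets */

/* Pick the lower bound from the inner protocol, not from the outer one. */
static unsigned int tunnel_min_mtu(uint16_t inner_proto)
{
    return inner_proto == INNER_PROTO_IPV6 ? MIN_MTU_IPV6 : MIN_MTU_OTHER;
}

/* Derive the inner path MTU from the outer one and clamp it to the floor. */
static unsigned int clamp_pmtu(unsigned int outer_pmtu, unsigned int overhead,
                               uint16_t inner_proto)
{
    unsigned int floor = tunnel_min_mtu(inner_proto);
    unsigned int mtu = outer_pmtu > overhead ? outer_pmtu - overhead : 0;

    return mtu < floor ? floor : mtu;
}
```

With an outer path MTU of 1300 and 44 bytes of overhead, an inner IPv4 packet gets 1256, while the old 1280 floor would have kept the inner MTU above what the outer path can carry.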
    • ip6_gre: remove the incorrect mtu limit for ipgre tap · 2c52129a
      Xin Long committed
      The same fix as the patch "ip_gre: remove the incorrect mtu limit for
      ipgre tap" is also needed for ip6_gre.
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_gre: remove the incorrect mtu limit for ipgre tap · cfddd4c3
      Xin Long committed
      The ipgre tap driver calls ether_setup(). After commit 61e84623
      ("net: centralize net_device min/max MTU checking"), the range
      of mtu is [min_mtu, max_mtu], which is [68, 1500] by default.
      
      As a result, the dev mtu of the ipgre tap device cannot be greater
      than 1500. This limit is not correct for the ipgre tap device.
      
      Besides, its .change_mtu already does the right check. So this
      patch just sets max_mtu to 0 and leaves the check to its
      .change_mtu.
      
      Fixes: 61e84623 ("net: centralize net_device min/max MTU checking")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: update skb dst pmtu on tx path · a93bf0ff
      Xin Long committed
      Unlike ip tunnels, now vxlan doesn't do any pmtu update for
      upper dst pmtu, even if it doesn't match the lower dst pmtu
      any more.
      
      The problem can be reproduced when reducing the vxlan lower
      dev's pmtu when running netperf. In jianlin's testing, the
      performance went to 1/7 of the previous.
      
      This patch is to update the upper dst pmtu to match the lower
      dst pmtu on tx path so that packets can be sent out even when
      lower dev's pmtu has been changed.
      
      It also works for metadata dst.
      
      Note that this patch doesn't process any pmtu icmp packet.
      But even in the future, support for processing pmtu icmp packets
      in udp tunnels will also need this.
      
      The same thing will be done for geneve in another patch.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: arc_emac: restart stalled EMAC · 78aa0975
      Alexander Kochetkov committed
      Under certain conditions the EMAC stops receiving incoming packets and
      continuously increments the R_MISS register instead of saving data into
      the provided buffer. This commit implements a workaround for that
      situation: when a stall is detected, the EMAC is restarted.
      
      On the device, the stall looks as if the device lost its dynamic IP address.
      ifconfig shows that the interface error counter increments rapidly.
      At the same time, on the DHCP server we can see continuous DHCP requests
      from the device.
      
      In a real network, stalls happen very rarely. To make them frequent, a
      broadcast storm[1] should be simulated. For the simulation it is necessary
      to make the following connections:
          1. connect radxarock to 1st port of switch
          2. connect some PC to 2nd port of switch
          3. connect two other free ports together using standard ethernet cable,
             in order to make a switching loop.
      
      After that, it is necessary to create a broadcast storm. For example,
      running 'ping' on the PC to some IP address triggers an ARP-request storm.
      After some time (~10sec), the EMAC on the rk3188 will stall.
      
      Observed and tested on rk3188 radxarock.
      
      [1] https://en.wikipedia.org/wiki/Broadcast_radiation
      Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: arc_emac: fix arc_emac_rx() error paths · e688822d
      Alexander Kochetkov committed
      arc_emac_rx() has some issues found by code review.
      
      If netdev_alloc_skb_ip_align() or dma_map_single() fails, the
      rx fifo entry is not returned to the EMAC.
      
      If dma_map_single() fails, the previously allocated skb is
      lost to the driver. At the same time, the address of the newly
      allocated skb is not provided to the EMAC.
      Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mediatek: setup proper state for disabled GMAC on the default · 7352e252
      Sean Wang committed
      By default, the current code sets up a fixed, forced 1Gbps link on both
      GMACs. However, a GMAC that is disabled on a given target board should
      always be put into the link-down state. Otherwise, the driver may
      receive unexpected data from the floating hardware connection through
      the unused GMAC, even though the driver already has some protection in
      the RX path to keep such unexpected data from reaching the upper stack.
      Signed-off-by: Sean Wang <sean.wang@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Remove batch neighbour deletion causing FW bug · 8ba6b30e
      Petr Machata committed
      This reverts commit 63dd00fa.
      
      RAUHT DELETE_ALL seems to trigger a bug in FW. It manifests as later
      calls to RAUHT ADD of an IPv6 neighbor failing with a "bad parameter"
      error code.
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Fixes: 63dd00fa ("mlxsw: spectrum_router: Add batch neighbour deletion")
      Reviewed-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 19 Dec, 2017 3 commits