1. 24 Dec, 2016 1 commit
  2. 04 Dec, 2016 2 commits
    • ipv4: fib: Replay events when registering FIB notifier · c3852ef7
      Committed by Ido Schimmel
      Commit b90eb754 ("fib: introduce FIB notification infrastructure")
      introduced a new notification chain to notify listeners (f.e., switchdev
      drivers) about addition and deletion of routes.
      
      However, upon registration to the chain the FIB tables can already be
      populated, which means potential listeners will have an incomplete view
      of the tables.
      
      Solve that by dumping the FIB tables and replaying the events to the
      passed notification block. The dump itself is done using RCU in order
      not to starve consumers that need RTNL to make progress.
      
      The integrity of the dump is ensured by reading the FIB change sequence
      counter before and after the dump under RTNL. This allows us to avoid
      the problematic situation in which the dumping process sends an ENTRY_ADD
      notification following ENTRY_DEL generated by another process holding
      RTNL.
      
      Callers of the registration function may pass a callback that is
      executed in case the dump was inconsistent with current FIB tables.
      
      The number of retries until a consistent dump is achieved is set to a
      fixed number to prevent callers from looping for long periods of time.
      In case the current limit proves to be problematic in the future, it can be
      easily converted to be configurable using a sysctl.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
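The replay-and-verify scheme described in this commit can be modelled in userspace roughly as follows. This is a minimal sketch, not the kernel implementation: `FibTable`, `register_notifier`, `MAX_DUMP_RETRIES` and the `concurrent_writer` hook are hypothetical names introduced for illustration, and neither RCU nor RTNL is modelled. Only the idea of comparing the change sequence counter before and after the dump is shown.

```python
class FibTable:
    """Toy stand-in for the FIB: a set of routes plus a change counter."""

    def __init__(self):
        self.routes = set()
        self.seq = 0  # bumped on every change, like the FIB change sequence counter

    def add(self, route):
        self.routes.add(route)
        self.seq += 1

    def delete(self, route):
        self.routes.discard(route)
        self.seq += 1


MAX_DUMP_RETRIES = 5  # hypothetical fixed retry limit


def register_notifier(fib, listener, on_inconsistent=None, concurrent_writer=None):
    """Replay existing routes to `listener`; retry if the table changed meanwhile.

    `concurrent_writer` is a test hook (an assumption of this sketch) that
    simulates another process modifying the table while the dump is running.
    Returns True once a consistent dump was delivered, False after giving up.
    """
    for _ in range(MAX_DUMP_RETRIES):
        seq_before = fib.seq                    # read the counter before the dump
        for route in sorted(fib.routes):
            listener(("ENTRY_ADD", route))      # replay as if the route was just added
        if concurrent_writer is not None:
            concurrent_writer(fib)              # simulated race with another writer
        if fib.seq == seq_before:               # unchanged counter: dump is consistent
            return True
        if on_inconsistent is not None:
            on_inconsistent()                   # let the caller discard the stale replay
    return False
```

A caller would typically pass an `on_inconsistent` callback that flushes whatever state it built from the stale replay, so that the next attempt starts from a clean slate.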
    • mlxsw: spectrum_router: Implement FIB offload in deferred work · 3057224e
      Committed by Ido Schimmel
      FIB offload is currently done in process context with RTNL held, but
      we're about to dump the FIB tables in RCU critical section, so we can no
      longer sleep.
      
      Instead, defer the operation to process context using deferred work. Make
      sure fib info isn't freed while the work is queued by taking a reference
      on it and releasing it after the operation is done.
      
      Deferring the operation is valid because the upper layers always assume
      the operation was successful. If it's not, then the driver-specific
      abort mechanism is called and all routed traffic is directed to slow
      path.
      
      The work items are submitted to an ordered workqueue to prevent a
      mismatch between the kernel's FIB table and the device's.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
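The pattern described above (take a reference, queue the work, process items strictly in order, drop the reference afterwards) might look like this in a simplified userspace model. All names here (`FibInfo`, `OrderedWorkqueue`, `fib_event`) are hypothetical and only mirror the roles mentioned in the commit message; the real driver uses kernel workqueues.

```python
from collections import deque


class FibInfo:
    """Toy reference-counted object standing in for the kernel's fib info."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1

    def hold(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1


class OrderedWorkqueue:
    """Runs queued work items one at a time, in submission order."""

    def __init__(self):
        self._items = deque()

    def queue(self, func, *args):
        self._items.append((func, args))

    def flush(self):
        while self._items:
            func, args = self._items.popleft()
            func(*args)


def _fib_event_work(device_table, event, prefix, fi):
    """Deferred part: may 'sleep' in the real driver; here it updates a dict."""
    try:
        if event == "ENTRY_ADD":
            device_table[prefix] = fi.name
        elif event == "ENTRY_DEL":
            device_table.pop(prefix, None)
    finally:
        fi.put()  # release the reference taken when the work was queued


def fib_event(wq, device_table, event, prefix, fi):
    """Notifier callback: must not sleep, so it only takes a reference and queues."""
    fi.hold()  # keep fi alive while the work item is pending
    wq.queue(_fib_event_work, device_table, event, prefix, fi)
```

Because the queue preserves submission order, an ENTRY_ADD followed by an ENTRY_DEL for the same prefix leaves the modelled device table without that prefix, matching the order in which the events were generated.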
  3. 17 Nov, 2016 1 commit
  4. 15 Nov, 2016 1 commit
    • mlxsw: spectrum_router: Flush FIB tables during fini · ac571de9
      Committed by Ido Schimmel
      Since commit b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications
      instead of switchdev calls") we reflect to the device the entire FIB
      table and not only FIBs that point to netdevs created by the driver.
      
      During module removal, FIBs that point to netdevs created by the driver
      are removed following the NETDEV_UNREGISTER events. The other FIBs are
      still present in both the driver's cache and the device's table.
      
      Fix this by iterating over all the FIB tables in the device and flushing
      them. There's no need to take locks, as we're the only writer.
      
      Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 14 Nov, 2016 2 commits
  6. 11 Nov, 2016 2 commits
    • mlxsw: spectrum_router: Ignore FIB notification events for non-init namespaces · 0e3715c9
      Committed by Jiri Pirko
      Until now, tables with the same id in multiple network namespaces were
      squashed into a single virtual router. That is not only incorrect, it also
      causes error messages when the RALUE register is used to remove the same
      FIB entry twice, like this one:
      
      mlxsw_spectrum 0000:03:00.0: EMAD reg access failed (tid=facb831c00007b20,reg_id=8013(ralue),type=write,status=7(bad parameter))
      
      Since we don't allow ports to change namespaces (NETIF_F_NETNS_LOCAL),
      and the infrastructure is not yet prepared to handle netnamespaces, just
      ignore FIB notification events for non-init namespaces. That is safe to
      do since we don't need to offload them.
      
      Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Fix handling of neighbour structure · 33b1341c
      Committed by Jiri Pirko
      The __neigh_create() function works differently than assumed.
      It passes "n" as a parameter to ndo_neigh_construct. But this "n" might
      be destroyed right away, before __neigh_create() returns, in case there is
      already another neighbour struct in the hashtable with the same dev and
      primary key. That is not expected by mlxsw_sp_router_neigh_construct(),
      and the stored "n" points to freed memory, eventually leading to a crash.
      
      Fix this by tightly coupling the neighbour struct and the internal
      driver neigh_entry 1:1. That allows narrowing down the key in the
      internal driver hashtable so that lookups are done by "n" only.
      
      Fixes: 6cf3c971 ("mlxsw: spectrum_router: Add private neigh table")
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
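The 1:1 coupling can be illustrated with a small model in which the driver table is keyed by the neighbour object itself rather than by (dev, address). This is a sketch under assumptions: `Neighbour` and `DriverNeighTable` are hypothetical names, and the existence of a destroy hook paired with the construct hook is assumed here rather than taken from the commit message.

```python
class Neighbour:
    """Toy neighbour struct; hashes by identity, so each object is its own key."""

    def __init__(self, dev, addr):
        self.dev = dev
        self.addr = addr


class DriverNeighTable:
    """Driver-private table with exactly one entry per live neighbour object."""

    def __init__(self):
        self._entries = {}

    def construct(self, n):
        # Called when the core constructs neighbour `n`.
        self._entries[n] = {"n": n, "offloaded": False}

    def destroy(self, n):
        # Called when `n` is torn down; removes only the entry tied to `n`.
        self._entries.pop(n, None)

    def lookup(self, n):
        return self._entries.get(n)
```

With this keying, a short-lived duplicate that shares dev and address with an existing neighbour gets its own entry, and destroying the duplicate cannot leave the table holding a pointer to a freed object.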
  7. 29 Oct, 2016 3 commits
  8. 24 Oct, 2016 1 commit
  9. 20 Oct, 2016 2 commits
  10. 03 Oct, 2016 1 commit
    • mlxsw: spectrum_router: avoid potential uninitialized data usage · ab580705
      Committed by Arnd Bergmann
      If fi->fib_nhs is zero, the router interface pointer is uninitialized, as shown by
      this warning:
      
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_router_fib_event':
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1674:21: error: 'r' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1643:23: note: 'r' was declared here
      
      This changes the loop so we handle the case the same way as finding no router
      interface pointer attached to one of the nexthops to ensure we always
      trap here instead of using uninitialized data.
      
      Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
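The general shape of such a fix is to give the result a defined "not found" value before the loop, so that a loop that runs zero times falls into the same path as a loop that finds nothing. The following is only an analogy in Python, with hypothetical names (`pick_action`, `rifs`); it does not reproduce the driver's actual loop.

```python
def pick_action(nexthop_devs, rifs):
    """Return 'trap' unless some nexthop device has a router interface.

    `rifs` maps device names to router interface names (an assumption of
    this sketch). `rif` is initialised before the loop, so an empty
    nexthop list is handled exactly like 'no router interface found'.
    """
    rif = None
    for dev in nexthop_devs:
        rif = rifs.get(dev)
        if rif is not None:
            break
    if rif is None:
        return "trap"
    return "offload via " + rif
```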
  11. 28 Sep, 2016 1 commit
  12. 21 Sep, 2016 2 commits
  13. 20 Sep, 2016 1 commit
    • mlxsw: spectrum: Fix sparse warnings · 1a9234e6
      Committed by Ido Schimmel
      drivers/net/ethernet/mellanox/mlxsw//spectrum.c:251:28: warning: symbol
      'mlxsw_sp_span_entry_find' was not declared. Should it be static?
      drivers/net/ethernet/mellanox/mlxsw//spectrum.c:265:28: warning: symbol
      'mlxsw_sp_span_entry_get' was not declared. Should it be static?
      drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: warning: mixing
      different enum types
      drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
      mlxsw_sp_span_type  versus
      drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56:     int enum
      mlxsw_reg_mpar_i_e
      ...
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: warning:
      mixing different enum types
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
      enum mlxsw_reg_sbxx_dir  versus
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32:     int
      enum devlink_sb_pool_type
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: warning:
      mixing different enum types
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
      enum mlxsw_reg_sbpr_mode  versus
      drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39:     int
      enum devlink_sb_threshold_type
      ...
      drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: warning:
      mixing different enum types
      drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
      enum mlxsw_sp_l3proto  versus
      drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54:     int
      enum mlxsw_reg_ralxx_protocol
      ...
      drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1749:6: warning:
      symbol 'mlxsw_sp_fib_entry_put' was not declared. Should it be static?
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  14. 10 Sep, 2016 1 commit
  15. 02 Sep, 2016 3 commits
  16. 25 Aug, 2016 1 commit
  17. 15 Aug, 2016 1 commit
  18. 15 Jul, 2016 1 commit
  19. 06 Jul, 2016 6 commits
  20. 05 Jul, 2016 4 commits
  21. 03 Jul, 2016 1 commit
  22. 02 Sep, 2014 1 commit
    • tipc: add name distributor resiliency queue · a5325ae5
      Committed by Erik Hugne
      TIPC name table updates are distributed asynchronously in a cluster,
      entailing a risk of certain race conditions. E.g., if two nodes
      simultaneously issue conflicting (overlapping) publications, this may
      not be detected until both publications have reached a third node, in
      which case one of the publications will be silently dropped on that
      node. Hence, we end up with an inconsistent name table.
      
      In most cases this conflict is just a temporary race, e.g., one
      node is issuing a publication under the assumption that a previous,
      conflicting, publication has already been withdrawn by the other node.
      However, because of the (RTT-related) distributed update delay, this
      may not yet hold true on all nodes. The symptom of this failure is a
      syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
      
      In this commit we add a resiliency queue at the receiving end of
      the name table distributor. When insertion of an arriving publication
      fails, we retain it in this queue for a short amount of time, assuming
      that another update will arrive very soon and clear the conflict. If that
      happens, we insert the publication; otherwise we drop it.
      
      The (configurable) retention value defaults to 2000 ms. Knowing from
      experience that the situation described above is extremely rare, there
      is no risk that the queue will accumulate any large number of items.
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
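The retention behaviour can be sketched with a toy name table. This is a simplified model, not the TIPC code: names are plain strings instead of {type, lower, upper} ranges, a conflict is "same name, different node", and `NameTable`, `publish`, `withdraw` and `process_backlog` are hypothetical names. Only the 2000 ms default comes from the commit message.

```python
RETENTION_MS = 2000  # default retention value named in the commit message


class NameTable:
    def __init__(self, retention_ms=RETENTION_MS):
        self.retention_ms = retention_ms
        self.owners = {}    # name -> publishing node
        self.backlog = []   # deferred publications: (expiry_ms, name, node)
        self.dropped = []   # publications given up on after the retention time

    def _insert(self, name, node):
        """Insert unless another node already owns the name."""
        if self.owners.get(name, node) != node:
            return False
        self.owners[name] = node
        return True

    def publish(self, name, node, now_ms):
        if not self._insert(name, node):
            # Conflict: retain the publication for a while instead of dropping it.
            self.backlog.append((now_ms + self.retention_ms, name, node))
        self.process_backlog(now_ms)

    def withdraw(self, name, node, now_ms):
        if self.owners.get(name) == node:
            del self.owners[name]
        self.process_backlog(now_ms)  # a withdrawal may clear a pending conflict

    def process_backlog(self, now_ms):
        still_waiting = []
        for expiry, name, node in self.backlog:
            if now_ms >= expiry:
                self.dropped.append((name, node))            # retention expired
            elif not self._insert(name, node):
                still_waiting.append((expiry, name, node))   # still conflicting
        self.backlog = still_waiting
```

In this model, a publication from node B that arrives before node A's withdrawal is parked in the backlog and inserted as soon as the withdrawal is processed; if no withdrawal arrives within the retention time, it is dropped.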
  23. 18 Jun, 2013 1 commit
    • tipc: change socket buffer overflow control to respect sk_rcvbuf · cc79dd1b
      Committed by Ying Xue
      As per feedback from the netdev community, we change the buffer
      overflow protection algorithm in receiving sockets so that it
      always respects the nominal upper limit set in sk_rcvbuf.
      
      Instead of scaling up from a small sk_rcvbuf value, which leads to
      violation of the configured sk_rcvbuf limit, we now calculate the
      weighted per-message limit by scaling down from a much bigger value,
      still in the same field, according to the importance priority of the
      received message.
      
      To allow for administrative tunability of the socket receive buffer
      size, we create a tipc_rmem sysctl variable to allow the user to
      configure an even bigger value via the sysctl command. It is an array of
      three values (min/default/max), to be consistent with things like tcp_rmem.
      
      By default, the value initialized in tipc_rmem[1] is equal to the
      receive socket size needed by a TIPC_CRITICAL_IMPORTANCE message.
      This value is also set as the default value of sk_rcvbuf.
      Originally-by: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      [Ying: added sysctl variation to Jon's original patch]
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      [PG: don't compile sysctl.c if not config'd; add Documentation]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
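One way to "scale down from a bigger value" so that no per-message limit exceeds sk_rcvbuf is a shift-based weighting by importance level. The sketch below assumes four importance levels numbered 0 to 3 and power-of-two weights; the function name `rcvbuf_limit` and the sample buffer size are illustrative, and the exact weighting used by the patch may differ.

```python
# Importance levels, lowest to highest (numbering is an assumption of this sketch).
TIPC_LOW_IMPORTANCE = 0
TIPC_MEDIUM_IMPORTANCE = 1
TIPC_HIGH_IMPORTANCE = 2
TIPC_CRITICAL_IMPORTANCE = 3


def rcvbuf_limit(sk_rcvbuf, importance):
    """Per-message receive limit, scaled down from sk_rcvbuf.

    Critical messages may use (up to) the full sk_rcvbuf; each lower
    importance level gets half of the next higher one, so the result
    never exceeds the configured sk_rcvbuf.
    """
    return (sk_rcvbuf >> TIPC_CRITICAL_IMPORTANCE) << importance
```

For a hypothetical sk_rcvbuf of 1 MiB this yields 128 KiB, 256 KiB, 512 KiB and 1 MiB for low, medium, high and critical messages respectively.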