1. 10 2月, 2017 30 次提交
  2. 09 2月, 2017 10 次提交
    • I
      mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops · df6dd79b
      Ido Schimmel 提交于
      The kernel resolves the nexthops for a given route using
      FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a
      route with one of its nexthops being LINKDOWN.
      
      In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then
      we shouldn't reflect the nexthop to the device's table.
      
      Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD
      and reflect it to the device.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df6dd79b
    • D
      Merge branch 'mlxsw-Reflect-nexthop-status-changes' · d9e1661d
      David S. Miller 提交于
      Jiri Pirko says:
      
      ====================
      mlxsw: Reflect nexthop status changes
      
      Ido says:
      
      When the kernel forwards IPv4 packets via multipath routes it doesn't
      consider nexthops that are dead or linkdown. For example, if the nexthop
      netdev is administratively down or doesn't have a carrier.
      
      Devices capable of offloading such multipath routes need to be made
      aware of changes in the reflected nexthops' status. Otherwise, the
      device might forward packets via non-functional nexthops, resulting in
      packet loss. This patchset aims to fix that.
      
      The first 11 patches deal with the necessary restructuring in the
      mlxsw driver, so that it's able to correctly add and remove nexthops
      from the device's adjacency table.
      
      The 12th patch adds the NH_{ADD,DEL} events to the FIB notification
      chain. These notifications are sent whenever the kernel decides to add
      or remove a nexthop from the forwarding plane.
      
      Finally, the last three patches add support for these events in the
      mlxsw driver, which is currently the only driver capable of offloading
      multipath routes.
      ====================
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d9e1661d
    • I
      mlxsw: spectrum_router: Flush resources when RIF is deleted · 9665b745
      Ido Schimmel 提交于
      When the last IP address is removed from a netdev, its RIF is deleted.
      However, if user didn't first remove neighbours and nexthops using this
      interface, then they would still be present in the device's tables.
      
      Therefore, whenever a RIF is deleted, make sure all the neighbours and
      nexthops (adjacency entries) using it are removed from the relevant
      tables as well.
      
      The action associated with any route using this RIF would be refreshed,
      most likely to trap. If the kernel decides to remove the route (f.e.,
      because all the nexthops are now DEAD), then an event would be sent,
      causing the route to be removed from the device.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9665b745
    • I
      mlxsw: spectrum_router: Reflect nexthop status changes · ad178c8e
      Ido Schimmel 提交于
      When a packet hits a multipath route in the device's routing table, a
      hash is computed over its headers, which is then used to select the
      appropriate nexthop from the device's adjacency table.
      
      There are situations in which the kernel removes a nexthop from a
      multipath route (e.g., no carrier) and the device should do the same.
      
      Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from
      the device's adjacency table and refresh all the routes using the
      nexthop group. If all the nexthops of a multipath route are invalid,
      then any packet hitting the route would be trapped to the CPU for
      forwarding.
      
      If all the nexthops are DEAD, then the kernel would remove the route
      entirely. On the other hand, if all the nexthops are merely LINKDOWN,
      then the kernel would keep the route and forward any incoming packet
      using a different route.
      
      While the last case might sound like a problem, it's expected that a
      routing daemon running in user space would remove such a route from the
      FIB as it's dumped with the DEAD flag set.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ad178c8e
    • I
      ipv4: fib: Notify about nexthop status changes · 982acb97
      Ido Schimmel 提交于
      When a multipath route is hit the kernel doesn't consider nexthops that
      are DEAD or LINKDOWN when IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN is set.
      Devices that offload multipath routes need to be made aware of nexthop
      status changes. Otherwise, the device will keep forwarding packets to
      non-functional nexthops.
      
      Add the FIB_EVENT_NH_{ADD,DEL} events to the fib notification chain,
      which notify capable devices when they should add or delete a nexthop
      from their tables.
      
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NAndy Gospodarek <gospo@broadcom.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      982acb97
    • I
      mlxsw: spectrum_router: Use trap action only for some route types · 70ad3506
      Ido Schimmel 提交于
      The device can have one of three actions associated with a route:
      
      1) Remote - packets continue to the adjacency table
      2) Local - packets continue to the neighbour table
      3) Trap - packets continue to the CPU
      
      The first two actions can also trap packets to the CPU, but they do so
      using a different trap ID, which has a lower traffic class and less
      allotted bandwidth.
      
      We currently use the third action for both RTN_{LOCAL,BROADCAST} routes
      and RTN_UNICAST routes not pointing to the switch ports.
      
      However, packets that merely need to be forwarded by the switch are
      likely not control packets and can be therefore scheduled towards the
      CPU using a lower traffic class.
      
      Achieve the above by assigning the third action only to local and
      broadcast routes and have any other route use either of the first two
      actions, based on whether the route is gatewayed or not.
      
      This will also allow us to refresh routes using the local action and
      have them trap packets when their RIF is no longer valid following a
      NH_DEL event.
      
      One side effect of this patch is that we no longer give special
      treatment to multipath routes using both switch and non-switch ports
      towards their nexthops. If at least one of the nexthops can be resolved,
      then the device will forward the packets instead of trapping them.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      70ad3506
    • I
      mlxsw: spectrum_router: Determine offload status using generic function · 4b411477
      Ido Schimmel 提交于
      The previous patch introduced a generic function to determine whether a
      route should be offloaded or not. Make use of it here.
      
      In the future we're going to add more conditions to this test (e.g.,
      whether TOS is non-zero), so it makes sense to centralize it instead of
      open coding it in a few places.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4b411477
    • I
      mlxsw: spectrum_router: More accurately set offload flag · 013b20f9
      Ido Schimmel 提交于
      We currently set the RTNH_F_OFFLOAD flag for all routes using remote
      action, but this isn't always correct. If none of the nexthops
      associated with a gatewayed route can be offloaded into the device, then
      any packet hitting it would be trapped to the CPU and forwarded by the
      kernel.
      
      Solve this by pushing the setting of the offload flag to after the route
      was programmed into the device, thereby allowing us to take all the
      parameters into account.
      
      This change will also help us further in the patchset, when we refresh
      routes following the reception of NH_{ADD,DEL} events.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      013b20f9
    • I
      mlxsw: spectrum_router: Refactor nexthop init routine · a8c97014
      Ido Schimmel 提交于
      The nexthop init and de-init functions both have symmetric parts
      concerned with the reflection of the neighbour entry into the device's
      adjacency table, in case it's used by a gatewayed route.
      
      These sections of code also need to be called when a nexthop is marked
      as valid / invalid following NH_{ADD,DEL} events. Break these out into
      appropriate functions, so that they could be invoked following the
      reception of above events.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a8c97014
    • I
      mlxsw: spectrum_router: Remove FIB info from FIB entry struct · c8b03077
      Ido Schimmel 提交于
      After the previous changes, the FIB info is embedded in every nexthop
      group struct, which in turn is embedded in every FIB entry struct.
      
      We can therefore safely remove the FIB info from the entry struct. This
      has the added advantage of making the router-related structs more
      generic and suitable for use with IPv6 offloads.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c8b03077