- 24 12月, 2016 2 次提交
-
-
由 Ido Schimmel 提交于
At the end of the nexthop initialization process we determine whether the nexthop should be offloaded or not based on the NUD state of the neighbour representing it. After all the nexthops were initialized we refresh the nexthop group and potentially offload it to the device, in case some of the nexthops were resolved. Make the destruction of a nexthop group symmetric with its creation by marking all nexthops as invalid and then refresh the nexthop group to make sure it was removed from the device's tables. Fixes: b2157149 ("mlxsw: spectrum_router: Add the nexthop neigh activity update") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
When a neighbour is considered to be dead, we should remove it from the device's table regardless of its NUD state. Without this patch, after setting a port to be administratively down we get the following errors when we periodically try to update the kernel about neighbours activity: [ 461.947268] mlxsw_spectrum 0000:03:00.0 sw1p3: Failed to find matching neighbour for IP=192.168.100.2 Fixes: a6bf9e93 ("mlxsw: spectrum_router: Offload neighbours based on NUD state change") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 12月, 2016 2 次提交
-
-
由 Ido Schimmel 提交于
Commit b90eb754 ("fib: introduce FIB notification infrastructure") introduced a new notification chain to notify listeners (f.e., switchdev drivers) about addition and deletion of routes. However, upon registration to the chain the FIB tables can already be populated, which means potential listeners will have an incomplete view of the tables. Solve that by dumping the FIB tables and replaying the events to the passed notification block. The dump itself is done using RCU in order not to starve consumers that need RTNL to make progress. The integrity of the dump is ensured by reading the FIB change sequence counter before and after the dump under RTNL. This allows us to avoid the problematic situation in which the dumping process sends a ENTRY_ADD notification following ENTRY_DEL generated by another process holding RTNL. Callers of the registration function may pass a callback that is executed in case the dump was inconsistent with current FIB tables. The number of retries until a consistent dump is achieved is set to a fixed number to prevent callers from looping for long periods of time. In case current limit proves to be problematic in the future, it can be easily converted to be configurable using a sysctl. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
FIB offload is currently done in process context with RTNL held, but we're about to dump the FIB tables in RCU critical section, so we can no longer sleep. Instead, defer the operation to process context using deferred work. Make sure fib info isn't freed while the work is queued by taking a reference on it and releasing it after the operation is done. Deferring the operation is valid because the upper layers always assume the operation was successful. If it's not, then the driver-specific abort mechanism is called and all routed traffic is directed to slow path. The work items are submitted to an ordered workqueue to prevent a mismatch between the kernel's FIB table and the device's. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 11月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
The recent merge commit bb598c1b ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") would cause the FIB abort warning to fire whenever we flush the FIB tables - either during module removal or actual abort. Move it back to its rightful location in the FIB abort function. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 11月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
Since commit b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") we reflect to the device the entire FIB table and not only FIBs that point to netdevs created by the driver. During module removal, FIBs of the second type are removed following NETDEV_UNREGISTER events sent. The other FIBs are still present in both the driver's cache and the device's table. Fix this by iterating over all the FIB tables in the device and flush them. There's no need to take locks, as we're the only writer. Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 11月, 2016 2 次提交
-
-
由 Jiri Pirko 提交于
Add a warning that the abort mechanism was triggered for device. Also avoid going through the procedure if abort was already done. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
The device's neighbour table is periodically dumped in order to update the kernel about active neighbours. A single dump session may span multiple queries, until the response carries less records than requested or when a record (can contain up to four neighbour entries) is not full. Current code stops the session when the number of returned records is zero, which can result in infinite loop in case of high packet rate. Fix this by stopping the session according to the above logic. Fixes: c723c735 ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 11月, 2016 2 次提交
-
-
由 Jiri Pirko 提交于
Since now, the table with same id in multiple netnamespaces were squashed to a single virtual router. That is not only incorrect, it also causes error messages when trying to use RALUE register to do double remove of FIB entries, like this one: mlxsw_spectrum 0000:03:00.0: EMAD reg access failed (tid=facb831c00007b20,reg_id=8013(ralue),type=write,status=7(bad parameter)) Since we don't allow ports to change namespaces (NETIF_F_NETNS_LOCAL), and the infrastructure is not yet prepared to handle netnamespaces, just ignore FIB notification events for non-init namespaces. That is clear to do since we don't need to offload them. Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
__neigh_create function works in a different way than assumed. It passes "n" as a parameter to ndo_neigh_construct. But this "n" might be destroyed right away before __neigh_create() returns in case there is already another neighbour struct in the hashtable with the same dev and primary key. That is not expected by mlxsw_sp_router_neigh_construct() and the stored "n" points to freed memory, eventually leading to crash. Fix this by doing tight 1:1 coupling between neighbour struct and internal driver neigh_entry. That allows to narrow down the key in internal driver hashtable to do lookups by "n" only. Fixes: 6cf3c971 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 10月, 2016 3 次提交
-
-
由 Ido Schimmel 提交于
Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Only trees which are in use should be compared to requested prefix usage. Fixes: 53342023 ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Currently, the prefix bitlist is not saved for LPM trees, causing the compare to always fail which causes the tree to be destroyed and created for every inserted and removed FIB entry. So fix this by saving the bitlist as it should have been done from the very beginning. Fixes: 53342023 ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 10月, 2016 1 次提交
-
-
由 Jiri Pirko 提交于
Since the number of resources is going to get much bigger, ease up the addition by simly defining IDs. Convert the existing structure members to a set array, one for validity, one for values. Introduce a set of getters and setters for easy access. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 10月, 2016 2 次提交
-
-
由 Jiri Pirko 提交于
The function return value is not checked anywhere. Also, the warning causes huge slowdown when removing large number of FIB entries which were not offloaded, because of ordering issue. Ido's preparing a patchset to fix the ordering issue, but that is definitelly not net tree material. Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
By a mistake, there is tree index 0 passed to RALTB. Should be MLXSW_SP_LPM_TREE_MIN. Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Reported-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2016 1 次提交
-
-
由 Arnd Bergmann 提交于
If fi->fib_nhs is zero, the router interface pointer is uninitialized, as shown by this warning: drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c: In function 'mlxsw_sp_router_fib_event': drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1674:21: error: 'r' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1643:23: note: 'r' was declared here This changes the loop so we handle the case the same way as finding no router interface pointer attached to one of the nexthops to ensure we always trap here instead of using uninitialized data. Fixes: b45f64d1 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 9月, 2016 1 次提交
-
-
由 Jiri Pirko 提交于
Until now, in order to offload a FIB entry to HW we use switchdev op. However that has limits. Mainly in case we need to make the HW aware of all route prefixes configured in kernel. HW needs to know those in order to properly trap appropriate packets and pass the to kernel to do the forwarding. Abort mechanism is now handled within the mlxsw driver. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 9月, 2016 2 次提交
-
-
由 Nogah Frankel 提交于
Replace max rif const with using the result from resource query. Signed-off-by: NNogah Frankel <nogahf@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nogah Frankel 提交于
Replace max virtual routers const with the result from the resource query. Signed-off-by: NNogah Frankel <nogahf@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 9月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
drivers/net/ethernet/mellanox/mlxsw//spectrum.c:251:28: warning: symbol 'mlxsw_sp_span_entry_find' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlxsw//spectrum.c:265:28: warning: symbol 'mlxsw_sp_span_entry_get' was not declared. Should it be static? drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: int enum mlxsw_sp_span_type versus drivers/net/ethernet/mellanox/mlxsw//spectrum.c:367:56: int enum mlxsw_reg_mpar_i_e ... drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: int enum mlxsw_reg_sbxx_dir versus drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:598:32: int enum devlink_sb_pool_type drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: int enum mlxsw_reg_sbpr_mode versus drivers/net/ethernet/mellanox/mlxsw//spectrum_buffers.c:600:39: int enum devlink_sb_threshold_type ... drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: warning: mixing different enum types drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: int enum mlxsw_sp_l3proto versus drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:255:54: int enum mlxsw_reg_ralxx_protocol ... drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1749:6: warning: symbol 'mlxsw_sp_fib_entry_put' was not declared. Should it be static? Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 9月, 2016 1 次提交
-
-
由 Jiri Pirko 提交于
When neigh_init fails, we have to do proper cleanup including router_fini call. Fixes: 6cf3c971 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 9月, 2016 3 次提交
-
-
由 Jiri Pirko 提交于
Currently the notifier is registered for every asic instance, however the same block. Fix this by moving the registration to module init. Fixes: c723c735 ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Originally, I expected that there would be needed to call update operation in case RALUE record action is changed. However, that is not needed since write operation takes care of that nicely. Remove prepared construct and always call the write operation. Fixes: 61c503f9 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
In mlxsw we squash tables 254 and 255 together into HW. Kernel adds/dels /32 ip to/from both 254 and 255. On del path, that causes the same prefix being removed twice. Fix this by introducing reference counting for private mlxsw fib entries. That required a bit of code reshuffle. Also put dev into fib entry key so the same prefix could be represented once per every router interface. Fixes: 61c503f9 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2016 1 次提交
-
-
由 Yotam Gigi 提交于
Make the function mlxsw_router_neigh_construct search the rif according to the neighbour dev other than the dev that was passed to the ndo, thus allowing creating neigbhours upon stacked devices. Fixes: 6cf3c971 ("mlxsw: spectrum_router: Add private neigh table") Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 8月, 2016 1 次提交
-
-
由 Vincent 提交于
In mlxsw_sp_router_fib4_add_info_destroy(), the fib_entry pointer is used after it has been freed by mlxsw_sp_fib_entry_destroy(). Use a temporary variable to fix this. Fixes: 61c503f9 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: NVincent Stehlé <vincent.stehle@laposte.net> Cc: Jiri Pirko <jiri@mellanox.com> Acked-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 7月, 2016 1 次提交
-
-
由 Christophe Jaillet 提交于
'vr' should be a valid pointer here, so returning 'PTR_ERR(vr)' is wrong. Return an explicit error code (-ENOENT) instead. Fixes: 61c503f9 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops") Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 7月, 2016 6 次提交
-
-
由 Yotam Gigi 提交于
Now, the driver sends arp probes for all unresolved neighbours that are currently a nexthop for some route on the system. The job is set periodically every 5 seconds. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
For nexthop neighbours we need to make kernel to think there is a traffic flowing to them preventing it from going to stale state. Otherwise kernel would stale it and eventually the neigh would be removed from HW and nexthop as well. That would reduce ECMP group in HW. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Implement next-hop routing offload including ECMP. To make it possible, introduce next-hop group entity. This entity keeps track of resolved neighbours and updates HW adjacency table accordingly. Note that HW next-hops are stored in this adjacency table, in form of MAC. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
Listen to any NEIGH_UPDATE events sent and program the device accordingly. If NUD state is VALID and neighbour isn't yet offloaded, then program it into the device's table. Otherwise, just edit its parameters. If NUD state machine transitioned neighbour out of VALID state and it's present in the device's table, then remove it. Note that the device is programmed in delayed work, as the netevent notification chain is atomic and prevents us from going to sleep. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yotam Gigi 提交于
As previously explained, the driver should periodically poll the device for neighbours activity according to the configured DELAY_PROBE_TIME. This will prevent active neighbours from staying in STALE state for long periods of time. During init configure the polling interval according to the DELAY_PROBE_TIME used in the default table. In addition, register a netevent notification block, so that the interval is updated whenever DELAY_PROBE_TIME changes. Using the computed interval schedule a delayed work, which will update the kernel via neigh_event_send() on any active neighbour since the last delayed work. Signed-off-by: NYotam Gigi <yotamg@mellanox.com> Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
We need to hold some private data for every neigh entry. It would be possible to do it using neigh_priv_len/ndo_neigh_construct/ ndo_neigh_destroy however only for the port device itself. That would not work for stacked devices like bridge/team/bond. So introduce a private neigh table. Hook onto ndos neigh_construct/destroy and add/remove table entry according to that. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 7月, 2016 4 次提交
-
-
由 Jiri Pirko 提交于
Implement ipv4 FIB entries addition and removal. Initially, we support local and broadcast routes using "ip2me" trap action. Also, unicast routes without nexthop are supported using "local" action. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Virtual router is a construct used inside HW. In this implementation we map kernel tables to virtual routers one to one. Introduce management logic to create virtual routers when needed and destroy in case they are no longer in use. According to that, call into LPM tree management. Each virtual router is always bound to one LPM tree. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Introduce basic LPM tree management allowing to share the trees in between tables if the used prefixes in the tables are the same. Build the tree structure according to the used prefixes. Although it is not optimal for many use cases, this initial implementation does only simple linear left-tree. More advanced structures will be introduced later on, possibly including mechanisms to change trees on the fly. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Pirko 提交于
Shadow FIB is needed in order to hold additional information for FIB entries and keep track of used prefixes. That is needed for the LPM tree construction to be introduced later on in this set. Signed-off-by: NJiri Pirko <jiri@mellanox.com> Reviewed-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 7月, 2016 1 次提交
-
-
由 Ido Schimmel 提交于
Create a skeleton router file and do basic HW initialization of router. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 9月, 2014 1 次提交
-
-
由 Erik Hugne 提交于
TIPC name table updates are distributed asynchronously in a cluster, entailing a risk of certain race conditions. E.g., if two nodes simultaneously issue conflicting (overlapping) publications, this may not be detected until both publications have reached a third node, in which case one of the publications will be silently dropped on that node. Hence, we end up with an inconsistent name table. In most cases this conflict is just a temporary race, e.g., one node is issuing a publication under the assumption that a previous, conflicting, publication has already been withdrawn by the other node. However, because of the (rtt related) distributed update delay, this may not yet hold true on all nodes. The symptom of this failure is a syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error". In this commit we add a resiliency queue at the receiving end of the name table distributor. When insertion of an arriving publication fails, we retain it in this queue for a short amount of time, assuming that another update will arrive very soon and clear the conflict. If so happens, we insert the publication, otherwise we drop it. The (configurable) retention value defaults to 2000 ms. Knowing from experience that the situation described above is extremely rare, there is no risk that the queue will accumulate any large number of items. Signed-off-by: NErik Hugne <erik.hugne@ericsson.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-