- 03 5月, 2017 1 次提交
-
-
由 Jon Paul Maloy 提交于
We try to make this function more readable by improving variable names and comments, plus some minor changes to the logics. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2017 3 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
Until now in tipc_recv_stream(), we update the received unacknowledged bytes based on a stack variable and not based on the actual message size. If the user buffer passed at tipc_recv_stream() is smaller than the received skb, the size variable in stack differs from the actual message size in the skb. This leads to a flow control accounting error causing permanent congestion. In this commit, we fix this accounting error by always using the size of the incoming message. Fixes: 10724cc7 ("tipc: redesign connection-level flow control") Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
Until now in tipc_send_stream(), we return -1 when the socket encounters link congestion even if the socket had successfully sent partial data. This is incorrect as the application resends the same the partial data leading to data corruption at receiver's end. In this commit, we return the partially sent bytes as the return value at link congestion. Fixes: 10724cc7 ("tipc: redesign connection-level flow control") Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Pan Bian 提交于
Function nlmsg_new() will return a NULL pointer if there is no enough memory, and its return value should be checked before it is used. However, in function tipc_nl_node_get_monitor(), the validation of the return value of function nlmsg_new() is missed. This patch fixes the bug. Signed-off-by: NPan Bian <bianpan2016@163.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2017 2 次提交
-
-
由 Johannes Berg 提交于
This is an add-on to the previous patch that passes the extended ACK structure where it's already available by existing genl_info or extack function arguments. This was done with this spatch (with some manual adjustment of indentation): @@ expression A, B, C, D, E; identifier fn, info; @@ fn(..., struct genl_info *info, ...) { ... -nlmsg_parse(A, B, C, D, E, NULL) +nlmsg_parse(A, B, C, D, E, info->extack) ... } @@ expression A, B, C, D, E; identifier fn, info; @@ fn(..., struct genl_info *info, ...) { <... -nla_parse_nested(A, B, C, D, NULL) +nla_parse_nested(A, B, C, D, info->extack) ...> } @@ expression A, B, C, D, E; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nlmsg_parse(A, B, C, D, E, NULL) +nlmsg_parse(A, B, C, D, E, extack) ...> } @@ expression A, B, C, D, E; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nla_parse(A, B, C, D, E, NULL) +nla_parse(A, B, C, D, E, extack) ...> } @@ expression A, B, C, D, E; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { ... -nlmsg_parse(A, B, C, D, E, NULL) +nlmsg_parse(A, B, C, D, E, extack) ... } @@ expression A, B, C, D; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nla_parse_nested(A, B, C, D, NULL) +nla_parse_nested(A, B, C, D, extack) ...> } @@ expression A, B, C, D; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nlmsg_validate(A, B, C, D, NULL) +nlmsg_validate(A, B, C, D, extack) ...> } @@ expression A, B, C, D; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nla_validate(A, B, C, D, NULL) +nla_validate(A, B, C, D, extack) ...> } @@ expression A, B, C; identifier fn, extack; @@ fn(..., struct netlink_ext_ack *extack, ...) { <... -nla_validate_nested(A, B, C, NULL) +nla_validate_nested(A, B, C, extack) ...> } Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Reviewed-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Johannes Berg 提交于
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: NJohannes Berg <johannes.berg@intel.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 30 3月, 2017 2 次提交
-
-
由 Erik Hugne 提交于
for socketpairs using connectionless transport, we cache the respective node local TIPC portid to use in subsequent calls to send() in the socket's private data. Signed-off-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Erik Hugne 提交于
sockets A and B are connected back-to-back, similar to what AF_UNIX does. Signed-off-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 3月, 2017 2 次提交
-
-
由 Ying Xue 提交于
When a new subscription object is inserted into name_seq->subscriptions list, it's under name_seq->lock protection; when a subscription is deleted from the list, it's also under the same lock protection; similarly, when accessing a subscription by going through subscriptions list, the entire process is also protected by the name_seq->lock. Therefore, if subscription refcount is increased before it's inserted into subscriptions list, and its refcount is decreased after it's deleted from the list, it will be unnecessary to hold refcount at all before accessing subscription object which is obtained by going through subscriptions list under name_seq->lock protection. Signed-off-by: NYing Xue <ying.xue@windriver.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ying Xue 提交于
After a subscription object is created, it's inserted into its subscriber subscrp_list list under subscriber lock protection, similarly, before it's destroyed, it should be first removed from its subscriber->subscrp_list. Since the subscription list is accessed with subscriber lock, all the subscriptions are valid during the lock duration. Hence in tipc_subscrb_subscrp_delete(), we remove subscription get/put and the extra subscriber unlock/lock. After this change, the subscriptions refcount cleanup is very simple and does not access any lock. Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 3月, 2017 1 次提交
-
-
由 Ying Xue 提交于
Until now, tipc_nametbl_unsubscribe() is called at subscriptions reference count cleanup. Usually the subscriptions cleanup is called at subscription timeout or at subscription cancel or at subscriber delete. We have ignored the possibility of this being called from other locations, which causes deadlock as we try to grab the tn->nametbl_lock while holding it already. CPU1: CPU2: ---------- ---------------- tipc_nametbl_publish spin_lock_bh(&tn->nametbl_lock) tipc_nametbl_insert_publ tipc_nameseq_insert_publ tipc_subscrp_report_overlap tipc_subscrp_get tipc_subscrp_send_event tipc_close_conn tipc_subscrb_release_cb tipc_subscrb_delete tipc_subscrp_put tipc_subscrp_put tipc_subscrp_kref_release tipc_nametbl_unsubscribe spin_lock_bh(&tn->nametbl_lock) <<grab nametbl_lock again>> CPU1: CPU2: ---------- ---------------- tipc_nametbl_stop spin_lock_bh(&tn->nametbl_lock) tipc_purge_publications tipc_nameseq_remove_publ tipc_subscrp_report_overlap tipc_subscrp_get tipc_subscrp_send_event tipc_close_conn tipc_subscrb_release_cb tipc_subscrb_delete tipc_subscrp_put tipc_subscrp_put tipc_subscrp_kref_release tipc_nametbl_unsubscribe spin_lock_bh(&tn->nametbl_lock) <<grab nametbl_lock again>> In this commit, we advance the calling of tipc_nametbl_unsubscribe() from the refcount cleanup to the intended callers. Fixes: d094c4d5 ("tipc: add subscription refcount to avoid invalid delete") Reported-by: NJohn Thompson <thompa.atl@gmail.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 10 3月, 2017 1 次提交
-
-
由 David Howells 提交于
Lockdep issues a circular dependency warning when AFS issues an operation through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem. The theory lockdep comes up with is as follows: (1) If the pagefault handler decides it needs to read pages from AFS, it calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but creating a call requires the socket lock: mmap_sem must be taken before sk_lock-AF_RXRPC (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind() binds the underlying UDP socket whilst holding its socket lock. inet_bind() takes its own socket lock: sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET (3) Reading from a TCP socket into a userspace buffer might cause a fault and thus cause the kernel to take the mmap_sem, but the TCP socket is locked whilst doing this: sk_lock-AF_INET must be taken before mmap_sem However, lockdep's theory is wrong in this instance because it deals only with lock classes and not individual locks. The AF_INET lock in (2) isn't really equivalent to the AF_INET lock in (3) as the former deals with a socket entirely internal to the kernel that never sees userspace. This is a limitation in the design of lockdep. Fix the general case by: (1) Double up all the locking keys used in sockets so that one set are used if the socket is created by userspace and the other set is used if the socket is created by the kernel. (2) Store the kern parameter passed to sk_alloc() in a variable in the sock struct (sk_kern_sock). This informs sock_lock_init(), sock_init_data() and sk_clone_lock() as to the lock keys to be used. Note that the child created by sk_clone_lock() inherits the parent's kern setting. (3) Add a 'kern' parameter to ->accept() that is analogous to the one passed in to ->create() that distinguishes whether kernel_accept() or sys_accept4() was the caller and can be passed to sk_alloc(). Note that a lot of accept functions merely dequeue an already allocated socket. I haven't touched these as the new socket already exists before we get the parameter. Note also that there are a couple of places where I've made the accepted socket unconditionally kernel-based: irda_accept() rds_rcp_accept_one() tcp_accept_from_sock() because they follow a sock_create_kern() and accept off of that. Whilst creating this, I noticed that lustre and ocfs don't create sockets through sock_create_kern() and thus they aren't marked as for-kernel, though they appear to be internal. I wonder if these should do that so that they use the new set of lock keys. Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 3月, 2017 1 次提交
-
-
由 Ingo Molnar 提交于
sched/headers: Prepare to move signal wakeup & sigpending methods from <linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 25 2月, 2017 1 次提交
-
-
由 Jon Paul Maloy 提交于
In the function tipc_rcv() we initialize a couple of stack variables from the message header before that same header has been validated. In rare cases when the arriving header is non-linar, the validation function itself may linearize the buffer by calling skb_may_pull(), while the wrongly initialized stack fields are not updated accordingly. We fix this in this commit. Reported-by: NMatthew Wong <mwong@sonusnet.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 2月, 2017 1 次提交
-
-
由 Herbert Xu 提交于
There are two problems with the function tipc_sk_reinit. Firstly it's doing a manual walk over an rhashtable. This is broken as an rhashtable can be resized and if you manually walk over it during a resize then you may miss entries. Secondly it's missing memory barriers as previously the code used spinlocks which provide the barriers implicitly. This patch fixes both problems. Fixes: 07f6c4bc ("tipc: convert tipc reference table to...") Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 2月, 2017 1 次提交
-
-
由 David S. Miller 提交于
This reverts commits: 6a254780 9dbbfb0a 40137906 It's too risky to put in this late in the release cycle. We'll put these changes into the next merge window instead. Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 2月, 2017 1 次提交
-
-
由 Herbert Xu 提交于
There are two problems with the function tipc_sk_reinit. Firstly it's doing a manual walk over an rhashtable. This is broken as an rhashtable can be resized and if you manually walk over it during a resize then you may miss entries. Secondly it's missing memory barriers as previously the code used spinlocks which provide the barriers implicitly. This patch fixes both problems. Fixes: 07f6c4bc ("tipc: convert tipc reference table to...") Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 1月, 2017 1 次提交
-
-
由 Dan Carpenter 提交于
We shuffled some code around and added some new case statements here and now "res" isn't initialized on all paths. Fixes: 01fd12bb ("tipc: make replicast a user selectable option") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 1月, 2017 6 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
In tipc_server_stop(), we iterate over the connections with limiting factor as server's idr_in_use. We ignore the fact that this variable is decremented in tipc_close_conn(), leading to premature exit. In this commit, we iterate until the we have no connections left. Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Tested-by: NJohn Thompson <thompa.atl@gmail.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
In tipc_conn_sendmsg(), we first queue the request to the outqueue followed by the connection state check. If the connection is not connected, we should not queue this message. In this commit, we reject the messages if the connection state is not CF_CONNECTED. Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Tested-by: NJohn Thompson <thompa.atl@gmail.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
Commit 333f7962 ("tipc: fix a race condition leading to subscriber refcnt bug") reveals a soft lockup while acquiring nametbl_lock. Before commit 333f7962, we call tipc_conn_shutdown() from tipc_close_conn() in the context of tipc_topsrv_stop(). In that context, we are allowed to grab the nametbl_lock. Commit 333f7962, moved tipc_conn_release (renamed from tipc_conn_shutdown) to the connection refcount cleanup. This allows either tipc_nametbl_withdraw() or tipc_topsrv_stop() to the cleanup. Since tipc_exit_net() first calls tipc_topsrv_stop() and then tipc_nametble_withdraw() increases the chances for the later to perform the connection cleanup. The soft lockup occurs in the call chain of tipc_nametbl_withdraw(), when it performs the tipc_conn_kref_release() as it tries to grab nametbl_lock again while holding it already. tipc_nametbl_withdraw() grabs nametbl_lock tipc_nametbl_remove_publ() tipc_subscrp_report_overlap() tipc_subscrp_send_event() tipc_conn_sendmsg() << if (con->flags != CF_CONNECTED) we do conn_put(), triggering the cleanup as refcount=0. >> tipc_conn_kref_release tipc_sock_release tipc_conn_release tipc_subscrb_delete tipc_subscrp_delete tipc_nametbl_unsubscribe << Soft Lockup >> The previous changes in this series fixes the race conditions fixed by commit 333f7962. Hence we can now revert the commit. Fixes: 333f7962 ("tipc: fix a race condition leading to subscriber refcnt bug") Reported-and-Tested-by: NJohn Thompson <thompa.atl@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
Until now, the generic server framework maintains the connection id's per subscriber in server's conn_idr. At tipc_close_conn, we remove the connection id from the server list, but the connection is valid until we call the refcount cleanup. Hence we have a window where the server allocates the same connection to an new subscriber leading to inconsistent reference count. We have another refcount warning we grab the refcount in tipc_conn_lookup() for connections with flag with CF_CONNECTED not set. This usually occurs at shutdown when the we stop the topology server and withdraw TIPC_CFG_SRV publication thereby triggering a withdraw message to subscribers. In this commit, we: 1. remove the connection from the server list at recount cleanup. 2. grab the refcount for a connection only if CF_CONNECTED is set. Tested-by: NJohn Thompson <thompa.atl@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
Until now, the subscribers keep track of the subscriptions using reference count at subscriber level. At subscription cancel or subscriber delete, we delete the subscription only if the timer was pending for the subscription. This approach is incorrect as: 1. del_timer() is not SMP safe, if on CPU0 the check for pending timer returns true but CPU1 might schedule the timer callback thereby deleting the subscription. Thus when CPU0 is scheduled, it deletes an invalid subscription. 2. We export tipc_subscrp_report_overlap(), which accesses the subscription pointer multiple times. Meanwhile the subscription timer can expire thereby freeing the subscription and we might continue to access the subscription pointer leading to memory violations. In this commit, we introduce subscription refcount to avoid deleting an invalid subscription. Reported-and-Tested-by: NJohn Thompson <thompa.atl@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
We trigger a soft lockup as we grab nametbl_lock twice if the node has a pending node up/down or link up/down event while: - we process an incoming named message in tipc_named_rcv() and perform an tipc_update_nametbl(). - we have pending backlog items in the name distributor queue during a nametable update using tipc_nametbl_publish() or tipc_nametbl_withdraw(). The following are the call chain associated: tipc_named_rcv() Grabs nametbl_lock tipc_update_nametbl() (publish/withdraw) tipc_node_subscribe()/unsubscribe() tipc_node_write_unlock() << lockup occurs if an outstanding node/link event exits, as we grabs nametbl_lock again >> tipc_nametbl_withdraw() Grab nametbl_lock tipc_named_process_backlog() tipc_update_nametbl() << rest as above >> The function tipc_node_write_unlock(), in addition to releasing the lock processes the outstanding node/link up/down events. To do this, we need to grab the nametbl_lock again leading to the lockup. In this commit we fix the soft lockup by introducing a fast variant of node_unlock(), where we just release the lock. We adapt the node_subscribe()/node_unsubscribe() to use the fast variants. Reported-and-Tested-by: NJohn Thompson <thompa.atl@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 1月, 2017 4 次提交
-
-
由 Jon Paul Maloy 提交于
If the bearer carrying multicast messages supports broadcast, those messages will be sent to all cluster nodes, irrespective of whether these nodes host any actual destinations socket or not. This is clearly wasteful if the cluster is large and there are only a few real destinations for the message being sent. In this commit we extend the eligibility of the newly introduced "replicast" transmit option. We now make it possible for a user to select which method he wants to be used, either as a mandatory setting via setsockopt(), or as a relative setting where we let the broadcast layer decide which method to use based on the ratio between cluster size and the message's actual number of destination nodes. In the latter case, a sending socket must stick to a previously selected method until it enters an idle period of at least 5 seconds. This eliminates the risk of message reordering caused by method change, i.e., when changes to cluster size or number of destinations would otherwise mandate a new method to be used. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
TIPC multicast messages are currently carried over a reliable 'broadcast link', making use of the underlying media's ability to transport packets as L2 broadcast or IP multicast to all nodes in the cluster. When the used bearer is lacking that ability, we can instead emulate the broadcast service by replicating and sending the packets over as many unicast links as needed to reach all identified destinations. We now introduce a new TIPC link-level 'replicast' service that does this. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
As a further preparation for the upcoming 'replicast' functionality, we add some necessary structs and functions for looking up and returning a list of all nodes that host destinations for a given multicast message. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
As a preparation for the 'replicast' functionality we are going to introduce in the next commits, we need the broadcast base structure to store whether bearer broadcast is available at all from the currently used bearer or bearers. We do this by adding a new function tipc_bearer_bcast_support() to the bearer layer, and letting the bearer selection function in bcast.c use this to give a new boolean field, 'bcast_support' the appropriate value. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 1月, 2017 1 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
Until now, we allocate memory always with GFP_ATOMIC flag. When the system is under memory pressure and a user tries to send, the send fails due to low memory. However, the user application can wait for free memory if we allocate it using GFP_KERNEL flag. In this commit, we use allocate memory with GFP_KERNEL for all user allocation. Reported-by: NRune Torgersen <runet@innovsys.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 1月, 2017 3 次提交
-
-
由 Jon Paul Maloy 提交于
The socket code currently handles link congestion by either blocking and trying to send again when the congestion has abated, or just returning to the user with -EAGAIN and let him re-try later. This mechanism is prone to starvation, because the wakeup algorithm is non-atomic. During the time the link issues a wakeup signal, until the socket wakes up and re-attempts sending, other senders may have come in between and occupied the free buffer space in the link. This in turn may lead to a socket having to make many send attempts before it is successful. In extremely loaded systems we have observed latency times of several seconds before a low-priority socket is able to send out a message. In this commit, we simplify this mechanism and reduce the risk of the described scenario happening. When a message is attempted sent via a congested link, we now let it be added to the link's backlog queue anyway, thus permitting an oversubscription of one message per source socket. We still create a wakeup item and return an error code, hence instructing the sender to block or stop sending. Only when enough space has been freed up in the link's backlog queue do we issue a wakeup event that allows the sender to continue with the next message, if any. The fact that a socket now can consider a message sent even when the link returns a congestion code means that the sending socket code can be simplified. Also, since this is a good opportunity to get rid of the obsolete 'mtu change' condition in the three socket send functions, we now choose to refactor those functions completely. Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
During multicast reception we currently use a simple linked list with push/pop semantics to store port numbers. We now see a need for a more generic list for storing values of type u32. We therefore make some modifications to this list, while replacing the prefix 'tipc_plist_' with 'u32_'. We also add a couple of new functions which will come to use in the next commits. Acked-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
The functions tipc_wait_for_sndpkt() and tipc_wait_for_sndmsg() are very similar. The latter function is also called from two locations, and there will be more in the coming commits, which will all need to test on different conditions. Instead of making yet another duplicates of the function, we now introduce a new macro tipc_wait_for_cond() where the wakeup condition can be stated as an argument to the call. This macro replaces all current and future uses of the two functions, which can now be eliminated. Acked-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 12月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
In commit 6f00089c ("tipc: remove SS_DISCONNECTING state") the check for socket type is in the wrong place, causing a closing socket to always send out a FIN message even when the socket was never connected. This is normally harmless, since the destination node for such messages most often is zero, and the message will be dropped, but it is still a wrong and confusing behavior. We fix this in this commit. Reviewed-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 12月, 2016 1 次提交
-
-
由 Al Viro 提交于
copy_from_iter_full(), copy_from_iter_full_nocache() and csum_and_copy_from_iter_full() - counterparts of copy_from_iter() et.al., advancing iterator only in case of successful full copy and returning whether it had been successful or not. Convert some obvious users. *NOTE* - do not blindly assume that something is a good candidate for those unless you are sure that not advancing iov_iter in failure case is the right thing in this case. Anything that does short read/short write kind of stuff (or is in a loop, etc.) is unlikely to be a good one. Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
-
- 03 12月, 2016 1 次提交
-
-
由 Michal Kubeček 提交于
Qian Zhang (张谦) reported a potential socket buffer overflow in tipc_msg_build() which is also known as CVE-2016-8632: due to insufficient checks, a buffer overflow can occur if MTU is too short for even tipc headers. As anyone can set device MTU in a user/net namespace, this issue can be abused by a regular user. As agreed in the discussion on Ben Hutchings' original patch, we should check the MTU at the moment a bearer is attached rather than for each processed packet. We also need to repeat the check when bearer MTU is adjusted to new device MTU. UDP case also needs a check to avoid overflow when calculating bearer MTU. Fixes: b97bf3fd ("[TIPC] Initial merge") Signed-off-by: NMichal Kubecek <mkubecek@suse.cz> Reported-by: NQian Zhang (张谦) <zhangqian-c@360.cn> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 11月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
In commit e4bf4f76 ("tipc: simplify packet sequence number handling") we changed the internal representation of the packet sequence number counters from u32 to u16, reflecting what is really sent over the wire. Since then some link statistics counters have been displaying incorrect values, partially because the counters meant to be used as sequence number snapshots are now used as direct counters, stored as u32, and partially because some counter updates are just missing in the code. In this commit we correct this in two ways. First, we base the displayed packet sent/received values on direct counters instead of as previously a calculated difference between current sequence number and a snapshot. Second, we add the missing updates of the counters. This change is compatible with the current netlink API, and requires no changes to the user space tools. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 11月, 2016 3 次提交
-
-
由 Jon Paul Maloy 提交于
In commit 10724cc7 ("tipc: redesign connection-level flow control") we replaced the previous message based flow control with one based on 1k blocks. In order to ensure backwards compatibility the mechanism falls back to using message as base unit when it senses that the peer doesn't support the new algorithm. The default flow control window, i.e., how many units can be sent before the sender blocks and waits for an acknowledge (aka advertisement) is 512. This was tested against the previous version, which uses an acknowledge frequency of on ack per 256 received message, and found to work fine. However, we missed the fact that versions older than Linux 3.15 use an acknowledge frequency of 512, which is exactly the limit where a 4.6+ sender will stop and wait for acknowledge. This would also work fine if it weren't for the fact that if the first sent message on a 4.6+ server side is an empty SYNACK, this one is also is counted as a sent message, while it is not counted as a received message on a legacy 3.15-receiver. This leads to the sender always being one step ahead of the receiver, a scenario causing the sender to block after 512 sent messages, while the receiver only has registered 511 read messages. Hence, the legacy receiver is not trigged to send an acknowledge, with a permanently blocked sender as result. We solve this deadlock by simply allowing the sender to send one more message before it blocks, i.e., by a making minimal change to the condition used for determining connection congestion. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
In commit 35c55c98 ("tipc: add neighbor monitoring framework") we added a data area to the link monitor STATE messages under the assumption that previous versions did not use any such data area. For versions older than Linux 4.3 this assumption is not correct. In those version, all STATE messages sent out from a node inadvertently contain a 16 byte data area containing a string; -a leftover from previous RESET messages which were using this during the setup phase. This string serves no purpose in STATE messages, and should no be there. Unfortunately, this data area is delivered to the link monitor framework, where a sanity check catches that it is not a correct domain record, and drops it. It also issues a rate limited warning about the event. Since such events occur much more frequently than anticipated, we now choose to remove the warning in order to not fill the kernel log with useless contents. We also make the sanity check stricter, to further reduce the risk that such data is inavertently admitted. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
commit 81729810 ("tipc: fix link priority propagation") introduced a compatibility problem between TIPC versions newer than Linux 4.6 and those older than Linux 4.4. In versions later than 4.4, link STATE messages only contain a non-zero link priority value when the sender wants the receiver to change its priority. This has the effect that the receiver resets itself in order to apply the new priority. This works well, and is consistent with the said commit. However, in versions older than 4.4 a valid link priority is present in all sent link STATE messages, leading to cyclic link establishment and reset on the 4.6+ node. We fix this by adding a test that the received value should not only be valid, but also differ from the current value in order to cause the receiving link endpoint to reset. Reported-by: NAmar Nv <amar.nv005@gmail.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 11月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
The comment block in socket.c describing the locking policy is obsolete, and does not reflect current reality. We remove it in this commit. Since the current locking policy is much simpler and follows a mainstream approach, we see no need to add a new description. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-