- 18 6月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
We sometimes observe a 'deadly embrace' type deadlock occurring between mutually connected sockets on the same node. This happens when the one-hour peer supervision timers happen to expire simultaneously in both sockets. The scenario is as follows: CPU 1: CPU 2: -------- -------- tipc_sk_timeout(sk1) tipc_sk_timeout(sk2) lock(sk1.slock) lock(sk2.slock) msg_create(probe) msg_create(probe) unlock(sk1.slock) unlock(sk2.slock) tipc_node_xmit_skb() tipc_node_xmit_skb() tipc_node_xmit() tipc_node_xmit() tipc_sk_rcv(sk2) tipc_sk_rcv(sk1) lock(sk2.slock) lock((sk1.slock) filter_rcv() filter_rcv() tipc_sk_proto_rcv() tipc_sk_proto_rcv() msg_create(probe_rsp) msg_create(probe_rsp) tipc_sk_respond() tipc_sk_respond() tipc_node_xmit_skb() tipc_node_xmit_skb() tipc_node_xmit() tipc_node_xmit() tipc_sk_rcv(sk1) tipc_sk_rcv(sk2) lock((sk1.slock) lock((sk2.slock) ===> DEADLOCK ===> DEADLOCK Further analysis reveals that there are three different locations in the socket code where tipc_sk_respond() is called within the context of the socket lock, with ensuing risk of similar deadlocks. We now solve this by passing a buffer queue along with all upcalls where sk_lock.slock may potentially be held. Response or rejected message buffers are accumulated into this queue instead of being sent out directly, and only sent once we know we are safely outside the slock context. Reported-by: NGUNA <gbalasun@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 6月, 2016 2 次提交
-
-
由 Ying Xue 提交于
net/tipc/link.c: In function ‘tipc_link_timeout’: net/tipc/link.c:744:28: warning: ‘mtyp’ may be used uninitialized in this function [-Wuninitialized] Fixes: 42b18f60 ("tipc: refactor function tipc_link_timeout()") Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ying Xue 提交于
When run tipcTS&tipcTC test suite, the following complaint appears: [ 56.926168] =============================== [ 56.926169] [ INFO: suspicious RCU usage. ] [ 56.926171] 4.7.0-rc1+ #160 Not tainted [ 56.926173] ------------------------------- [ 56.926174] net/tipc/bearer.c:408 suspicious rcu_dereference_protected() usage! [ 56.926175] [ 56.926175] other info that might help us debug this: [ 56.926175] [ 56.926177] [ 56.926177] rcu_scheduler_active = 1, debug_locks = 1 [ 56.926179] 3 locks held by swapper/4/0: [ 56.926180] #0: (((&req->timer))){+.-...}, at: [<ffffffff810e79b5>] call_timer_fn+0x5/0x340 [ 56.926203] #1: (&(&req->lock)->rlock){+.-...}, at: [<ffffffffa000c29b>] disc_timeout+0x1b/0xd0 [tipc] [ 56.926212] #2: (rcu_read_lock){......}, at: [<ffffffffa00055e0>] tipc_bearer_xmit_skb+0xb0/0x2e0 [tipc] [ 56.926218] [ 56.926218] stack backtrace: [ 56.926221] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.7.0-rc1+ #160 [ 56.926222] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 56.926224] 0000000000000000 ffff880016803d28 ffffffff813c4423 ffff8800154252c0 [ 56.926227] 0000000000000001 ffff880016803d58 ffffffff810b7512 ffff8800124d8120 [ 56.926230] ffff880013f8a160 ffff8800132b5ccc ffff8800124d8120 ffff880016803d88 [ 56.926234] Call Trace: [ 56.926235] <IRQ> [<ffffffff813c4423>] dump_stack+0x67/0x94 [ 56.926250] [<ffffffff810b7512>] lockdep_rcu_suspicious+0xe2/0x120 [ 56.926256] [<ffffffffa00051f1>] tipc_l2_send_msg+0x131/0x1c0 [tipc] [ 56.926261] [<ffffffffa000567c>] tipc_bearer_xmit_skb+0x14c/0x2e0 [tipc] [ 56.926266] [<ffffffffa00055e0>] ? tipc_bearer_xmit_skb+0xb0/0x2e0 [tipc] [ 56.926273] [<ffffffffa000c280>] ? tipc_disc_init_msg+0x1f0/0x1f0 [tipc] [ 56.926278] [<ffffffffa000c280>] ? tipc_disc_init_msg+0x1f0/0x1f0 [tipc] [ 56.926283] [<ffffffffa000c2d6>] disc_timeout+0x56/0xd0 [tipc] [ 56.926288] [<ffffffff810e7a68>] call_timer_fn+0xb8/0x340 [ 56.926291] [<ffffffff810e79b5>] ? call_timer_fn+0x5/0x340 [ 56.926296] [<ffffffffa000c280>] ? tipc_disc_init_msg+0x1f0/0x1f0 [tipc] [ 56.926300] [<ffffffff810e8f4a>] run_timer_softirq+0x23a/0x390 [ 56.926306] [<ffffffff810f89ff>] ? clockevents_program_event+0x7f/0x130 [ 56.926316] [<ffffffff819727c3>] __do_softirq+0xc3/0x4a2 [ 56.926323] [<ffffffff8106ba5a>] irq_exit+0x8a/0xb0 [ 56.926327] [<ffffffff81972456>] smp_apic_timer_interrupt+0x46/0x60 [ 56.926331] [<ffffffff81970a49>] apic_timer_interrupt+0x89/0x90 [ 56.926333] <EOI> [<ffffffff81027fda>] ? default_idle+0x2a/0x1a0 [ 56.926340] [<ffffffff81027fd8>] ? default_idle+0x28/0x1a0 [ 56.926342] [<ffffffff810289cf>] arch_cpu_idle+0xf/0x20 [ 56.926345] [<ffffffff810adf0f>] default_idle_call+0x2f/0x50 [ 56.926347] [<ffffffff810ae145>] cpu_startup_entry+0x215/0x3e0 [ 56.926353] [<ffffffff81040ad9>] start_secondary+0xf9/0x100 The warning appears as rtnl_dereference() is wrongly used in tipc_l2_send_msg() under RCU read lock protection. Instead the proper usage should be that rcu_dereference_rtnl() is called here. Fixes: 5b7066c3 ("tipc: stricter filtering of packets in bearer layer") Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2016 1 次提交
-
-
由 Kangjie Lu 提交于
link_info.str is a char array of size 60. Memory after the NULL byte is not initialized. Sending the whole object out can cause a leak. Signed-off-by: NKangjie Lu <kjlu@gatech.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 5月, 2016 1 次提交
-
-
由 Baozeng Ding 提交于
Before calling the nla_parse_nested function, make sure the pointer to the attribute is not null. This patch fixes several potential null pointer dereference vulnerabilities in the tipc netlink functions. Signed-off-by: NBaozeng Ding <sploving1@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
TCP stack can now run from process context. Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous assumption. Fixes: 5413d1ba ("net: do not block BH while processing socket backlog") Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 5月, 2016 1 次提交
-
-
由 Richard Alpe 提交于
The publication field of the old netlink API should contain the publication key and not the publication reference. Fixes: 44a8ae94 (tipc: convert legacy nl name table dump to nl compat) Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 5月, 2016 1 次提交
-
-
由 Richard Alpe 提交于
Make sure the socket for which the user is listing publication exists before parsing the socket netlink attributes. Prior to this patch a call without any socket caused a NULL pointer dereference in tipc_nl_publ_dump(). Tested-and-reported-by: NBaozeng Ding <sploving1@gmail.com> Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.cm> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 13 5月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
When an ACTIVATE or data packet is received in a link in state ESTABLISHING, the link does not immediately change state to ESTABLISHED, but does instead return a LINK_UP event to the caller, which will execute the state change in a different lock context. This non-atomic approach incurs a low risk that we may have two LINK_UP events pending simultaneously for the same link, resulting in the final part of the setup procedure being executed twice. The only potential harm caused by this it that we may see two LINK_UP events issued to subsribers of the topology server, something that may cause confusion. This commit eliminates this risk by checking if the link is already up before proceeding with the second half of the setup. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 5月, 2016 3 次提交
-
-
由 Jon Paul Maloy 提交于
There are two flow control mechanisms in TIPC; one at link level that handles network congestion, burst control, and retransmission, and one at connection level which' only remaining task is to prevent overflow in the receiving socket buffer. In TIPC, the latter task has to be solved end-to-end because messages can not be thrown away once they have been accepted and delivered upwards from the link layer, i.e, we can never permit the receive buffer to overflow. Currently, this algorithm is message based. A counter in the receiving socket keeps track of number of consumed messages, and sends a dedicated acknowledge message back to the sender for each 256 consumed message. A counter at the sending end keeps track of the sent, not yet acknowledged messages, and blocks the sender if this number ever reaches 512 unacknowledged messages. When the missing acknowledge arrives, the socket is then woken up for renewed transmission. This works well for keeping the message flow running, as it almost never happens that a sender socket is blocked this way. A problem with the current mechanism is that it potentially is very memory consuming. Since we don't distinguish between small and large messages, we have to dimension the socket receive buffer according to a worst-case of both. I.e., the window size must be chosen large enough to sustain a reasonable throughput even for the smallest messages, while we must still consider a scenario where all messages are of maximum size. Hence, the current fix window size of 512 messages and a maximum message size of 66k results in a receive buffer of 66 MB when truesize(66k) = 131k is taken into account. It is possible to do much better. This commit introduces an algorithm where we instead use 1024-byte blocks as base unit. This unit, always rounded upwards from the actual message size, is used when we advertise windows as well as when we count and acknowledge transmitted data. The advertised window is based on the configured receive buffer size in such a way that even the worst-case truesize/msgsize ratio always is covered. Since the smallest possible message size (from a flow control viewpoint) now is 1024 bytes, we can safely assume this ratio to be less than four, which is the value we are now using. This way, we have been able to reduce the default receive buffer size from 66 MB to 2 MB with maintained performance. In order to keep this solution backwards compatible, we introduce a new capability bit in the discovery protocol, and use this throughout the message sending/reception path to always select the right unit. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
During neighbor discovery, nodes advertise their capabilities as a bit map in a dedicated 16-bit field in the discovery message header. This bit map has so far only be stored in the node structure on the peer nodes, but we now see the need to keep a copy even in the socket structure. This commit adds this functionality. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
In the refactoring commit d570d864 ("tipc: enqueue arrived buffers in socket in separate function") we did by accident replace the test if (sk->sk_backlog.len == 0) atomic_set(&tsk->dupl_rcvcnt, 0); with if (sk->sk_backlog.len) atomic_set(&tsk->dupl_rcvcnt, 0); This effectively disables the compensation we have for the double receive buffer accounting that occurs temporarily when buffers are moved from the backlog to the socket receive queue. Until now, this has gone unnoticed because of the large receive buffer limits we are applying, but becomes indispensable when we reduce this buffer limit later in this series. We now fix this by inverting the mentioned condition. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 02 5月, 2016 2 次提交
-
-
由 Hamish Martin 提交于
We have observed complete lock up of broadcast-link transmission due to unacknowledged packets never being removed from the 'transmq' queue. This is traced to nodes having their ack field set beyond the sequence number of packets that have actually been transmitted to them. Consider an example where node 1 has sent 10 packets to node 2 on a link and node 3 has sent 20 packets to node 2 on another link. We see examples of an ack from node 2 destined for node 3 being treated as an ack from node 2 at node 1. This leads to the ack on the node 1 to node 2 link being increased to 20 even though we have only sent 10 packets. When node 1 does get around to sending further packets, none of the packets with sequence numbers less than 21 are actually removed from the transmq. To resolve this we reinstate some code lost in commit d999297c ("tipc: reduce locking scope during packet reception") which ensures that only messages destined for the receiving node are processed by that node. This prevents the sequence numbers from getting out of sync and resolves the packet leakage, thereby resolving the broadcast-link transmission lock-ups we observed. While we are aware that this change only patches over a root problem that we still haven't identified, this is a sanity test that it is always legitimate to do. It will remain in the code even after we identify and fix the real problem. Reviewed-by: NChris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: NJohn Thompson <john.thompson@alliedtelesis.co.nz> Signed-off-by: NHamish Martin <hamish.martin@alliedtelesis.co.nz> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
When we are displaying statistics for the first link established between two peers, it will always be presented as STANDBY although it in reality is ACTIVE. This happens because we forget to set the 'active' flag in the link instance at the moment it is established. Although this is a bug, it only has impact on the presentation view of the link, not on its actual functionality. Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 4月, 2016 1 次提交
-
-
由 Dan Carpenter 提交于
This is never called with a NULL "buf" and anyway, we dereference 's' on the lines before so it would Oops before we reach the check. Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 4月, 2016 1 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
Commit 42b18f60 ("tipc: refactor function tipc_link_timeout()"), introduced a bug which prevents sending of probe messages during link synchronization phase. This leads to hanging links, if the bearer is disabled/enabled after links are up. In this commit, we send the probe messages correctly. Fixes: 42b18f60 ("tipc: refactor function tipc_link_timeout()") Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 18 4月, 2016 1 次提交
-
-
由 Masanari Iida 提交于
This patch fix spelling typos found in printk within various part of the kernel sources. Signed-off-by: NMasanari Iida <standby24x7@gmail.com> Acked-by: NRandy Dunlap <rdunlap@infradead.org> Signed-off-by: NJiri Kosina <jkosina@suse.cz>
-
- 16 4月, 2016 5 次提交
-
-
由 Jon Paul Maloy 提交于
According to the link FSM, a received traffic packet can take a link from state ESTABLISHING to ESTABLISHED, but the link can still not be fully set up in one atomic operation. This means that even if the the very first packet on the link is a traffic packet with sequence number 1 (one), it has to be dropped and retransmitted. This can be avoided if we let the mentioned packet be preceded by a LINK_PROTOCOL/STATE message, which takes up the endpoint before the arrival of the traffic. We add this small feature in this commit. This is a fully compatible change. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
In some link establishment scenarios we see that packet #2 may be sent out before packet #1, forcing the receiver to demand retransmission of the missing packet. This is harmless, but may cause confusion among people tracing the packet flow. Since this is extremely easy to fix, we do so by adding en extra send call to the bearer immediately after the link has come up. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
The function tipc_link_timeout() is unnecessary complex, and can easily be made more readable. We do that with this commit. The only functional change is that we remove a redundant test for whether the broadcast link is up or not. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
When a link is down, it will continuously try to re-establish contact with the peer by sending out a RESET or an ACTIVATE message at each timeout interval. The default value for this interval is currently 375 ms. This is wasteful, and may become a problem in very large clusters with dozens or hundreds of nodes being down simultaneously. We now introduce a simple backoff algorithm for these cases. The first five messages are sent at default rate; thereafter a message is sent only each 16th timer interval. This will cover the vast majority of link recycling cases, since the endpoint starting last will transmit at the higher speed, and the link should normally be established well be before the rate needs to be reduced. The only case where we will see a degradation of link re-establishment times is when the endpoints remain intact, and a glitch in the transmission media is causing the link reset. We will then experience a worst-case re-establishing time of 6 seconds, something we deem acceptable. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
When a link endpoint is going down locally, e.g., because its interface is being stopped, it will spontaneously send out a RESET message to its peer, informing it about this fact. This saves the peer from detecting the failure via probing, and hence gives both speedier and less resource consuming failure detection on the peer side. According to the link FSM, a receiver of a RESET message, ignoring the reason for it, must now consider the sender ready to come back up, and starts periodically sending out ACTIVATE messages to the peer in order to re-establish the link. Also, according to the FSM, the receiver of an ACTIVATE message can now go directly to state ESTABLISHED and start sending regular traffic packets. This is a well-proven and robust FSM. However, in the case of a reboot, there is a small possibilty that link endpoint on the rebooted node may have been re-created with a new bearer identity between the moment it sent its (pre-boot) RESET and the moment it receives the ACTIVATE from the peer. The new bearer identity cannot be known by the peer according to this scenario, since traffic headers don't convey such information. This is a problem, because both endpoints need to know the correct value of the peer's bearer id at any moment in time in order to be able to produce correct link events for their users. The only way to guarantee this is to enforce a full setup message exchange (RESET + ACTIVATE) even after the reboot, since those messages carry the bearer idientity in their header. In this commit we do this by introducing and setting a "stopping" bit in the header of the spontaneously generated RESET messages, informing the peer that the sender will not be immediately ready to re-establish the link. A receiver seeing this bit must act as if this were a locally detected connectivity failure, and hence has to go through a full two- way setup message exchange before any link can be re-established. Although never reported, this problem seems to have always been around. This protocol addition is fully backwards compatible. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 4月, 2016 1 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
Until now, the requests sent to topology server are queued to a workqueue by the generic server framework. These messages are processed by worker threads and trigger the registered callbacks. To reduce latency on uniprocessor systems, explicit rescheduling is performed using cond_resched() after MAX_RECV_MSG_COUNT(25) messages. This implementation on SMP systems leads to an subscriber refcnt error as described below: When a worker thread yields by calling cond_resched() in a SMP system, a new worker is created on another CPU to process the pending workitem. Sometimes the sleeping thread wakes up before the new thread finishes execution. This breaks the assumption on ordering and being single threaded. The fault is more frequent when MAX_RECV_MSG_COUNT is lowered. If the first thread was processing subscription create and the second thread processing close(), the close request will free the subscriber and the create request oops as follows: [31.224137] WARNING: CPU: 2 PID: 266 at include/linux/kref.h:46 tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228143] CPU: 2 PID: 266 Comm: kworker/u8:1 Not tainted 4.5.0+ #97 [31.228377] Workqueue: tipc_rcv tipc_recv_work [tipc] [...] [31.228377] Call Trace: [31.228377] [<ffffffff812fbb6b>] dump_stack+0x4d/0x72 [31.228377] [<ffffffff8105a311>] __warn+0xd1/0xf0 [31.228377] [<ffffffff8105a3fd>] warn_slowpath_null+0x1d/0x20 [31.228377] [<ffffffffa0098067>] tipc_subscrb_rcv_cb+0x317/0x380 [tipc] [31.228377] [<ffffffffa00a4984>] tipc_receive_from_sock+0xd4/0x130 [tipc] [31.228377] [<ffffffffa00a439b>] tipc_recv_work+0x2b/0x50 [tipc] [31.228377] [<ffffffff81071925>] process_one_work+0x145/0x3d0 [31.246554] ---[ end trace c3882c9baa05a4fd ]--- [31.248327] BUG: spinlock bad magic on CPU#2, kworker/u8:1/266 [31.249119] BUG: unable to handle kernel NULL pointer dereference at 0000000000000428 [31.249323] IP: [<ffffffff81099d0c>] spin_dump+0x5c/0xe0 [31.249323] PGD 0 [31.249323] Oops: 0000 [#1] SMP In this commit, we - rename tipc_conn_shutdown() to tipc_conn_release(). - move connection release callback execution from tipc_close_conn() to a new function tipc_sock_release(), which is executed before we free the connection. Thus we release the subscriber during connection release procedure rather than connection shutdown procedure. Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 4月, 2016 1 次提交
-
-
由 Jon Paul Maloy 提交于
We remove a couple of leftover fields in struct tipc_bearer. Those were used by the old broadcast implementation, and are not needed any longer. There is no functional changes in this commit. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 4月, 2016 2 次提交
-
-
由 Erik Hugne 提交于
If a peer node becomes unavailable, in addition to removing the nametable entries from this node we also need to purge all deferred updates associated with this node. Signed-off-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Erik Hugne 提交于
Nametable updates received from the network that cannot be applied immediately are placed on a defer queue. This queue is global to the TIPC module, which might cause problems when using TIPC in containers. To prevent nametable updates from escaping into the wrong namespace, we make the queue pernet instead. Signed-off-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2016 2 次提交
-
-
由 Jon Paul Maloy 提交于
Resetting a bearer/interface, with the consequence of resetting all its pertaining links, is not an atomic action. This becomes particularly evident in very large clusters, where a lot of traffic may happen on the remaining links while we are busy shutting them down. In extreme cases, we may even see links being re-created and re-established before we are finished with the job. To solve this, we now introduce a solution where we temporarily detach the bearer from the interface when the bearer is reset. This inhibits all packet reception, while sending still is possible. For the latter, we use the fact that the device's user pointer now is zero to filter out which packets can be sent during this situation; i.e., outgoing RESET messages only. This filtering serves to speed up the neighbors' detection of the loss event, and saves us from unnecessary probing. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Paul Maloy 提交于
When enabling a bearer we create a 'neigbor discoverer' instance by calling the function tipc_disc_create() before the bearer is actually registered in the list of enabled bearers. Because of this, the very first discovery broadcast message, created by the mentioned function, is lost, since it cannot find any valid bearer to use. Furthermore, the used send function, tipc_bearer_xmit_skb() does not free the given buffer when it cannot find a bearer, resulting in the leak of exactly one send buffer each time a bearer is enabled. This commit fixes this problem by introducing two changes: 1) Instead of attemting to send the discovery message directly, we let tipc_disc_create() return the discovery buffer to the calling function, tipc_enable_bearer(), so that the latter can send it when the enabling sequence is finished. 2) In tipc_bearer_xmit_skb(), as well as in the two other transmit functions at the bearer layer, we now free the indicated buffer or buffer chain when a valid bearer cannot be found. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 15 3月, 2016 1 次提交
-
-
由 Richard Alpe 提交于
Expand headroom further in order to be able to fit the larger IPv6 header. Prior to this patch this caused a skb under panic for certain tipc packets when using IPv6 UDP bearer(s). Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 3月, 2016 1 次提交
-
-
由 Daniel Borkmann 提交于
This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 3月, 2016 1 次提交
-
-
由 Richard Alpe 提交于
Make the c files less cluttered and enable netlink attributes to be shared between files. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 3月, 2016 6 次提交
-
-
由 Jon Paul Maloy 提交于
Until now, we have kept a pre-allocated protocol message header aggregated into struct tipc_link. Apart from adding unnecessary footprint to the link instances, this requires extra code both to initialize and re-initialize it. We now remove this sub-optimization. This change also makes it possible to clean up the function tipc_build_proto_msg() and remove a couple of small functions that were accessing the mentioned header. In particular, we can replace all occurrences of the local function call link_own_addr(link) with the generic tipc_own_addr(net). Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Parthasarathy Bhuvaragan 提交于
commit 4d5cfcba ('tipc: fix connection abort during subscription cancel'), removes the check for a valid subscription before calling tipc_nametbl_subscribe(). This will lead to a nullptr exception when we process a subscription cancel request. For a cancel request, a null subscription is passed to tipc_nametbl_subscribe() resulting in exception. In this commit, we call tipc_nametbl_subscribe() only for a valid subscription. Fixes: 4d5cfcba ('tipc: fix connection abort during subscription cancel') Reported-by: NAnders Widell <anders.widell@ericsson.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Make sure the user has provided a scope for multicast and link local addresses used locally by a UDP bearer. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Reviewed-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
The netlink policy for TIPC_NLA_UDP_LOCAL and TIPC_NLA_UDP_REMOTE is of type binary with a defined length. This causes the policy framework to threat the defined length as maximum length. There is however no protection against a user sending a smaller amount of data. Prior to this patch this wasn't handled which could result in a partially incomplete sockaddr_storage struct containing uninitialized data. In this patch we use nla_memcpy() when copying the user data. This ensures a potential gap at the end is cleared out properly. This was found by Julia with Coccinelle tool. Reported-by: NDaniel Borkmann <daniel@iogearbox.net> Reported-by: NJulia Lawall <julia.lawall@lip6.fr> Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Reviewed-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Make sure we have a link before checking if it has been reset or not. Prior to this patch tipc_link_is_reset() could be called with a non existing link, resulting in a null pointer dereference. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Reviewed-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Prior to this patch enabling a IPv4 UDP bearer caused a null pointer dereference in iptunnel_xmit_stats(), when it tried to dereference the net device from the skb. To resolve this we now point the skb device to the net device resolved from the routing table. Fixes: 039f5062 (ip_tunnel: Move stats update to iptunnel_xmit()) Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Reviewed-by: NErik Hugne <erik.hugne@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 3月, 2016 1 次提交
-
-
由 Parthasarathy Bhuvaragan 提交于
reverts commit 94153e36 ("tipc: use existing sk_write_queue for outgoing packet chain") In Commit 94153e36, we assume that we fill & empty the socket's sk_write_queue within the same lock_sock() session. This is not true if the link is congested. During congestion, the socket lock is released while we wait for the congestion to cease. This implementation causes a nullptr exception, if the user space program has several threads accessing the same socket descriptor. Consider two threads of the same program performing the following: Thread1 Thread2 -------------------- ---------------------- Enter tipc_sendmsg() Enter tipc_sendmsg() lock_sock() lock_sock() Enter tipc_link_xmit(), ret=ELINKCONG spin on socket lock.. sk_wait_event() : release_sock() grab socket lock : Enter tipc_link_xmit(), ret=0 : release_sock() Wakeup after congestion lock_sock() skb = skb_peek(pktchain); !! TIPC_SKB_CB(skb)->wakeup_pending = tsk->link_cong; In this case, the second thread transmits the buffers belonging to both thread1 and thread2 successfully. When the first thread wakeup after the congestion it assumes that the pktchain is intact and operates on the skb's in it, which leads to the following exception: [2102.439969] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0 [2102.440074] IP: [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc] [2102.440074] PGD 3fa3f067 PUD 3fa6b067 PMD 0 [2102.440074] Oops: 0000 [#1] SMP [2102.440074] CPU: 2 PID: 244 Comm: sender Not tainted 3.12.28 #1 [2102.440074] RIP: 0010:[<ffffffffa005f330>] [<ffffffffa005f330>] __tipc_link_xmit+0x2b0/0x4d0 [tipc] [...] [2102.440074] Call Trace: [2102.440074] [<ffffffff8163f0b9>] ? schedule+0x29/0x70 [2102.440074] [<ffffffffa006a756>] ? tipc_node_unlock+0x46/0x170 [tipc] [2102.440074] [<ffffffffa005f761>] tipc_link_xmit+0x51/0xf0 [tipc] [2102.440074] [<ffffffffa006d8ae>] tipc_send_stream+0x11e/0x4f0 [tipc] [2102.440074] [<ffffffff8106b150>] ? __wake_up_sync+0x20/0x20 [2102.440074] [<ffffffffa006dc9c>] tipc_send_packet+0x1c/0x20 [tipc] [2102.440074] [<ffffffff81502478>] sock_sendmsg+0xa8/0xd0 [2102.440074] [<ffffffff81507895>] ? release_sock+0x145/0x170 [2102.440074] [<ffffffff815030d8>] ___sys_sendmsg+0x3d8/0x3e0 [2102.440074] [<ffffffff816426ae>] ? _raw_spin_unlock+0xe/0x10 [2102.440074] [<ffffffff81115c2a>] ? handle_mm_fault+0x6ca/0x9d0 [2102.440074] [<ffffffff8107dd65>] ? set_next_entity+0x85/0xa0 [2102.440074] [<ffffffff816426de>] ? _raw_spin_unlock_irq+0xe/0x20 [2102.440074] [<ffffffff8107463c>] ? finish_task_switch+0x5c/0xc0 [2102.440074] [<ffffffff8163ea8c>] ? __schedule+0x34c/0x950 [2102.440074] [<ffffffff81504e12>] __sys_sendmsg+0x42/0x80 [2102.440074] [<ffffffff81504e62>] SyS_sendmsg+0x12/0x20 [2102.440074] [<ffffffff8164aed2>] system_call_fastpath+0x16/0x1b In this commit, we maintain the skb list always in the stack. Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 2月, 2016 2 次提交
-
-
由 Florian Westphal 提交于
msg.dst_sk needs to be set up with a valid socket because some callbacks later derive the netns from it. Fixes: 263ea09084d172d ("Revert "genl: Add genlmsg_new_unicast() for unicast message allocation") Reported-by: NJon Maloy <maloy@donjonn.com> Bisected-by: NJon Maloy <maloy@donjonn.com> Signed-off-by: NFlorian Westphal <fw@strlen.de> Acked-by Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net> -
由 Jon Paul Maloy 提交于
When the TIPC module is unloaded, we have identified a race condition that allows a node reference counter to go to zero and the node instance being freed before the node timer is finished with accessing it. This leads to occasional crashes, especially in multi-namespace environments. The scenario goes as follows: CPU0:(node_stop) CPU1:(node_timeout) // ref == 2 1: if(!mod_timer()) 2: if (del_timer()) 3: tipc_node_put() // ref -> 1 4: tipc_node_put() // ref -> 0 5: kfree_rcu(node); 6: tipc_node_get(node) 7: // BOOM! We now clean up this functionality as follows: 1) We remove the node pointer from the node lookup table before we attempt deactivating the timer. This way, we reduce the risk that tipc_node_find() may obtain a valid pointer to an instance marked for deletion; a harmless but undesirable situation. 2) We use del_timer_sync() instead of del_timer() to safely deactivate the node timer without any risk that it might be reactivated by the timeout handler. There is no risk of deadlock here, since the two functions never touch the same spinlocks. 3: We remove a pointless tipc_node_get() + tipc_node_put() from the timeout handler. Reported-by: NZhijiang Hu <huzhijiang@gmail.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-