- 15 6月, 2016 5 次提交
-
-
由 Sowmini Varadhan 提交于
rds_send_path_reset() is the path specific version of rds_send_reset() intended for MP capable callers. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
t_mp_capable transports can use rds_inc_path_init to initialize all fields in struct rds_incoming, including the i_conn_path. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
Transports that are t_mp_capable should set the rds_conn_path on which the datagram was recived in the ->i_conn_path field of struct rds_incoming. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
The t_mp_capable bit will be used in the core rds module to support multipathing logic when the transport supports it. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
In preparation for multipath RDS, split the rds_connection structure into a base structure, and a per-path struct rds_conn_path. The base structure tracks information and locks common to all paths. The workqs for send/recv/shutdown etc are tracked per rds_conn_path. Thus the workq callbacks now work with rds_conn_path. This commit allows for one rds_conn_path per rds_connection, and will be extended into multiple conn_paths in subsequent commits. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 6月, 2016 1 次提交
-
-
由 Bhaktipriya Shridhar 提交于
alloc_workqueue replaces deprecated create_workqueue(). Since the driver is infiniband which can be used as block device and the workqueue seems involved in regular operation of the device, so a dedicated workqueue has been used with WQ_MEM_RECLAIM set to guarantee forward progress under memory pressure. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary here. Signed-off-by: NBhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 6月, 2016 3 次提交
-
-
由 Sowmini Varadhan 提交于
The send path needs to be quiesced before resetting callbacks from rds_tcp_accept_one(), and commit eb192840 ("RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves this using the c_state and RDS_IN_XMIT bit following the pattern used by rds_conn_shutdown(). However this leaves the possibility of a race window as shown in the sequence below take t_conn_lock in rds_tcp_conn_connect send outgoing syn to peer drop t_conn_lock in rds_tcp_conn_connect incoming from peer triggers rds_tcp_accept_one, conn is marked CONNECTING wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads call rds_tcp_reset_callbacks [.. race-window where incoming syn-ack can cause the conn to be marked UP from rds_tcp_state_change ..] lock_sock called from rds_tcp_reset_callbacks, and we set t_sock to null As soon as the conn is marked UP in the race-window above, rds_send_xmit() threads will proceed to rds_tcp_xmit and may encounter a null-pointer deref on the t_sock. Given that rds_tcp_state_change() is invoked in softirq context, whereas rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT after lock_sock could result in a deadlock with tcp_sendmsg, this commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which will prevent a transition to RDS_CONN_UP from rds_tcp_state_change(). Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
When we switch a connection's sockets in rds_tcp_rest_callbacks, any partially sent datagram must be retransmitted on the new socket so that the receiver can correctly reassmble the RDS datagram. Use rds_send_reset() which is designed for this purpose. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
When rds_tcp_accept_one() has to replace the existing tcp socket with a newer tcp socket (duelling-syn resolution), it must lock_sock() to suppress the rds_tcp_data_recv() path while callbacks are being changed. Also, existing RDS datagram reassembly state must be reset, so that the next datagram on the new socket does not have corrupted state. Similarly when resetting the newly accepted socket, appropriate locks and synchronization is needed. This commit ensures correct synchronization by invoking kernel_sock_shutdown to reset a newly accepted sock, and by taking appropriate lock_sock()s (for old and new sockets) when resetting existing callbacks. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 6月, 2016 1 次提交
-
-
由 Kangjie Lu 提交于
The last field "flags" of object "minfo" is not initialized. Copying this object out may leak kernel stack data. Assign 0 to it to avoid leak. Signed-off-by: NKangjie Lu <kjlu@gatech.edu> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 5月, 2016 2 次提交
-
-
由 Sowmini Varadhan 提交于
When a rogue SYN is received after the connection arbitration algorithm has converged, the incoming SYN should not needlessly quiesce the transmit path, and it should not result in needless TCP connection resets due to re-execution of the connection arbitration logic. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
There are two instances where we want to terminate RDS-TCP: when exiting the netns or during module unload. In either case, the termination sequence is to stop the listen socket, mark the rtn->rds_tcp_listen_sock as null, and flush any accept workqs. Thus any workqs that get flushed at this point will encounter a null rds_tcp_listen_sock, and must exit gracefully to allow the RDS-TCP termination to complete successfully. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 5月, 2016 1 次提交
-
-
由 Eric Dumazet 提交于
TCP stack can now run from process context. Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous assumption. Fixes: 5413d1ba ("net: do not block BH while processing socket backlog") Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog") Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 14 5月, 2016 1 次提交
-
-
由 Christoph Hellwig 提交于
Signed-off-by: NChristoph Hellwig <hch@lst.de> Tested-by: NSteve Wise <swise@opengridcomputing.com> Reviewed-by: NBart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: NSagi Grimberg <sagi@grimberg.me> Reviewed-by: NSteve Wise <swise@opengridcomputing.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 04 5月, 2016 2 次提交
-
-
由 Sowmini Varadhan 提交于
An arbitration scheme for duelling SYNs is implemented as part of commit 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") which ensures that both nodes involved will arrive at the same arbitration decision. However, this needs to be synchronized with an outgoing SYN to be generated by rds_tcp_conn_connect(). This commit achieves the synchronization through the t_conn_lock mutex in struct rds_tcp_connection. The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring the t_conn_lock mutex. A SYN is sent out only if the RDS connection is not already UP (an UP would indicate that rds_tcp_accept_one() has completed 3WH, so no SYN needs to be generated). Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after acquiring the t_conn_lock mutex. The only acceptable states (to allow continuation of the arbitration logic) are UP (i.e., outgoing SYN was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent outgoing SYN before we saw incoming SYN). Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
There is a race condition between rds_send_xmit -> rds_tcp_xmit and the code that deals with resolution of duelling syns added by commit 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()"). Specifically, we may end up derefencing a null pointer in rds_send_xmit if we have the interleaving sequence: rds_tcp_accept_one rds_send_xmit conn is RDS_CONN_UP, so invoke rds_tcp_xmit tc = conn->c_transport_data rds_tcp_restore_callbacks /* reset t_sock */ null ptr deref from tc->t_sock The race condition can be avoided without adding the overhead of additional locking in the xmit path: have rds_tcp_accept_one wait for rds_tcp_xmit threads to complete before resetting callbacks. The synchronization can be done in the same manner as rds_conn_shutdown(). First set the rds_conn_state to something other than RDS_CONN_UP (so that new threads cannot get into rds_tcp_xmit()), then wait for RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any threads in rds_tcp_xmit are done. Fixes: 241b2719 ("RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in rds_tcp_accept_one()") Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 4月, 2016 1 次提交
-
-
由 Sowmini Varadhan 提交于
rds-stress experiments with request size 256 bytes, 8K acks, using 16 threads show a 40% improvment when pskb_extract() replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);} pattern in the Rx path, so we leverage the perf gain with this commit. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 4月, 2016 2 次提交
-
-
Two different threads with different rds sockets may be in rds_recv_rcvbuf_delta() via receive path. If their ports both map to the same word in the congestion map, then using non-atomic ops to update it could cause the map to be incorrect. Lets use atomics to avoid such an issue. Full credit to Wengang <wen.gang.wang@oracle.com> for finding the issue, analysing it and also pointing out to offending code with spin lock based fix. Reviewed-by: NLeon Romanovsky <leon@leon.nu> Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Qing Huang 提交于
dp->dp_ack_seq is used in big endian format. We need to do the big endianness conversion when we assign a value in host format to it. Signed-off-by: NQing Huang <qing.huang@oracle.com> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 4月, 2016 2 次提交
-
-
由 shamir rabinovitch 提交于
When PAGE_SIZE > 4k single page can contain 2 RDS fragments. If 'rds_ib_cong_recv' ignore the RDS fragment offset in to the page it then read the data fragment as far congestion map update and lead to corruption of the RDS connection far congestion map. Signed-off-by: NShamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 shamir rabinovitch 提交于
Fix issue in 'rds_ib_cong_recv' when accessing unaligned memory allocated by 'rds_page_remainder_alloc' using uint64_t pointer. Signed-off-by: NShamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 3月, 2016 2 次提交
-
-
由 Sowmini Varadhan 提交于
RDS_TCP_DEFAULT_BUFSIZE has been unused since commit 1edd6a14 ("RDS-TCP: Do not bloat sndbuf/rcvbuf in rds_tcp_tune"). Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
Add per-net sysctl tunables to set the size of sndbuf and rcvbuf on the kernel tcp socket. The tunables are added at /proc/sys/net/rds/tcp/rds_tcp_sndbuf and /proc/sys/net/rds/tcp/rds_tcp_rcvbuf. These values must be set before accept() or connect(), and there may be an arbitrary number of existing rds-tcp sockets when the tunable is modified. To make sure that all connections in the netns pick up the same value for the tunable, we reset existing rds-tcp connections in the netns, so that they can reconnect with the new parameters. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 16 3月, 2016 1 次提交
-
-
由 Peter Zijlstra 提交于
$ make tags GEN tags ctags: Warning: drivers/acpi/processor_idle.c:64: null expansion of name pattern "\1" ctags: Warning: drivers/xen/events/events_2l.c:41: null expansion of name pattern "\1" ctags: Warning: kernel/locking/lockdep.c:151: null expansion of name pattern "\1" ctags: Warning: kernel/rcu/rcutorture.c:133: null expansion of name pattern "\1" ctags: Warning: kernel/rcu/rcutorture.c:135: null expansion of name pattern "\1" ctags: Warning: kernel/workqueue.c:323: null expansion of name pattern "\1" ctags: Warning: net/ipv4/syncookies.c:53: null expansion of name pattern "\1" ctags: Warning: net/ipv6/syncookies.c:44: null expansion of name pattern "\1" ctags: Warning: net/rds/page.c:45: null expansion of name pattern "\1" Which are all the result of the DEFINE_PER_CPU pattern: scripts/tags.sh:200: '/\<DEFINE_PER_CPU([^,]*, *\([[:alnum:]_]*\)/\1/v/' scripts/tags.sh:201: '/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,]*, *\([[:alnum:]_]*\)/\1/v/' The below cures them. All except the workqueue one are within reasonable distance of the 80 char limit. TJ do you have any preference on how to fix the wq one, or shall we just not care its too long? Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Acked-by: NDavid S. Miller <davem@davemloft.net> Acked-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 03 3月, 2016 12 次提交
-
-
由 Avinash Repaka 提交于
Fastreg MR(FRMR) is another method with which one can register memory to HCA. Some of the newer HCAs supports only fastreg mr mode, so we need to add support for it to have RDS functional on them. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NAvinash Repaka <avinash.repaka@oracle.com> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Fastreg MR(FRMR) memory registration and invalidation makes use of work request and completion queues for its operation. Patch allocates extra queue space towards these operation(s). Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Discovere Fast Memmory Registration support using IB device IB_DEVICE_MEM_MGT_EXTENSIONS. Certain HCA might support just FRMR or FMR or both FMR and FRWR. In case both mr type are supported, default FMR is used. Default MR is still kept as FMR against what everyone else is following. Default will be changed to FRMR once the RDS performance with FRMR is comparable with FMR. The work is in progress for the same. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Add MR reuse statistics to RDS IB transport. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Drop the RDS connection on RDMA_CM_EVENT_TIMEWAIT_EXIT so that it can reconnect and resume. While testing fastreg, this error happened in couple of tests but was getting un-noticed. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Preperatory patch for FRMR support. From connection info, we can retrieve cm_id which contains qp handled needed for work request posting. We also need to drop the RDS connection on QP error states where connection handle becomes useful. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
No functional change. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Keep fmr related filed in its own struct. Fastreg MR structure will be added to the union. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
No functional changes. This is in preperation towards adding fastreg memory resgitration support. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
This helps to combine asynchronous fastreg MR completion handler with send completion handler. No functional change. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
The SO_TIMESTAMP generates time stamp for each incoming RDS messages User app can enable it by using SO_TIMESTAMP setsocketopt() at SOL_SOCKET level. CMSG data of cmsg type SO_TIMESTAMP contains the time stamp in struct timeval format. Reviewed-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
RDS iWarp support code has become stale and non testable. As indicated earlier, am dropping the support for it. If new iWarp user(s) shows up in future, we can adapat the RDS IB transprt for the special RDMA READ sink case. iWarp needs an MR for the RDMA READ sink. Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 2月, 2016 1 次提交
-
-
由 stephen hemminger 提交于
Duplicate include detected. Signed-off-by: NStephen Hemminger <stephen@networkplumber.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 12月, 2015 2 次提交
-
-
It looks like an attempt to use CPU notifier here which was never completed. Nobody tried to wire it up completely since 2k9. So I unwind this code and get rid of everything not required. Oh look! 19 lines were removed while code still does the same thing. Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Tested-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Or Gerlitz 提交于
Instead, use the cached copy of the attributes present on the device. Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com> Signed-off-by: NDoug Ledford <dledford@redhat.com>
-
- 25 11月, 2015 1 次提交
-
-
由 Quentin Casasnovas 提交于
Sasha's found a NULL pointer dereference in the RDS connection code when sending a message to an apparently unbound socket. The problem is caused by the code checking if the socket is bound in rds_sendmsg(), which checks the rs_bound_addr field without taking a lock on the socket. This opens a race where rs_bound_addr is temporarily set but where the transport is not in rds_bind(), leading to a NULL pointer dereference when trying to dereference 'trans' in __rds_conn_create(). Vegard wrote a reproducer for this issue, so kindly ask him to share if you're interested. I cannot reproduce the NULL pointer dereference using Vegard's reproducer with this patch, whereas I could without. Complete earlier incomplete fix to CVE-2015-6937: 74e98eb0 ("RDS: verify the underlying transport exists before creating a connection") Cc: David S. Miller <davem@davemloft.net> Cc: stable@vger.kernel.org Reviewed-by: NVegard Nossum <vegard.nossum@oracle.com> Reviewed-by: NSasha Levin <sasha.levin@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NQuentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-