提交 · 2da43c4a1b517d02e71d9611a2242273e7d399ba · openeuler / raspberrypi-kernel

02 7月, 2016 2 次提交

RDS: TCP: make receive path use the rds_conn_path · 2da43c4a

由 Sowmini Varadhan 提交于 6月 30, 2016

The ->sk_user_data contains a pointer to the rds_conn_path
for the socket. Use this consistently in the rds_tcp_data_ready
callbacks to get the rds_conn_path for rds_recv_incoming.
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2da43c4a

RDS: Rework path specific indirections · 226f7a7d

由 Sowmini Varadhan 提交于 6月 30, 2016

Refactor code to avoid separate indirections for single-path
and multipath transports. All transports (both single and mp-capable)
will get a pointer to the rds_conn_path, and can trivially derive
the rds_connection from the ->cp_conn.
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

226f7a7d

15 6月, 2016 11 次提交

RDS: Update rds_conn_shutdown to work with rds_conn_path · d769ef81

由 Sowmini Varadhan 提交于 6月 13, 2016

This commit changes rds_conn_shutdown to take a rds_conn_path *
argument, allowing it to shutdown paths other than c_path[0] for
MP-capable transports.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d769ef81

RDS: Add rds_conn_path_error() · fb1b3dc4

由 Sowmini Varadhan 提交于 6月 13, 2016

rds_conn_path_error() is the MP-aware analog of rds_conn_error,
to be used by multipath-capable callers.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fb1b3dc4

RDS: Add rds_conn_path_connect_if_down() for MP-aware callers · 3c0a5900

由 Sowmini Varadhan 提交于 6月 13, 2016

rds_conn_path_connect_if_down() works on the rds_conn_path
that it is passed. Callers who are not t_m_capable may continue
calling rds_conn_connect_if_down, which will invoke
rds_conn_path_connect_if_down() with the default c_path[0].
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3c0a5900

RDS: Make rds_send_pong() take a rds_conn_path argument · 45997e9e

由 Sowmini Varadhan 提交于 6月 13, 2016

This commit allows rds_send_pong() callers to send back
the rds pong message on some path other than c_path[0] by
passing in a struct rds_conn_path * argument.  It also
removes the last dependency on the #defines in rds_single.h
from send.c
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

45997e9e

RDS: Pass rds_conn_path to rds_send_xmit() · 1f9ecd7e

由 Sowmini Varadhan 提交于 6月 13, 2016

Pass a struct rds_conn_path to rds_send_xmit so that MP capable
transports can transmit packets on something other than c_path[0].
The eventual goal for MP capable transports is to hash the rds
socket to a path based on the bound local address/port, and use
this path as the argument to rds_send_xmit()
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1f9ecd7e

RDS: Remove stale function rds_send_get_message() · 7d885d0f

由 Sowmini Varadhan 提交于 6月 13, 2016

The only caller of rds_send_get_message() was
rds_iw_send_cq_comp_handler() which was removed as part of
commit dcdede04 ("RDS: Drop stale iWARP RDMA transport"),
so remove rds_send_get_message() for the same reason.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7d885d0f

RDS: Add rds_send_path_drop_acked() · 5c3d274c

由 Sowmini Varadhan 提交于 6月 13, 2016

rds_send_path_drop_acked() is the path-specific version of
rds_send_drop_acked() to be invoked by MP capable callers.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c3d274c

RDS: rds_inc_path_init() helper function for MP capable transports · 5e833e02

由 Sowmini Varadhan 提交于 6月 13, 2016

t_mp_capable transports can use rds_inc_path_init to initialize
all fields in struct rds_incoming, including the i_conn_path.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5e833e02

RDS: recv path gets the conn_path from rds_incoming for MP capable transports · ef9e62c2

由 Sowmini Varadhan 提交于 6月 13, 2016

Transports that are t_mp_capable should set the rds_conn_path
on which the datagram was recived in the ->i_conn_path field
of struct rds_incoming.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ef9e62c2

RDS: add t_mp_capable bit to be set by MP capable transports · 7e8f4413

由 Sowmini Varadhan 提交于 6月 13, 2016

The t_mp_capable bit will be used in the core rds module
to support multipathing logic when the transport supports it.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e8f4413

RDS: split out connection specific state from rds_connection to rds_conn_path · 0cb43965

由 Sowmini Varadhan 提交于 6月 13, 2016

In preparation for multipath RDS, split the rds_connection
structure into a base structure, and a per-path struct rds_conn_path.
The base structure tracks information and locks common to all
paths. The workqs for send/recv/shutdown etc are tracked per
rds_conn_path. Thus the workq callbacks now work with rds_conn_path.

This commit allows for one rds_conn_path per rds_connection, and will
be extended into multiple conn_paths in subsequent commits.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

0cb43965

08 6月, 2016 1 次提交

RDS: TCP: fix race windows in send-path quiescence by rds_tcp_accept_one() · 9c79440e

由 Sowmini Varadhan 提交于 6月 04, 2016

The send path needs to be quiesced before resetting callbacks from
rds_tcp_accept_one(), and commit eb192840 ("RDS:TCP: Synchronize
rds_tcp_accept_one with rds_send_xmit when resetting t_sock") achieves
this using the c_state and RDS_IN_XMIT bit following the pattern
used by rds_conn_shutdown(). However this leaves the possibility
of a race window as shown in the sequence below
    take t_conn_lock in rds_tcp_conn_connect
    send outgoing syn to peer
    drop t_conn_lock in rds_tcp_conn_connect
    incoming from peer triggers rds_tcp_accept_one, conn is
	marked CONNECTING
    wait for RDS_IN_XMIT to quiesce any rds_send_xmit threads
    call rds_tcp_reset_callbacks
    [.. race-window where incoming syn-ack can cause the conn
	to be marked UP from rds_tcp_state_change ..]
    lock_sock called from rds_tcp_reset_callbacks, and we set
	t_sock to null
As soon as the conn is marked UP in the race-window above, rds_send_xmit()
threads will proceed to rds_tcp_xmit and may encounter a null-pointer
deref on the t_sock.

Given that rds_tcp_state_change() is invoked in softirq context, whereas
rds_tcp_reset_callbacks() is in workq context, and testing for RDS_IN_XMIT
after lock_sock could result in a deadlock with tcp_sendmsg, this
commit fixes the race by using a new c_state, RDS_TCP_RESETTING, which
will prevent a transition to RDS_CONN_UP from rds_tcp_state_change().
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

9c79440e

03 3月, 2016 1 次提交

RDS: Add support for SO_TIMESTAMP for incoming messages · 5711f8b3

由 santosh.shilimkar@oracle.com 提交于 3月 01, 2016

The SO_TIMESTAMP generates time stamp for each incoming RDS messages
User app can enable it by using SO_TIMESTAMP setsocketopt() at
SOL_SOCKET level. CMSG data of cmsg type SO_TIMESTAMP contains the
time stamp in struct timeval format.
Reviewed-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5711f8b3

03 11月, 2015 1 次提交

RDS: convert bind hash table to re-sizable hashtable · 7b565434

由 santosh.shilimkar@oracle.com 提交于 10月 30, 2015

To further improve the RDS connection scalabilty on massive systems
where number of sockets grows into tens of thousands of sockets, there
is a need of larger bind hashtable. Pre-allocated 8K or 16K table is
not very flexible in terms of memory utilisation. The rhashtable
infrastructure gives us the flexibility to grow the hashtbable based
on use and also comes up with inbuilt efficient bucket(chain) handling.
Reviewed-by: NDavid Miller <davem@davemloft.net>
Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7b565434

05 10月, 2015 1 次提交

RDS: Use a single TCP socket for both send and receive. · 3b20fc38

由 Sowmini Varadhan 提交于 9月 30, 2015

Commit f711a6ae ("net/rds: RDS-TCP: Always create a new rds_sock
for an incoming connection.") modified rds-tcp so that an incoming SYN
would ignore an existing "client" TCP connection which had the local
port set to the transient port. The motivation for ignoring the existing
"client" connection in f711a6ae was to avoid race conditions and an
endless duel of reconnect attempts triggered by a restart/abort of one
of the nodes in the TCP connection.

However, having separate sockets for active and passive sides
is avoidable, and the simpler model of a single TCP socket for
both send and receives of all RDS connections associated with
that tcp socket makes for easier observability. We avoid the race
conditions from f711a6ae by attempting reconnects in rds_conn_shutdown
if, and only if, the (new) c_outgoing bit is set for RDS_TRANS_TCP.
The c_outgoing bit is initialized in __rds_conn_create().

A side-effect of re-using the client rds_connection for an incoming
SYN is the potential of encountering duelling SYNs, i.e., we
have an outgoing RDS_CONN_CONNECTING socket when we get the incoming
SYN. The logic to arbitrate this criss-crossing SYN exchange in
rds_tcp_accept_one() has been modified to emulate the BGP state
machine: the smaller IP address should back off from the connection attempt.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b20fc38

01 10月, 2015 1 次提交

RDS: Use per-bucket rw lock for bind hash-table · 9b9acde7

由 Santosh Shilimkar 提交于 2月 11, 2014

One global lock protecting hash-tables with 1024 buckets isn't
efficient and it shows up in a massive systems with truck
loads of RDS sockets serving multiple databases. The
perf data clearly highlights the contention on the rw
lock in these massive workloads.

When the contention gets worse, the code gets into a state where
it decides to back off on the lock. So while it has disabled interrupts,
it sits and backs off on this lock get. This causes the system to
become sluggish and eventually all sorts of bad things happen.

The simple fix is to move the lock into the hash bucket and
use per-bucket lock to improve the scalability.
Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>

9b9acde7

26 8月, 2015 1 次提交

RDS: make sure we post recv buffers · 73ce4317

由 santosh.shilimkar@oracle.com 提交于 8月 22, 2015

If we get an ENOMEM during rds_ib_recv_refill, we might never come
back and refill again later. Patch makes sure to kick krdsd into
helping out.

To achieve this we add RDS_RECV_REFILL flag and update in the refill
path based on that so that at least some therad will keep posting
receive buffers.

Since krdsd and softirq both might race for refill, we decide to
schedule on work queue based on ring_low instead of ring_empty.
Reviewed-by: NAjaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Signed-off-by: NSantosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

73ce4317

08 8月, 2015 1 次提交

RDS-TCP: Make RDS-TCP work correctly when it is set up in a netns other than init_net · d5a8ac28

由 Sowmini Varadhan 提交于 8月 05, 2015

Open the sockets calling sock_create_kern() with the correct struct net
pointer, and use that struct net pointer when verifying the
address passed to rds_bind().
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d5a8ac28

02 6月, 2015 1 次提交

rds: re-entry of rds_ib_xmit/rds_iw_xmit · d655a9fb

由 Wengang Wang 提交于 5月 21, 2015

The BUG_ON at line 452/453 is triggered in function rds_send_xmit.

441 while (ret) {
442 tmp = min_t(int, ret, sg->length -
443 conn->c_xmit_data_off);
444 conn->c_xmit_data_off += tmp;
445 ret -= tmp;
446 if (conn->c_xmit_data_off == sg->length) {
447 conn->c_xmit_data_off = 0;
448 sg++;
449 conn->c_xmit_sg++;
450 if (ret != 0 && conn->c_xmit_sg == rm->data.op_nents)
451 printk(KERN_ERR "conn %p rm %p sg %p ret %d\n", conn, rm, sg, ret);
452 BUG_ON(ret != 0 &&
453 conn->c_xmit_sg == rm->data.op_nents);
454 }
455 }

it is complaining the total sent length is bigger that we want to send.

rds_ib_xmit() is wrong for the second entry for the same rds_message returning
wrong value.

the sg and off passed by rds_send_xmit to rds_ib_xmit is based on
scatterlist.offset/length, but the rds_ib_xmit action is based on
scatterlist.dma_address/dma_length. in case dma_length is larger than length
there is problem. for the 2nd and later entries of rds_ib_xmit for same
rds_message, at least one of the following two is wrong:

1) the scatterlist to start with, the choosen one can far beyond the correct
one.
2) the offset to start with within the scatterlist.

fix:
add op_dmasg and op_dmaoff to rm_data_op structure indicating the scatterlist
and offset within the it to start with for rds_ib_xmit respectively. op_dmasg
and op_dmaoff are initialized to zero when doing dma mapping for the first see
of the message and are changed when filling send slots.

the same applies to rds_iw_xmit too.
Signed-off-by: NWengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

d655a9fb

01 6月, 2015 2 次提交

net/rds: Add setsockopt support for SO_RDS_TRANSPORT · d97dac54

由 Sowmini Varadhan 提交于 5月 29, 2015

An application may deterministically attach the underlying transport for
a PF_RDS socket by invoking setsockopt(2) with the SO_RDS_TRANSPORT
option at the SOL_RDS level. The integer argument to setsockopt must be
one of the RDS_TRANS_* transport types, e.g., RDS_TRANS_TCP. The option
must be specified before invoking bind(2) on the socket, and may only
be used once on the socket. An attempt to set the option on a bound
socket, or to invoke the option after a successful SO_RDS_TRANSPORT
attachment, will return EOPNOTSUPP.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d97dac54

net/rds: Declare SO_RDS_TRANSPORT and RDS_TRANS_* constants in uapi/linux/rds.h · a28c257c

由 Sowmini Varadhan 提交于 5月 29, 2015

User space applications that desire to explicitly select the
underlying transport for a PF_RDS socket may do so by using the
SO_RDS_TRANSPORT socket option at the SOL_RDS level before bind().
The integer argument provided to the socket option would be one
of the RDS_TRANS_* values, e.g., RDS_TRANS_TCP. This commit exports
the constant values need by such applications via <linux/rds.h>
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a28c257c

19 5月, 2015 1 次提交

RDS: Switch to generic logging helpers · 3c88f3dc

由 Sagi Grimberg 提交于 5月 18, 2015

Signed-off-by: NSagi Grimberg <sagig@mellanox.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>

3c88f3dc

09 4月, 2015 1 次提交

RDS: make sure not to loop forever inside rds_send_xmit · 443be0e5

由 Sowmini Varadhan 提交于 4月 08, 2015

If a determined set of concurrent senders keep the send queue full,
we can loop forever inside rds_send_xmit.  This fix has two parts.

First we are dropping out of the while(1) loop after we've processed a
large batch of messages.

Second we add a generation number that gets bumped each time the
xmit bit lock is acquired.  If someone else has jumped in and
made progress in the queue, we skip our goto restart.

Original patch by Chris Mason.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

443be0e5

03 3月, 2015 1 次提交

net: Remove iocb argument from sendmsg and recvmsg · 1b784140

由 Ying Xue 提交于 3月 02, 2015

After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.

Cc: Christoph Hellwig <hch@lst.de>
Suggested-by: NAl Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b784140

24 11月, 2014 2 次提交
- A
  rds: switch rds_message_copy_from_user() to iov_iter · 083735f4
  由 Al Viro 提交于 11月 20, 2014
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  083735f4
- A
  rds: switch ->inc_copy_to_user() to passing iov_iter · c310e72c
  由 Al Viro 提交于 11月 20, 2014
```
instances get considerably simpler from that...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  c310e72c
20 10月, 2013 1 次提交

net: misc: Remove extern from function prototypes · c1b1203d

由 Joe Perches 提交于 10月 18, 2013

There are a mix of function prototypes with and without extern
in the kernel sources.  Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler.  Its use is as unnecessary as
using auto to declare automatic/local variables in a block.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c1b1203d

20 3月, 2012 1 次提交
- C
  rds: remove the second argument of k[un]map_atomic() · 6114eab5
  由 Cong Wang 提交于 11月 25, 2011
```
Acked-by: NDavid S. Miller <davem@davemloft.net>
Signed-off-by: NCong Wang <amwang@redhat.com>
```
  6114eab5
01 11月, 2011 1 次提交

treewide: use __printf not __attribute__((format(printf,...))) · b9075fa9

由 Joe Perches 提交于 10月 31, 2011

Standardize the style for compiler based printf format verification.
Standardized the location of __printf too.

Done via script and a little typing.

$ grep -rPl --include=*.[ch] -w "__attribute__" * | \
  grep -vP "^(tools|scripts|include/linux/compiler-gcc.h)" | \
  xargs perl -n -i -e 'local $/; while (<>) { s/\b__attribute__\s*\(\s*\(\s*format\s*\(\s*printf\s*,\s*(.+)\s*,\s*(.+)\s*\)\s*\)\s*\)/__printf($1, $2)/g ; print; }'

[akpm@linux-foundation.org: revert arch bits]
Signed-off-by: NJoe Perches <joe@perches.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b9075fa9

20 1月, 2011 1 次提交

net: cleanup unused macros in net directory · 441c793a

由 Shan Wei 提交于 1月 13, 2011

Clean up some unused macros in net/*.
1. be left for code change. e.g. PGV_FROM_VMALLOC, PGV_FROM_VMALLOC, KMEM_SAFETYZONE.
2. never be used since introduced to kernel.
   e.g. P9_RDMA_MAX_SGE, UTIL_CTRL_PKT_SIZE.
Signed-off-by: NShan Wei <shanwei@cn.fujitsu.com>
Acked-by: NSjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

441c793a

21 10月, 2010 1 次提交

rds: make local functions/variables static · ff51bf84

由 stephen hemminger 提交于 10月 19, 2010

The RDS protocol has lots of functions that should be
declared static. rds_message_get/add_version_extension is
removed since it defined but never used.
Signed-off-by: NStephen Hemminger <shemminger@vyatta.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ff51bf84

09 9月, 2010 7 次提交

RDS: Implement masked atomic operations · 20c72bd5

由 Andy Grover 提交于 8月 25, 2010

Add two CMSGs for masked versions of cswp and fadd. args
struct modified to use a union for different atomic op type's
arguments. Change IB to do masked atomic ops. Atomic op type
in rds_message similarly unionized.
Signed-off-by: NAndy Grover <andy.grover@oracle.com>

20c72bd5

RDS/IB: print string constants in more places · 59f740a6

由 Zach Brown 提交于 8月 03, 2010

This prints the constant identifier for work completion status and rdma
cm event types, like we already do for IB event types.

A core string array helper is added that each string type uses.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

59f740a6

RDS: have sockets get transport module references · 5adb5bc6

由 Zach Brown 提交于 7月 23, 2010

Right now there's nothing to stop the various paths that use
rs->rs_transport from racing with rmmod and executing freed transport
code. The simple fix is to have binding to a transport also hold a
reference to the transport's module, removing this class of races.

We already had an unused t_owner field which was set for the modular
transports and which wasn't set for the built-in loop transport.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

5adb5bc6

RDS: remove old rs_transport comment · 77510481

由 Zach Brown 提交于 7月 21, 2010

rs_transport is now also used by the rdma paths once the socket is
bound. We don't need this stale comment to tell us what cscope can.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

77510481

RDS: remove __init and __exit annotation · ef87b7ea

由 Zach Brown 提交于 7月 09, 2010

The trivial amount of memory saved isn't worth the cost of dealing with section
mismatches.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

ef87b7ea

rds: fix rds_send_xmit() serialization · 0f4b1c7e

由 Zach Brown 提交于 6月 04, 2010

rds_send_xmit() was changed to hold an interrupt masking spinlock instead of a
mutex so that it could be called from the IB receive tasklet path. This broke
the TCP transport because its xmit method can block and masks and unmasks
interrupts.

This patch serializes callers to rds_send_xmit() with a simple bit instead of
the current spinlock or previous mutex. This enables rds_send_xmit() to be
called from any context and to call functions which block. Getting rid of the
c_send_lock exposes the bare c_lock acquisitions which are changed to block
interrupts.

A waitqueue is added so that rds_conn_shutdown() can wait for callers to leave
rds_send_xmit() before tearing down partial send state. This lets us get rid
of c_senders.

rds_send_xmit() is changed to check the conn state after acquiring the
RDS_IN_XMIT bit to resolve races with the shutdown path. Previously both
worked with the conn state and then the lock in the same order, allowing them
to race and execute the paths concurrently.

rds_send_reset() isn't racing with rds_send_xmit() now that rds_conn_shutdown()
properly ensures that rds_send_xmit() can't start once the conn state has been
changed. We can remove its previous use of the spinlock.

Finally, c_send_generation is redundant. Callers can race to test the c_flags
bit by simply retrying instead of racing to test the c_send_generation atomic.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

0f4b1c7e

rds: remove unused rds_send_acked_before() · 671202f3

由 Zach Brown 提交于 6月 04, 2010

rds_send_acked_before() wasn't blocking interrupts when acquiring c_lock from
user context but nothing calls it.  Rather than fix its use of c_lock we just
remove the function.
Signed-off-by: NZach Brown <zach.brown@oracle.com>

671202f3