Commits · 5ca114400dcd46f19f31573e7c60e638bd8d644b · openanolis / cloud-kernel

23 Jan, 2018 1 commit

rds: tcp: compute m_ack_seq as offset from ->write_seq · b589513e

Committed by Sowmini Varadhan on Jan 18, 2018

rds-tcp uses m_ack_seq to track the tcp ack# that indicates
that the peer has received a rds_message. The m_ack_seq is
used in rds_tcp_is_acked() to figure out when it is safe to
drop the rds_message from the RDS retransmit queue.

The m_ack_seq must be calculated as an offset from the right
edge of the in-flight tcp buffer, i.e., it should be based on
the ->write_seq, not the ->snd_nxt.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
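
The reasoning above can be sketched in user-space C. This is only an illustration, not the kernel's code: the names `sketch_ack_seq`, `sketch_is_acked` and `SKETCH_HDR_LEN` are hypothetical, and the header length is an assumed constant. A new message is appended at `write_seq`, which may be ahead of `snd_nxt` when earlier data is queued but not yet sent, so the sequence number of its last byte has to be measured from `write_seq`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size of the RDS header that precedes every payload. */
#define SKETCH_HDR_LEN 48u

/*
 * Sequence number of the last byte of a message that is about to be
 * appended at write_seq (the right edge of the data already queued).
 * Unsigned arithmetic wraps modulo 2^32, as TCP sequence numbers do.
 */
static uint32_t sketch_ack_seq(uint32_t write_seq, uint32_t payload_len)
{
    return write_seq + SKETCH_HDR_LEN + payload_len - 1u;
}

/*
 * True once the peer's cumulative ack (snd_una) has moved past ack_seq.
 * The signed difference keeps the comparison correct across wraparound.
 */
static bool sketch_is_acked(uint32_t ack_seq, uint32_t snd_una)
{
    return (int32_t)(ack_seq - snd_una) < 0;
}
```

A message whose last byte sits at sequence number S is reported as acked only when the cumulative ack is at least S + 1, which is the point at which it could be dropped from a retransmit queue.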

06 Jan, 2018 1 commit

rds: use RCU to synchronize work-enqueue with connection teardown · 3db6e0d1

Committed by Sowmini Varadhan on Jan 04, 2018

rds_sendmsg() can enqueue work on cp_send_w from process context, but
it should not enqueue this work if connection teardown has commenced
(else we risk enqueuing work after rds_conn_path_destroy() has assumed that
all work has been cancelled/flushed).

Similarly some other functions like rds_cong_queue_updates
and rds_tcp_data_ready are called in softirq context, and may end
up enqueuing work on rds_wq after rds_conn_path_destroy() has assumed
that all workqs are quiesced.

Check the RDS_DESTROY_PENDING bit and use rcu synchronization to avoid
all these races.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
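
The check-then-enqueue pattern described above can be sketched as follows. This is a single-threaded user-space illustration, not the kernel implementation: a reader counter stands in for RCU read-side sections, a busy-wait stands in for synchronize_rcu(), and every `sketch_` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_path {
    atomic_bool destroy_pending; /* stand-in for the RDS_DESTROY_PENDING bit */
    atomic_int  readers;         /* stand-in for RCU read-side sections */
    int         queued;          /* work items queued so far */
};

/* Enqueue side: only queue work if teardown has not started. */
static bool sketch_queue_work(struct sketch_path *cp)
{
    bool ok = false;

    atomic_fetch_add(&cp->readers, 1);      /* roughly rcu_read_lock() */
    if (!atomic_load(&cp->destroy_pending)) {
        cp->queued++;                       /* roughly queueing the work */
        ok = true;
    }
    atomic_fetch_sub(&cp->readers, 1);      /* roughly rcu_read_unlock() */
    return ok;
}

/* Teardown side: set the flag, wait out readers, then flush. */
static void sketch_destroy(struct sketch_path *cp)
{
    atomic_store(&cp->destroy_pending, true);
    while (atomic_load(&cp->readers) != 0)  /* roughly synchronize_rcu() */
        ;
    cp->queued = 0;                         /* roughly cancel/flush of work */
}
```

The ordering is the point of the sketch: the flag is set before the wait, so any enqueuer that starts after the wait has finished is guaranteed to see the flag and back off.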

17 Jul, 2017 1 commit

rds: cancel send/recv work before queuing connection shutdown · aed20a53

Committed by Sowmini Varadhan on Jul 16, 2017

We could end up executing rds_conn_shutdown before the rds_recv_worker
thread, then rds_conn_shutdown -> rds_tcp_conn_shutdown can do a
sock_release and set sock->sk to null, which may interleave in bad
ways with rds_recv_worker, e.g., it could result in:

"BUG: unable to handle kernel NULL pointer dereference at 0000000000000078"
    [ffff881769f6fd70] release_sock at ffffffff815f337b
    [ffff881769f6fd90] rds_tcp_recv at ffffffffa043c888 [rds_tcp]
    [ffff881769f6fdb0] rds_recv_worker at ffffffffa04a4810 [rds]
    [ffff881769f6fde0] process_one_work at ffffffff810a14c1
    [ffff881769f6fe40] worker_thread at ffffffff810a1940
    [ffff881769f6fec0] kthread at ffffffff810a6b1e

Also, do not enqueue any new shutdown workq items when the connection is
shutting down (this may happen for rds-tcp in softirq mode, if a FIN
or CLOSE is received while the module is in the middle of an unload).
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01 Jul, 2017 1 commit

net: convert sock.sk_wmem_alloc from atomic_t to refcount_t · 14afee4b

Committed by Reshetova, Elena on Jun 30, 2017

The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
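
To illustrate why a dedicated reference-counter type helps, here is a minimal user-space sketch of a saturating counter. It is not the kernel's refcount_t implementation and is not atomic; the `sketch_` names are hypothetical. The idea shown is that a counter which would overflow is pinned at its maximum, so the object is leaked rather than freed while still in use.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct sketch_refcount {
    unsigned int refs;
};

/* Saturating increment: once pinned at UINT_MAX the counter never moves. */
static void sketch_ref_inc(struct sketch_refcount *r)
{
    if (r->refs != UINT_MAX)
        r->refs++;
}

/*
 * Returns true when the last reference is dropped. A saturated counter
 * is never decremented, so its object is leaked instead of being freed.
 */
static bool sketch_ref_dec_and_test(struct sketch_refcount *r)
{
    if (r->refs == UINT_MAX)
        return false;
    assert(r->refs > 0); /* dropping a reference that was never taken */
    return --r->refs == 0;
}
```

A plain wrapping counter would instead roll over to zero, and a later decrement could free an object that still has live users.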

06 Apr, 2017 1 commit

don't open-code kernel_setsockopt() · e73a67f7

Committed by Al Viro on Mar 18, 2017

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

18 Nov, 2016 1 commit

RDS: TCP: set RDS_FLAG_RETRANSMITTED in cp_retrans list · 315ca6d9

Committed by Sowmini Varadhan on Nov 16, 2016

As noted in rds_recv_incoming(), sequence numbers on data packets
can decrease in the failover case, and the Rx path is equipped
to recover from this if RDS_FLAG_RETRANSMITTED is set
on the rds header of an incoming message with a suspect sequence
number.

Setting RDS_FLAG_RETRANSMITTED in the header is predicated on the
RDS_MSG_RETRANSMITTED flag in the rds_message, so make sure the flag
is set on messages queued for retransmission.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Jul, 2016 1 commit

RDS: TCP: Enable multipath RDS for TCP · 5916e2c1

Committed by Sowmini Varadhan on Jul 14, 2016

Use RDS probe-ping to compute how many paths may be used with
the peer, and to synchronously start the multiple paths. When the
transport supports multipath RDS (mprds), hash outgoing traffic to one
of the multiple paths in rds_sendmsg().

CC: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
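
One plausible way to hash a sender onto a path is sketched below. The function name, the per-sender hash value and the power-of-two restriction are all assumptions made for the illustration; the commit message does not spell out the actual hash.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick a path index for a sender. npaths is assumed to be a power of
 * two so that masking is equivalent to a modulo; a single-path
 * connection always maps to path 0.
 */
static unsigned int sketch_pick_path(uint32_t sender_hash, unsigned int npaths)
{
    assert(npaths != 0 && (npaths & (npaths - 1)) == 0);
    return sender_hash & (npaths - 1);
}
```

Because the index depends only on the sender's hash and the path count, the same sender keeps landing on the same path for as long as the path count is unchanged.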

02 Jul, 2016 2 commits

RDS: TCP: make ->sk_user_data point to a rds_conn_path · ea3b1ea5

Committed by Sowmini Varadhan on Jun 30, 2016

The socket callbacks should all operate on a struct rds_conn_path,
in preparation for a MP capable RDS-TCP.
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

RDS: Rework path specific indirections · 226f7a7d

Committed by Sowmini Varadhan on Jun 30, 2016

Refactor code to avoid separate indirections for single-path
and multipath transports. All transports (both single and mp-capable)
will get a pointer to the rds_conn_path, and can trivially derive
the rds_connection from the ->cp_conn.
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
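
The back-pointer arrangement described above can be sketched with simplified structures. Only the `cp_conn` member name comes from the commit message; the `sketch_` types, the other fields and the path limit are hypothetical stand-ins for the much larger kernel structures.

```c
#include <assert.h>

#define SKETCH_MAX_PATHS 8

struct sketch_conn;

struct sketch_conn_path {
    struct sketch_conn *cp_conn;  /* back-pointer to the owning connection */
    unsigned int        cp_index; /* position within the owner's path array */
};

struct sketch_conn {
    unsigned int            c_npaths;
    struct sketch_conn_path c_path[SKETCH_MAX_PATHS];
};

/* Set up npaths paths, each pointing back at its connection. */
static void sketch_conn_init(struct sketch_conn *conn, unsigned int npaths)
{
    assert(npaths >= 1 && npaths <= SKETCH_MAX_PATHS);
    conn->c_npaths = npaths;
    for (unsigned int i = 0; i < npaths; i++) {
        conn->c_path[i].cp_conn = conn;
        conn->c_path[i].cp_index = i;
    }
}

/* A transport callback receives only the path and derives the connection. */
static struct sketch_conn *sketch_path_to_conn(struct sketch_conn_path *cp)
{
    return cp->cp_conn;
}
```

With this layout a single-path transport is simply the `npaths == 1` case, so the same callback signature serves both kinds of transport.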

19 Jun, 2016 1 commit

net: rds: fix coding style issues · 5c3da57d

Committed by Joshua Houghton on Jun 18, 2016

Fix coding style issues in the following files:

ib_cm.c:      add space
loop.c:       convert spaces to tabs
sysctl.c:     add space
tcp.h:        convert spaces to tabs
tcp_connect.c: remove extra indentation in switch statement
tcp_recv.c:   convert spaces to tabs
tcp_send.c:   convert spaces to tabs
transport.c:  move brace up one line on for statement
Signed-off-by: Joshua Houghton <josh@awful.name>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

15 Jun, 2016 1 commit

RDS: split out connection specific state from rds_connection to rds_conn_path · 0cb43965

Committed by Sowmini Varadhan on Jun 13, 2016

In preparation for multipath RDS, split the rds_connection
structure into a base structure, and a per-path struct rds_conn_path.
The base structure tracks information and locks common to all
paths. The workqs for send/recv/shutdown etc are tracked per
rds_conn_path. Thus the workq callbacks now work with rds_conn_path.

This commit allows for one rds_conn_path per rds_connection, and will
be extended into multiple conn_paths in subsequent commits.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 May, 2016 1 commit

rds: tcp: block BH in TCP callbacks · 38036629

Committed by Eric Dumazet on May 17, 2016

TCP stack can now run from process context.

Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous
assumption.

Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Oct, 2015 1 commit

RDS-TCP: Set up MSG_MORE and MSG_SENDPAGE_NOTLAST as appropriate in rds_tcp_xmit · 76b29ef1

Committed by Sowmini Varadhan on Sep 30, 2015

For the same reasons as commit 2f533844 ("tcp: allow splice() to
build full TSO packets") and commit 35f9c09f ("tcp: tcp_sendpages()
should call tcp_push() once"), rds_tcp_xmit may have multiple pages to
send, so use MSG_MORE and MSG_SENDPAGE_NOTLAST as hints to
tcp_sendpage().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
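
The hinting idea can be sketched as a per-page flag choice. The flag values and the helper below are hypothetical placeholders chosen for the illustration (the kernel defines its own MSG_* constants), and the sketch does not reproduce rds_tcp_xmit itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for the sketch; the kernel defines its own values. */
#define SKETCH_MSG_DONTWAIT         0x1
#define SKETCH_MSG_MORE             0x2
#define SKETCH_MSG_SENDPAGE_NOTLAST 0x4

/*
 * Flags for sending page number idx out of npages. Every page except
 * the last carries the "more data follows" hints, so the stack can
 * defer the push until the final page has been queued.
 */
static int sketch_page_flags(size_t idx, size_t npages)
{
    int flags = SKETCH_MSG_DONTWAIT;

    assert(idx < npages);
    if (idx + 1 < npages)
        flags |= SKETCH_MSG_MORE | SKETCH_MSG_SENDPAGE_NOTLAST;
    return flags;
}
```

For a three-page message, pages 0 and 1 would carry both hints and page 2 would carry neither.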

18 Apr, 2014 1 commit

arch: Mass conversion of smp_mb__*() · 4e857c58

Committed by Peter Zijlstra on Mar 17, 2014

Mostly scripted conversion of the smp_mb__* barriers.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

23 Aug, 2012 1 commit

rds: Don't disable BH on BH context · bfdc587c

Committed by Ying Xue on Aug 19, 2012

Since we are already in BH context when *_write_space(),
*_data_ready() and *_state_change() are called, it is
unnecessary to disable BH.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Oct, 2010 1 commit

rds: make local functions/variables static · ff51bf84

Committed by stephen hemminger on Oct 19, 2010

The RDS protocol has lots of functions that should be
declared static. rds_message_get/add_version_extension is
removed since it is defined but never used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Sep, 2010 1 commit

net: fix a lockdep splat · f064af1e

Committed by Eric Dumazet on Sep 22, 2010

We have for each socket :

One spinlock (sk_slock.slock)
One rwlock (sk_callback_lock)

Possible scenarios are :

(A) (this is used in net/sunrpc/xprtsock.c)
read_lock(&sk->sk_callback_lock) (without blocking BH)
<BH>
spin_lock(&sk->sk_slock.slock);
...
read_lock(&sk->sk_callback_lock);
...

(B)
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)

(C)
spin_lock_bh(&sk->sk_slock)
...
write_lock_bh(&sk->sk_callback_lock)
stuff
write_unlock_bh(&sk->sk_callback_lock)
spin_unlock_bh(&sk->sk_slock)

This (C) case conflicts with (A) :

CPU1 [A]                         CPU2 [C]
read_lock(callback_lock)
<BH>                             spin_lock_bh(slock)
<wait to spin_lock(slock)>
                                 <wait to write_lock_bh(callback_lock)>

We have one problematic (C) use case in inet_csk_listen_stop() :

local_bh_disable();
bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
WARN_ON(sock_owned_by_user(child));
...
sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)

lockdep is not happy with this, as reported by Tetsuo Handa.

It seems the only way to deal with this is to use read_lock_bh(callbacklock)
everywhere.

Thanks to Jarek for pointing out a bug in my first attempt and suggesting
this solution.
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jarek Poplawski <jarkao2@gmail.com>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Sep, 2010 4 commits

RDS: Stop supporting old cong map sending method · 77dd550e

Committed by Andy Grover on Mar 22, 2010

We now ask the transport to give us a rm for the congestion
map, and then we handle it normally. Previously, the
transport defined a function that we would call to send
a congestion map.

Convert TCP and loop transports to new cong map method.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

RDS: Rename data op members prefix from m_ to op_ · 6c7cc6e4

Committed by Andy Grover on Jan 27, 2010

For consistency.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

RDS: break out rdma and data ops into nested structs in rds_message · e779137a

Committed by Andy Grover on Jan 12, 2010

Clearly separate rdma-related variables in rm from data-related ones.
This is in anticipation of adding atomic support.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons · 8690bfa1

Committed by Andy Grover on Jan 12, 2010

Favor "if (foo)" style over "if (foo != NULL)".
Signed-off-by: Andy Grover <andy.grover@oracle.com>

17 Mar, 2010 1 commit

RDS/TCP: Wait to wake thread when write space available · 8e82376e

Committed by Andy Grover on Mar 11, 2010

Instead of waking the send thread whenever any send space is available,
wait until it is at least half empty. This is modeled on how
sock_def_write_space() does it, and may help to minimize context
switches.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
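
The half-empty threshold can be sketched as a single comparison. The helper name and parameters are hypothetical; the comparison follows the general shape of the check the commit says it is modeled on, with `wmem_alloc` standing for the bytes still queued for transmit and `sndbuf` for the send buffer size.

```c
#include <assert.h>
#include <stdbool.h>

/* Wake the sender only when queued bytes are at most half the send buffer. */
static bool sketch_should_wake(unsigned int wmem_alloc, unsigned int sndbuf)
{
    return (wmem_alloc << 1) <= sndbuf;
}
```

With a 4096-byte send buffer, the sender is woken at 2048 queued bytes or fewer and left asleep at 2049, which avoids a wakeup for every small amount of freed space.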

04 Feb, 2010 1 commit

net/rds: remove uses of NIPQUAD, use %pI4 · 6884b348

Committed by Joe Perches on Feb 02, 2010

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Grover <andy.grover@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Aug, 2009 1 commit

RDS: Add TCP transport to RDS · 70041088

Committed by Andy Grover on Aug 21, 2009

This code allows RDS to be tunneled over a TCP connection.

RDMA operations are disabled when using TCP transport,
but this frees RDS from the IB/RDMA stack dependency, and allows
it to be used with standard Ethernet adapters, or in a VM.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openanolis / cloud-kernel: synced successfully about 2 years ago