Commits · 571e6776add4f499661e761e03e46ec0f6d66243 · openanolis / cloud-kernel

09 Mar, 2018 2 commits

rds: rds_info_from_znotifier() can be static · 571e6776

Committed by kbuild test robot on Mar 08, 2018

Fixes: 9426bbc6 ("rds: use list structure to track information for zerocopy completion notification")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

571e6776

rds: rds_message_zcopy_from_user() can be static · 496c7f3c

Committed by kbuild test robot on Mar 08, 2018

Fixes: d40a126b ("rds: refactor zcopy code into rds_message_zcopy_from_user")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

496c7f3c

08 Mar, 2018 2 commits

rds: use list structure to track information for zerocopy completion notification · 9426bbc6

Committed by Sowmini Varadhan on Mar 06, 2018

Commit 401910db ("rds: deliver zerocopy completion notification
with data") removes support for zerocopy completion notification
on the sk_error_queue, thus we no longer need to track the cookie
information in sk_buff structures.

This commit replaces the struct sk_buff_head rs_zcookie_queue with
a simpler list that results in a smaller memory footprint as well
as more efficient memory allocation.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9426bbc6

rds: refactor zcopy code into rds_message_zcopy_from_user · d40a126b

Committed by Sowmini Varadhan on Mar 06, 2018

Move the large block of code predicated on zcopy from
rds_message_copy_from_user into a new function,
rds_message_zcopy_from_user().
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d40a126b

28 Feb, 2018 1 commit

rds: deliver zerocopy completion notification with data · 401910db

Committed by Sowmini Varadhan on Feb 27, 2018

This commit is an optimization over commit 01883eda
("rds: support for zcopy completion notification") for PF_RDS sockets.

RDS applications are predominantly request-response transactions, so
it is more efficient to reduce the number of system calls and have
zerocopy completion notification delivered as ancillary data on the
POLLIN channel.

Cookies are passed up as ancillary data (at level SOL_RDS) in a
struct rds_zcopy_cookies when the returned value of recvmsg() is
greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
with each message.

This commit removes support for zerocopy completion notification on
MSG_ERRQUEUE for PF_RDS sockets.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

401910db
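
The receive side described in commit 401910db could be sketched roughly as follows. This is a user-space illustration, not the kernel or library code: the helper `collect_zcopy_cookies` is hypothetical, the `EX_` constants are stand-ins for `SOL_RDS`, the completion cmsg type, and `RDS_MAX_ZCOOKIES` from `<linux/rds.h>`, and the layout of `struct ex_zcopy_cookies` (a count followed by 32-bit cookies) is an assumption about `struct rds_zcopy_cookies`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Stand-ins; a real program would take these from <linux/rds.h>.
 * The numeric values and the struct layout are assumptions. */
#define EX_SOL_RDS                   276
#define EX_RDS_CMSG_ZCOPY_COMPLETION 13
#define EX_MAX_ZCOOKIES              8

struct ex_zcopy_cookies {
    uint32_t num;                       /* number of valid cookies */
    uint32_t cookies[EX_MAX_ZCOOKIES];
};

/* Walk the control messages filled in by a recvmsg() call that returned
 * >= 0 and copy every completed cookie into out[] (at most cap entries).
 * Returns the number of cookies copied. */
static size_t collect_zcopy_cookies(struct msghdr *msg, uint32_t *out,
                                    size_t cap)
{
    struct cmsghdr *cmsg;
    size_t n = 0;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        struct ex_zcopy_cookies ck;
        uint32_t i;

        if (cmsg->cmsg_level != EX_SOL_RDS ||
            cmsg->cmsg_type != EX_RDS_CMSG_ZCOPY_COMPLETION ||
            cmsg->cmsg_len < CMSG_LEN(sizeof(ck)))
            continue;                   /* not a completion message */
        memcpy(&ck, CMSG_DATA(cmsg), sizeof(ck));
        for (i = 0; i < ck.num && i < EX_MAX_ZCOOKIES && n < cap; i++)
            out[n++] = ck.cookies[i];
    }
    return n;
}
```

Each returned cookie identifies a buffer that the application may now free or reuse.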

17 Feb, 2018 3 commits

rds: zerocopy Tx support. · 0cebacce

Committed by Sowmini Varadhan on Feb 15, 2018

If the MSG_ZEROCOPY flag is specified with rds_sendmsg(), and,
if the SO_ZEROCOPY socket option has been set on the PF_RDS socket,
application pages sent down with rds_sendmsg() are pinned.

The pinning uses the accounting infrastructure added by
commit a91dbff5 ("sock: ulimit on MSG_ZEROCOPY pages").

The payload bytes in the message may not be modified for the
duration that the message has been pinned. A multi-threaded
application using this infrastructure may thus need to be notified
about send-completion so that it can free/reuse the buffers
passed to rds_sendmsg(). Notification of send-completion will
identify each message-buffer by a cookie that the application
must specify as ancillary data to rds_sendmsg().
The ancillary data in this case has cmsg_level == SOL_RDS
and cmsg_type == RDS_CMSG_ZCOPY_COOKIE.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0cebacce
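
A minimal sketch of how an application might attach the cookie that commit 0cebacce describes. The helper `attach_zcopy_cookie` is hypothetical, the `EX_` constants are stand-ins for `SOL_RDS` and `RDS_CMSG_ZCOPY_COOKIE` from `<linux/rds.h>`, and the 32-bit cookie width is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Stand-ins for SOL_RDS and RDS_CMSG_ZCOPY_COOKIE; a real program would
 * take both from <linux/rds.h>. The numeric values are assumptions. */
#define EX_SOL_RDS               276
#define EX_RDS_CMSG_ZCOPY_COOKIE 12

/* Hypothetical helper: store `cookie` as the only control message of
 * `msg`, using the caller-provided buffer `buf`. Returns 0 on success,
 * -1 if the buffer is too small for one cookie. */
static int attach_zcopy_cookie(struct msghdr *msg, void *buf, size_t buflen,
                               uint32_t cookie)
{
    struct cmsghdr *cmsg;

    if (buflen < CMSG_SPACE(sizeof(cookie)))
        return -1;
    memset(buf, 0, buflen);
    msg->msg_control = buf;
    msg->msg_controllen = CMSG_SPACE(sizeof(cookie));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = EX_SOL_RDS;
    cmsg->cmsg_type = EX_RDS_CMSG_ZCOPY_COOKIE;
    cmsg->cmsg_len = CMSG_LEN(sizeof(cookie));
    memcpy(CMSG_DATA(cmsg), &cookie, sizeof(cookie));
    return 0;
}
```

The prepared message would then be passed to sendmsg() with the MSG_ZEROCOPY flag on a PF_RDS socket that has SO_ZEROCOPY set, and the payload left untouched until the cookie is reported back.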

rds: support for zcopy completion notification · 01883eda

Committed by Sowmini Varadhan on Feb 15, 2018

RDS removes a datagram (rds_message) from the retransmit queue when
an ACK is received. The ACK indicates that the receiver has queued
the RDS datagram, so that the sender can safely forget the datagram.
When all references to the rds_message are quiesced, rds_message_purge
is called to release resources used by the rds_message.

If the datagram to be removed had pinned pages set up, add
an entry to the rs->rs_znotify_queue so that the notification
will be sent up via rds_rm_zerocopy_callback() when the
rds_message is eventually freed by rds_message_purge.

rds_rm_zerocopy_callback() attempts to batch the number of cookies
sent with each notification to a max of SO_EE_ORIGIN_MAX_ZCOOKIES.
This is achieved by checking the tail skb in the sk_error_queue:
if this has room for one more cookie, the cookie from the
current notification is added; else a new skb is added to the
sk_error_queue. Every invocation of rds_rm_zerocopy_callback() will
trigger a ->sk_error_report to notify the application.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01883eda
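
The batching rule in commit 01883eda (add to the tail entry if it has room, otherwise start a new entry) can be illustrated in plain user-space C. This is only a sketch: a simple linked list stands in for the sk_error_queue, all `ex_` names are hypothetical, and `EX_MAX_COOKIES` is an assumed stand-in for SO_EE_ORIGIN_MAX_ZCOOKIES.

```c
#include <stdint.h>
#include <stdlib.h>

#define EX_MAX_COOKIES 8    /* assumed stand-in for SO_EE_ORIGIN_MAX_ZCOOKIES */

struct ex_batch {
    struct ex_batch *next;
    uint32_t num;                       /* cookies stored in this batch */
    uint32_t cookies[EX_MAX_COOKIES];
};

struct ex_queue {
    struct ex_batch *head, *tail;
    unsigned int batches;
};

/* Append `cookie` to the tail batch if it has room; otherwise allocate a
 * new batch and make it the tail. Returns 0, or -1 if allocation fails. */
static int ex_queue_cookie(struct ex_queue *q, uint32_t cookie)
{
    struct ex_batch *b = q->tail;

    if (!b || b->num == EX_MAX_COOKIES) {
        b = calloc(1, sizeof(*b));
        if (!b)
            return -1;
        if (q->tail)
            q->tail->next = b;
        else
            q->head = b;
        q->tail = b;
        q->batches++;
    }
    b->cookies[b->num++] = cookie;
    return 0;
}
```

With this rule, ten completions queued back to back occupy two batches rather than ten separate entries.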

rds: hold a sock ref from rds_message to the rds_sock · ea8994cb

Committed by Sowmini Varadhan on Feb 15, 2018

The existing model holds a reference from the rds_sock to the
rds_message, but the rds_message does not itself hold a reference
(sock_hold()) on the rds_sock. Instead the m_rs field in the rds_message is
assigned when the message is queued on the sock, and nulled when
the message is dequeued from the sock.

We want to be able to notify userspace when the rds_message
is actually freed (from rds_message_purge(), after the refcounts
to the rds_message go to 0). At the time that rds_message_purge()
is called, the message is no longer on the rds_sock retransmit
queue. Thus the explicit reference for the m_rs is needed to
send a notification that will signal to userspace that
it is now safe to free/reuse any pages that may have
been pinned down for zerocopy.

This patch manages the m_rs assignment in the rds_message with
the necessary refcount book-keeping.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ea8994cb
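
The refcount book-keeping that commit ea8994cb describes might look like the following in a simplified user-space model. The `ex_` types and helpers are hypothetical, and plain integers stand in for the kernel's socket reference counting.

```c
#include <stddef.h>

struct ex_sock {
    int refcnt;     /* simplified stand-in for the socket refcount */
    int notified;   /* completion notifications delivered so far */
};

struct ex_msg {
    struct ex_sock *m_rs;   /* owning socket, NULL when detached */
};

static void ex_sock_hold(struct ex_sock *s) { s->refcnt++; }
static void ex_sock_put(struct ex_sock *s)  { s->refcnt--; }

/* Assigning m_rs takes a reference, so the socket cannot go away while
 * the message still points at it. */
static void ex_msg_attach(struct ex_msg *m, struct ex_sock *s)
{
    ex_sock_hold(s);
    m->m_rs = s;
}

/* On purge the socket is still valid, so the notification can be sent
 * before the reference is dropped. */
static void ex_msg_purge(struct ex_msg *m)
{
    if (m->m_rs) {
        m->m_rs->notified++;
        ex_sock_put(m->m_rs);
        m->m_rs = NULL;
    }
}
```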

05 Jul, 2017 1 commit

net, rds: convert rds_message.m_refcount from atomic_t to refcount_t · 6c5a1c4a

Committed by Reshetova, Elena on Jul 04, 2017

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c5a1c4a

18 Nov, 2016 1 commit

RDS: TCP: Track peer's connection generation number · 905dd418

Committed by Sowmini Varadhan on Nov 16, 2016

The RDS transport has to be able to distinguish between
two types of failure events:
(a) when the transport fails (e.g., TCP connection reset)
    but the RDS socket/connection layer on both sides stays
    the same
(b) when the peer's RDS layer itself resets (e.g., due to module
    reload or machine reboot at the peer)
In case (a) both sides must reconnect and continue the RDS messaging
without any message loss or disruption to the message sequence numbers,
and this is achieved by rds_send_path_reset().

In case (b) we should reset all rds_connection state to the
new incarnation of the peer. Examples of state that needs to
be reset are next expected rx sequence number from, or messages to be
retransmitted to, the new incarnation of the peer.

To achieve this, the RDS handshake probe added as part of
commit 5916e2c1 ("RDS: TCP: Enable multipath RDS for TCP")
is enhanced so that sender and receiver of the RDS ping-probe
will add a generation number as part of the RDS_EXTHDR_GEN_NUM
extension header. Each peer stores local and remote generation
numbers as part of each rds_connection. Changes in generation
number will be detected via incoming handshake probe ping
request or response and will allow the receiver to reset rds_connection
state.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

905dd418
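
The generation-number check from commit 905dd418 can be sketched as below. This is a simplified model under stated assumptions: `struct ex_conn` and `ex_peer_gen_update` are hypothetical, a generation number of 0 is assumed to mean "none seen or none carried", and resetting the fields to 0 merely stands in for whatever state the real connection reset clears.

```c
#include <stdint.h>

struct ex_conn {
    uint32_t peer_gen;              /* last generation seen, 0 = none yet */
    uint64_t next_rx_seq;           /* next expected rx sequence number */
    unsigned int retrans_pending;   /* messages queued for retransmit */
};

/* Record the generation number carried by a handshake probe. If it
 * differs from a previously stored one, the peer's RDS layer restarted
 * (case (b)), so per-peer state is reset. Returns 1 if a reset happened. */
static int ex_peer_gen_update(struct ex_conn *c, uint32_t gen)
{
    int reset = 0;

    if (gen == 0)
        return 0;                   /* probe carried no generation number */
    if (c->peer_gen != 0 && c->peer_gen != gen) {
        c->next_rx_seq = 0;
        c->retrans_pending = 0;
        reset = 1;
    }
    c->peer_gen = gen;
    return reset;
}
```

A plain transport failure (case (a)) leaves the generation number unchanged, so the sequence state survives the reconnect.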

16 Jul, 2016 1 commit

RDS: TCP: Enable multipath RDS for TCP · 5916e2c1

Committed by Sowmini Varadhan on Jul 14, 2016

Use RDS probe-ping to compute how many paths may be used with
the peer, and to synchronously start the multiple paths. If multipath
RDS (mprds) is supported by the transport, hash outgoing traffic to one
of the multiple paths in rds_sendmsg().

CC: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5916e2c1
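
As a rough illustration of hashing traffic onto paths as in commit 5916e2c1, a sender could map a per-socket value onto a path index. The function `ex_pick_path` and its modulo-based mapping are assumptions for illustration; the commit message does not say which hash the kernel uses.

```c
#include <stdint.h>

/* Hypothetical path selector: map a per-socket hash seed onto one of
 * `npaths` paths, falling back to path 0 when multipath is not in use. */
static unsigned int ex_pick_path(uint32_t hash_seed, unsigned int npaths)
{
    if (npaths <= 1)
        return 0;
    return hash_seed % npaths;
}
```

Because the seed is per socket, all messages from one socket land on the same path, which keeps their ordering intact.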

08 Feb, 2015 1 commit

rds: Make rds_message_copy_from_user() return 0 on success. · d0a47d32

Committed by Sowmini Varadhan on Feb 05, 2015

Commit 083735f4 ("rds: switch rds_message_copy_from_user() to iov_iter")
breaks rds_message_copy_from_user() semantics on success, and causes it
to return nbytes copied, when it should return 0. This commit fixes that bug.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d0a47d32

16 Dec, 2014 1 commit

rds: Fix min() warning in rds_message_inc_copy_to_user() · 6ff4a8ad

Committed by Geert Uytterhoeven on Dec 15, 2014

net/rds/message.c: In function ‘rds_message_inc_copy_to_user’:
net/rds/message.c:328: warning: comparison of distinct pointer types lacks a cast

Use min_t(unsigned long, ...) as is done in
rds_message_copy_from_user().
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6ff4a8ad
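
The idea behind min_t() in commit 6ff4a8ad is to convert both operands to one explicit type before comparing them. The macro below is a simplified user-space stand-in, not the kernel definition; unlike the kernel macro it evaluates its arguments more than once, and `ex_chunk_len` is a hypothetical caller.

```c
#include <stddef.h>

/* Simplified stand-in for min_t(): cast both operands to `type`, then
 * compare. Arguments are evaluated twice, so avoid side effects. */
#define ex_min_t(type, a, b) \
    ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical caller: how many bytes to copy from the current
 * scatterlist entry, given the bytes still wanted (unsigned long) and
 * the bytes left in the entry (unsigned int). */
static unsigned long ex_chunk_len(unsigned long bytes_left,
                                  unsigned int sg_len, unsigned int sg_off)
{
    return ex_min_t(unsigned long, bytes_left, sg_len - sg_off);
}
```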

24 Nov, 2014 2 commits

rds: switch rds_message_copy_from_user() to iov_iter · 083735f4

Committed by Al Viro on Nov 20, 2014

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

083735f4

rds: switch ->inc_copy_to_user() to passing iov_iter · c310e72c

Committed by Al Viro on Nov 20, 2014

instances get considerably simpler from that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

c310e72c

05 Mar, 2013 2 commits

rds: simplify a warning message · 7dac1b51

Committed by Cong Wang on Mar 03, 2013

Cc: David S. Miller <davem@davemloft.net>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7dac1b51

rds: limit the size allocated by rds_message_alloc() · ece6b0a2

Committed by Cong Wang on Mar 03, 2013

Dave Jones reported the following bug:

"When fed mangled socket data, rds will trust what userspace gives it,
and tries to allocate enormous amounts of memory larger than what
kmalloc can satisfy."

WARNING: at mm/page_alloc.c:2393 __alloc_pages_nodemask+0xa0d/0xbe0()
Hardware name: GA-MA78GM-S2H
Modules linked in: vmw_vsock_vmci_transport vmw_vmci vsock fuse bnep dlci bridge 8021q garp stp mrp binfmt_misc l2tp_ppp l2tp_core rfcomm s
Pid: 24652, comm: trinity-child2 Not tainted 3.8.0+ #65
Call Trace:
 [<ffffffff81044155>] warn_slowpath_common+0x75/0xa0
 [<ffffffff8104419a>] warn_slowpath_null+0x1a/0x20
 [<ffffffff811444ad>] __alloc_pages_nodemask+0xa0d/0xbe0
 [<ffffffff8100a196>] ? native_sched_clock+0x26/0x90
 [<ffffffff810b2128>] ? trace_hardirqs_off_caller+0x28/0xc0
 [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff811861f8>] alloc_pages_current+0xb8/0x180
 [<ffffffff8113eaaa>] __get_free_pages+0x2a/0x80
 [<ffffffff811934fe>] kmalloc_order_trace+0x3e/0x1a0
 [<ffffffff81193955>] __kmalloc+0x2f5/0x3a0
 [<ffffffff8104df0c>] ? local_bh_enable_ip+0x7c/0xf0
 [<ffffffffa0401ab3>] rds_message_alloc+0x23/0xb0 [rds]
 [<ffffffffa04043a1>] rds_sendmsg+0x2b1/0x990 [rds]
 [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff81564620>] sock_sendmsg+0xb0/0xe0
 [<ffffffff810b2052>] ? get_lock_stats+0x22/0x70
 [<ffffffff810b24be>] ? put_lock_stats.isra.23+0xe/0x40
 [<ffffffff81567f30>] sys_sendto+0x130/0x180
 [<ffffffff810b872d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff816c547b>] ? _raw_spin_unlock_irq+0x3b/0x60
 [<ffffffff816cd767>] ? sysret_check+0x1b/0x56
 [<ffffffff810b8695>] ? trace_hardirqs_on_caller+0x115/0x1a0
 [<ffffffff81341d8e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff816cd742>] system_call_fastpath+0x16/0x1b
---[ end trace eed6ae990d018c8b ]---
Reported-by: Dave Jones <davej@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ece6b0a2
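
The commit message for ece6b0a2 does not show the fix itself; one plausible shape, sketched here in user-space C, is to reject oversized requests before calling the allocator. `EX_MAX_ALLOC` is a hypothetical cap standing in for the kernel's allocation limit, and `ex_msg_alloc` is an illustrative name.

```c
#include <stddef.h>
#include <stdlib.h>

#define EX_MAX_ALLOC (4u * 1024 * 1024)   /* hypothetical allocation cap */

struct ex_msg {
    size_t extra_len;
    unsigned char extra[];     /* caller-sized trailing storage */
};

/* Refuse requests whose total size would exceed the cap instead of
 * passing an untrusted length straight to the allocator. Comparing
 * against (cap - header) avoids overflow in header + extra_len. */
static struct ex_msg *ex_msg_alloc(size_t extra_len)
{
    struct ex_msg *m;

    if (extra_len > EX_MAX_ALLOC - sizeof(*m))
        return NULL;
    m = calloc(1, sizeof(*m) + extra_len);
    if (m)
        m->extra_len = extra_len;
    return m;
}
```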

01 Nov, 2011 1 commit

net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules · bc3b2d7f

Committed by Paul Gortmaker on Jul 15, 2011

These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

bc3b2d7f

09 Nov, 2010 1 commit

rds: Fix rds message leak in rds_message_map_pages · aa58163a

Committed by Pavel Emelyanov on Nov 08, 2010

The sgs allocation error path leaks the allocated message.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa58163a

31 Oct, 2010 1 commit

RDS: Let rds_message_alloc_sgs() return NULL · d139ff09

Committed by Andy Grover on Oct 28, 2010

Even with the previous fix, we still are reading the iovecs once
to determine SGs needed, and then again later on. Preallocating
space for sg lists as part of rds_message seemed like a good idea
but it might be better to not do this. While working to redo that
code, this patch attempts to protect against userspace rewriting
the rds_iovec array between the first and second accesses.

The consequences of this would be either a too-small or too-large
sg list array. Too large is not an issue. This patch changes all
callers of message_alloc_sgs to handle running out of preallocated
sgs, and fail gracefully.
Signed-off-by: Andy Grover <andy.grover@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d139ff09
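
The behavior commit d139ff09 describes, where running out of preallocated sgs yields NULL that callers must handle, might be modeled like this. The fixed-size pool and all `ex_` names are assumptions made for the sketch.

```c
#include <stddef.h>

struct ex_sg {
    void *page;
    unsigned int len;
};

struct ex_rm {
    unsigned int used_sgs;      /* entries already handed out */
    unsigned int total_sgs;     /* entries preallocated with the message */
    struct ex_sg sgs[8];
};

/* Reserve `nents` consecutive entries from the message's preallocated
 * pool. Returns NULL when the request is empty or the pool cannot
 * satisfy it, so callers can fail gracefully instead of overrunning. */
static struct ex_sg *ex_alloc_sgs(struct ex_rm *rm, unsigned int nents)
{
    struct ex_sg *first;

    if (nents == 0 || nents > rm->total_sgs - rm->used_sgs)
        return NULL;
    first = &rm->sgs[rm->used_sgs];
    rm->used_sgs += nents;
    return first;
}
```

If userspace enlarges the iovec array between the sizing pass and the copy pass, the second reservation fails with NULL rather than writing past the pool.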

21 Oct, 2010 1 commit

rds: make local functions/variables static · ff51bf84

Committed by stephen hemminger on Oct 19, 2010

The RDS protocol has lots of functions that should be
declared static. rds_message_get/add_version_extension is
removed since it is defined but never used.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff51bf84

09 Sep, 2010 19 commits

rds: don't let RDS shutdown a connection while senders are present · 7e3f2952

Committed by Chris Mason on May 11, 2010

This is the first in a long line of patches that tries to fix races
between RDS connection shutdown and RDS traffic.

Here we are maintaining a count of active senders to make sure
the connection doesn't go away while they are using it.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

7e3f2952

rds: per-rm flush_wait waitq · c83188dc

Committed by Chris Mason on Apr 21, 2010

This removes a global waitqueue used to wait for rds messages
and replaces it with a waitqueue inside the rds_message struct.

The global waitqueue turns into a global lock and significantly
bottlenecks operations on large machines.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

c83188dc

RDS: Do wait_event_interruptible instead of wait_event · a40aa923

Committed by Andy Grover on Mar 29, 2010

Can't see a reason not to allow signals to interrupt the wait.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

a40aa923

RDS: rds_message_unmapped() doesn't need to check if queue active · ab1a6926

Committed by Andy Grover on Mar 29, 2010

If the queue has nobody on it, then wake_up does nothing.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

ab1a6926

RDS: Use NOWAIT in message_map_pages() · f2ec76f2

Committed by Andy Grover on Mar 29, 2010

Can no longer block, so use NOWAIT.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

f2ec76f2

RDS: Add a warning if trying to allocate 0 sgs · ee4c7b47

Committed by Andy Grover on Feb 03, 2010

rds_message_alloc_sgs() only works when nents is nonzero.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

ee4c7b47

RDS: Do not set op_active in r_m_copy_from_user(). · 372cd7de

Committed by Andy Grover on Feb 03, 2010

Do not allocate sgs for data for 0-length datagrams.

Set data.op_active in rds_sendmsg() instead of
rds_message_copy_from_user().
Signed-off-by: Andy Grover <andy.grover@oracle.com>

372cd7de

RDS: Rename data op members prefix from m_ to op_ · 6c7cc6e4

Committed by Andy Grover on Jan 27, 2010

For consistency.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

6c7cc6e4

RDS: Remove struct rds_rdma_op · f8b3aaf2

Committed by Andy Grover on Mar 01, 2010

A big changeset, but it's all pretty dumb.

struct rds_rdma_op was already embedded in struct rm_rdma_op.
Remove rds_rdma_op and put its members in rm_rdma_op. Rename
members with "op_" prefix instead of "r_", for consistency.

Of course this breaks a lot, so fixup the code accordingly.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

f8b3aaf2

RDS: purge atomic resources too in rds_message_purge() · d0ab25a8

Committed by Andy Grover on Jan 27, 2010

Add atomic_free_op function, analogous to rdma_free_op,
and call it in rds_message_purge().
Signed-off-by: Andy Grover <andy.grover@oracle.com>

d0ab25a8

RDS: Implement silent atomics · 241eef3e

Committed by Andy Grover on Jan 19, 2010

Signed-off-by: Andy Grover <andy.grover@oracle.com>

241eef3e

RDS: Move loop-only function to loop.c · d37c9359

Committed by Andy Grover on Jan 19, 2010

Also, try to better-document the locking around the
rm and its m_inc in loop.c.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

d37c9359

RDS: inc_purge() transport function unused - remove it · 809fa148

Committed by Andy Grover on Jan 12, 2010

Signed-off-by: Andy Grover <andy.grover@oracle.com>

809fa148

RDS: make sure all sgs alloced are initialized · f4dd96f7

Committed by Andy Grover on Jan 12, 2010

rds_message_alloc_sgs() now returns correctly-initialized
sg lists, so callers need not do this themselves.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

f4dd96f7

RDS: make m_rdma_op a member of rds_message · ff87e97a

Committed by Andy Grover on Jan 12, 2010

This eliminates a separate memory alloc, although
it is now necessary to add an "r_active" flag, since
it is no longer possible to use the m_rdma_op pointer as an
indicator of whether an rdma op is present.

rdma SGs are allocated from the rm sg pool.

rds_rm_size also gets bigger. It's a little inefficient to
run through CMSGs twice, but it makes later steps a lot smoother.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

ff87e97a

RDS: fold rdma.h into rds.h · 21f79afa

Committed by Andy Grover on Jan 12, 2010

RDMA is now an intrinsic part of RDS, so it's easier to just have
a single header.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

21f79afa

RDS: Explicitly allocate rm in sendmsg() · fc445084

Committed by Andy Grover on Jan 12, 2010

r_m_copy_from_user used to allocate the rm as well as kernel
buffers for the data, and then copy the data in. Now, sendmsg()
allocates the rm, although the data buffer alloc still happens
in r_m_copy_from_user.

SGs are still allocated with rm, but now r_m_alloc_sgs() is
used to reserve them. This allows multiple SG lists to be
allocated from the one rm -- this is important once we also
want to alloc our rdma sgl from this pool.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

fc445084

RDS: break out rdma and data ops into nested structs in rds_message · e779137a

Committed by Andy Grover on Jan 12, 2010

Clearly separate rdma-related variables in rm from data-related ones.
This is in anticipation of adding atomic support.
Signed-off-by: Andy Grover <andy.grover@oracle.com>

e779137a

RDS: cleanup: remove "== NULL"s and "!= NULL"s in ptr comparisons · 8690bfa1

Committed by Andy Grover on Jan 12, 2010

Favor "if (foo)" style over "if (foo != NULL)".
Signed-off-by: Andy Grover <andy.grover@oracle.com>

8690bfa1

openanolis / cloud-kernel synced successfully more than 1 year ago
