Commits · b3a5bbfd780d9e9291f5f257be06e9ad6db11657 · openeuler / raspberrypi-kernel

27 Aug, 2015 1 commit

dlm: print error from kernel_sendpage · b3a5bbfd

Authored by Bob Peterson on Aug 27, 2015

Print a dlm-specific error when a socket error occurs
when sending a dlm message.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>

18 Aug, 2015 7 commits

dlm: sctp_accept_from_sock() can be static · 18df8a87

Authored by kbuild test robot on Aug 18, 2015

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: fix reconnecting but not sending data · 00dcffae

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

There are cases in which lowcomms_connect_sock() is called directly,
which caused the CF_WRITE_PENDING flag not to be set upon reconnect,
especially in the send_to_sock() error handling. In that case, the flag was
already cleared and no further attempt at transmitting would be made.

As dlm tends to connect when it needs to transmit something, it makes
sense to always mark this flag right after the connect.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: replace BUG_ON with a less severe handling · acee4e52

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

BUG_ON() is a severe action for this case, especially now that DLM with
SCTP will use 1 socket per association. Instead, we can just close the
socket on this error condition and return from the function.

Also move the check to an earlier stage as it won't change and thus we
can abort as soon as possible.

Although this issue was reported when still using SCTP with the 1-to-many
API, this cleanup wouldn't have been that simple back then because we
couldn't close the socket, and making sure such events would cease would
have been hard. And actually, the previous code was closing the association,
yet the SCTP layer was still raising the new data event. Probably a bug to
be fixed in SCTP.

Reported-by: <tan.hu@zte.com.cn>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: use sctp 1-to-1 API · ee44b4bc

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
needed, but it forces DLM to use sctp_do_peeloff() to mimic a
kernel_accept(), and this causes a symbol dependency on the sctp module.

By switching it to 1-to-1 API we can avoid this dependency and also
reduce quite a lot of SCTP-specific code in lowcomms.c.

The caveat is that now DLM won't always use the same src port. It will
choose a random one, just like TCP code. This allows the peers to
attempt simultaneous connections, which now are handled just like for
TCP.

Even more sharing between TCP and SCTP code on DLM is possible, but it
is intentionally left for a later commit.

Note that to use nodes with this commit, you need at least the earlier
fixes in this patchset; otherwise it will trigger some issues on old
nodes.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: fix not reconnecting on connecting error handling · 356344c4

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

If we don't clear that bit, lowcomms_connect_sock() will not schedule
another attempt, and no further attempt will be made.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
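
The role of the pending bit can be modelled in user space. The following is a minimal sketch under stated assumptions, not the kernel code: `struct conn`, `connect_sock()` and `on_connect_error()` are hypothetical stand-ins, and the bit is assumed to be CF_CONNECT_PENDING, the flag that a neighbouring commit message says lowcomms_connect_sock() checks. It shows why a bit left set after a failed attempt blocks every later retry.

```c
#include <stdbool.h>

/* Hypothetical user-space model; only the flag name comes from the log. */
enum { CF_CONNECT_PENDING = 1u << 0 };

struct conn {
    unsigned int flags;
    int attempts_scheduled;   /* counts queued connect attempts */
};

/* Schedule a connect attempt unless one is already marked pending. */
bool connect_sock(struct conn *c)
{
    if (c->flags & CF_CONNECT_PENDING)
        return false;             /* an attempt is believed to be in flight */
    c->flags |= CF_CONNECT_PENDING;
    c->attempts_scheduled++;
    return true;
}

/* Error path: drop the pending mark so a later retry can be scheduled. */
void on_connect_error(struct conn *c)
{
    c->flags &= ~(unsigned int)CF_CONNECT_PENDING;
}
```

In this model a second `connect_sock()` call is refused until `on_connect_error()` clears the flag, which mirrors the behaviour the commit message describes.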

dlm: fix race while closing connections · 0d737a8c

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

When a connection has issues, DLM may need to close it.  Therefore we
should also cancel pending work for such a connection at that time,
and not just when dlm is not willing to use this connection anymore.

Also, if we don't clear the CF_CONNECT_PENDING flag, the error handling
routines won't be able to re-connect as lowcomms_connect_sock() will
check for it.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: fix connection stealing if using SCTP · 28926a09

Authored by Marcelo Ricardo Leitner on Aug 11, 2015

When using SCTP and accepting a new connection, DLM currently validates
if the peer trying to connect to it is one of the cluster nodes, but it
doesn't check if it already has a connection to it or not.

If it already had a connection, it will be overwritten, and the new one
will be used for writes, possibly causing the node to leave the cluster
due to communication breakage.

Still, one could DoS the node by attempting N connections and keeping
them open.

To be explicit: both situations can only be triggered from other cluster
nodes, but they need nothing more than user-level permissions there.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
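
The missing check described in 28926a09 can be illustrated with a small user-space model. This is only a sketch: the `nodes` table, `MAX_NODES`, `init_nodes()` and `accept_from_node()` are hypothetical names, and how the real patch reacts to a duplicate connection is not shown in this log. The point is simply that an existing connection is looked up before a new one is adopted.

```c
#include <stddef.h>

#define MAX_NODES 8   /* hypothetical cluster size for the sketch */

struct conn { int sock; };             /* sock < 0 means "no connection" */
static struct conn nodes[MAX_NODES];

/* Mark every slot as unconnected; call once before accepting. */
void init_nodes(void)
{
    for (size_t i = 0; i < MAX_NODES; i++)
        nodes[i].sock = -1;
}

/* Returns 0 if the new socket was adopted, -1 if it was rejected. */
int accept_from_node(int nodeid, int new_sock)
{
    if (nodeid <= 0 || nodeid >= MAX_NODES)
        return -1;                     /* not a known cluster node */
    if (nodes[nodeid].sock >= 0)
        return -1;                     /* already connected: do not overwrite */
    nodes[nodeid].sock = new_sock;
    return 0;
}
```

Rejecting the second connection keeps the established one in use for writes, so a peer cannot replace it just by connecting again.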

11 May, 2015 1 commit

net: Add a struct net parameter to sock_create_kern · eeb1bd5c

Authored by Eric W. Biederman on May 08, 2015

This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

12 Jun, 2014 1 commit

dlm: keep listening connection alive with sctp mode · 883854c5

Authored by Lidong Zhong on Jun 12, 2014

The connection struct with nodeid 0 is the listening socket,
not a connection to another node.  The sctp resend function
was not checking that the nodeid was valid (non-zero), so it
would mistakenly get and resend on the listening connection
when nodeid was zero.
Signed-off-by: Lidong Zhong <lzhong@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>

12 Apr, 2014 1 commit

net: Fix use after free by removing length arg from sk_data_ready callbacks. · 676d2369

Authored by David S. Miller on Apr 11, 2014

Several spots in the kernel perform a sequence like:

	skb_queue_tail(&sk->sk_receive_queue, skb);
	sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up.  So this skb->len access potentially
touches freed memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument.  And since nobody actually cared about its
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.
Signed-off-by: David S. Miller <davem@davemloft.net>
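
The ownership problem behind 676d2369 can be sketched in user space. The types below are deliberately tiny imitations with hypothetical contents (`struct sock` here has a one-slot queue, and the consumer runs synchronously, whereas in the kernel it may run concurrently on another CPU). The sketch shows the fixed shape of the callback: it receives only the socket, so the producer never touches the buffer after queueing it.

```c
#include <stdlib.h>

struct sk_buff { unsigned int len; };

struct sock {
    struct sk_buff *queued;                 /* one-slot "receive queue" */
    unsigned int bytes_seen;
    void (*sk_data_ready)(struct sock *sk); /* no length argument */
};

/* Consumer: takes the buffer off the queue and frees it. */
void data_ready(struct sock *sk)
{
    struct sk_buff *skb = sk->queued;

    if (!skb)
        return;
    sk->queued = NULL;
    sk->bytes_seen += skb->len;   /* length is read by the buffer's owner */
    free(skb);
}

/* Producer: after queueing, skb must not be dereferenced again. */
int queue_and_notify(struct sock *sk, unsigned int len)
{
    struct sk_buff *skb = malloc(sizeof(*skb));

    if (!skb)
        return -1;
    skb->len = len;
    sk->queued = skb;             /* ownership passes to the consumer here */
    sk->sk_data_ready(sk);        /* the callback needs only the socket */
    return 0;
}
```

With the old two-argument form, the producer would have had to evaluate `skb->len` after the queueing step, which is exactly the access the commit message warns about.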

22 Jan, 2014 1 commit

sctp: remove macros sctp_{lock|release}_sock · 048ed4b6

Authored by wangweidong on Jan 21, 2014

The sctp_{lock|release}_sock macros were defined as {lock|release}_sock
for the sake of user-space-friendly code, which we haven't used in years,
so remove them.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16 Dec, 2013 1 commit

dlm: set zero linger time on sctp socket · ece35848

Authored by Dongmao Zhang on Dec 10, 2013

The recovery time for a failed node was taking a long
time because the failed node could not perform the full
shutdown process.  Removing the linger time speeds this
up.  The dlm does not care what happens to messages to
or from the failed node.
Signed-off-by: Dongmao Zhang <dmzhang@suse.com>
Signed-off-by: David Teigland <teigland@redhat.com>
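
For readers unfamiliar with linger settings, a user-space analogue of a zero linger time is shown below. It is an illustration only: the helper name `set_zero_linger()` is hypothetical, and the kernel patch itself (not reproduced in this log) operates on the in-kernel SCTP socket rather than a file descriptor.

```c
#include <sys/socket.h>

/* Enable lingering with a timeout of zero: close() then discards any
 * unsent data and tears the connection down at once instead of waiting
 * for a graceful shutdown. */
int set_zero_linger(int fd)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };

    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

This trades delivery of in-flight messages for a fast teardown, which matches the commit's remark that the dlm does not care what happens to messages to or from the failed node.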

19 Jun, 2013 1 commit

dlm: remove duplicated include from lowcomms.c · 06452eb0

Authored by Wei Yongjun on Jun 19, 2013

Remove duplicated include.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David Teigland <teigland@redhat.com>

15 Jun, 2013 6 commits

dlm: disable nagle for SCTP · 86e92ad2

Authored by Mike Christie on Jun 14, 2013

For TCP we disable Nagle and I cannot think of why it would be needed
for SCTP. When disabled it seems to improve dlm_lock operations like it
does for TCP.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: retry failed SCTP sends · 5d689871

Authored by Mike Christie on Jun 14, 2013

Currently, if an SCTP send fails, we lose the data we were trying
to send because the writequeue_entry is released when we do the send.
When this happens other nodes will then hang waiting for a reply.

This adds support for SCTP to retry the send operation.

I also removed the retry limit for SCTP use, because we want
to make sure we try every path during init time and for longer
failures we want to continually retry in case paths come back up
while trying other paths. We will do this until userspace tells us
to stop.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: try other IPs when sctp init assoc fails · 98e1b60e

Authored by Mike Christie on Jun 14, 2013

Currently, if we cannot create an association to the first IP addr
that is added to DLM, the SCTP init assoc code will just retry
the same IP. This patch adds a simple failover scheme where we
will try one of the addresses that were passed into DLM.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: clear correct bit during sctp init failure handling · b390ca38

Authored by Mike Christie on Jun 14, 2013

We should be testing and clearing the init pending bit because later,
when sctp_init_assoc is called again, it will check that the bit is not
set and then set it.

We do not want to touch CF_CONNECT_PENDING here because we will queue
swork and process_send_sockets will then call the connect_action function.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: set sctp assoc id during setup · e1631d0c

Authored by Mike Christie on Jun 14, 2013

sctp_assoc was not getting set so later lookups failed.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: clear correct init bit during sctp setup · efad7e6b

Authored by Mike Christie on Jun 14, 2013

We were clearing the base con's init pending flags, but the
con for the node was the one with the pending bit set.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: David Teigland <teigland@redhat.com>

10 Apr, 2013 1 commit

net: sctp: introduce uapi header for sctp · 1b866434

Authored by Daniel Borkmann on Apr 09, 2013

This patch introduces a UAPI header for the SCTP protocol,
so that we can facilitate the maintenance and development of
user land applications or libraries, in particular in terms
of header synchronization.

To not break compatibility, some fragments from lksctp-tools'
netinet/sctp.h have been carefully included, while taking care
that neither kernel nor user land breaks, so both compile fine
with this change (for lksctp-tools I tested with the old
netinet/sctp.h header and with a newly adapted one that includes
the uapi sctp header). The lksctp-tools smoke tests ran through
successfully as well in both cases.
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Feb, 2013 1 commit

hlist: drop the node parameter from iterators · b67bfe0d

Authored by Sasha Levin on Feb 27, 2013

I'm not sure why, but the hlist for-each-entry iterators were conceived
differently from the list ones. The list iterator takes three arguments:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
 were modified to use 'obj->member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    <+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
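
The three-argument iterator that b67bfe0d introduces can be imitated in user space. The sketch below is a simplified assumption-laden version, not the code from linux/list.h: the node here has only a `next` pointer, the macros evaluate their arguments more than once, and `struct item` and `sum_items()` are hypothetical. It shows how the entry pointer alone can serve as the loop cursor.

```c
#include <stddef.h>

struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* NULL-safe container_of, so the loop can stop on a NULL node. */
#define hlist_entry_safe(ptr, type, member) \
    ((ptr) ? container_of(ptr, type, member) : NULL)

/* Three-argument form: the entry pointer itself is the cursor. */
#define hlist_for_each_entry(pos, head, member)                              \
    for (pos = hlist_entry_safe((head)->first, __typeof__(*(pos)), member);  \
         pos;                                                                \
         pos = hlist_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

struct item {
    int value;
    struct hlist_node node;
};

/* Walk the list without declaring a separate hlist_node cursor. */
int sum_items(struct hlist_head *head)
{
    struct item *it;
    int sum = 0;

    hlist_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```

The call site now reads just like a list_for_each_entry() loop, which is the symmetry the commit message asks for. `__typeof__` is a GCC/Clang extension, so the sketch assumes one of those compilers.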

02 Nov, 2012 1 commit

dlm: remove unused variable in *dlm_lowcomms_get_buffer() · eeee2b5f

Authored by Wei Yongjun on Oct 18, 2012

The variable 'users' is initialized but never used afterwards,
so remove it.

The dpatch engine was used to auto-generate this patch.
(https://github.com/weiyj/dpatch)
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David Teigland <teigland@redhat.com>

13 Aug, 2012 1 commit

dlm: cleanup send_to_sock routine · 9c5bef58

Authored by Ying Xue on Aug 13, 2012

Remove unnecessary code from the send_to_sock routine.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David Teigland <teigland@redhat.com>

10 Aug, 2012 2 commits

dlm: convert add_sock routine return value type to void · 4dd40f0c

Authored by Ying Xue on Aug 10, 2012

Since add_sock() always returns a success code (0), its return
type should be changed from int to void.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David Teigland <teigland@redhat.com>

dlm: remove redundant variable assignments · b4c798cf

Authored by Xue Ying on Aug 10, 2012

Once tcp_create_listen_sock() returns successfully, we
immediately invoke add_sock(). In add_sock(), the 'con'
variable is assigned to 'sk_user_data', and 'sock' is
also assigned to 'con->sock'. So it's unnecessary to do the same thing
in tcp_create_listen_sock().
Signed-off-by: Xue Ying <ying.xue@windriver.com>
Signed-off-by: David Teigland <teigland@redhat.com>

09 Aug, 2012 1 commit

dlm: fix deadlock between dlm_send and dlm_controld · 36b71a8b

Authored by David Teigland on Jul 26, 2012

A deadlock sometimes occurs between dlm_controld closing
a lowcomms connection through configfs and dlm_send looking
up the address for a new connection in configfs.

dlm_controld does a configfs rmdir which calls
dlm_lowcomms_close which waits for dlm_send to
cancel work on the workqueues.

The dlm_send workqueue thread has called
tcp_connect_to_sock which calls dlm_nodeid_to_addr
which does a configfs lookup and blocks on a lock
held by dlm_controld in the rmdir path.

The solution here is to save the node addresses within
the lowcomms code so that the lowcomms workqueue does
not need to step through configfs to get a node address.

dlm_controld:
wait_for_completion+0x1d/0x20
__cancel_work_timer+0x1b3/0x1e0
cancel_work_sync+0x10/0x20
dlm_lowcomms_close+0x4c/0xb0 [dlm]
drop_comm+0x22/0x60 [dlm]
client_drop_item+0x26/0x50 [configfs]
configfs_rmdir+0x180/0x230 [configfs]
vfs_rmdir+0xbd/0xf0
do_rmdir+0x103/0x120
sys_rmdir+0x16/0x20

dlm_send:
mutex_lock+0x2b/0x50
get_comm+0x34/0x140 [dlm]
dlm_nodeid_to_addr+0x18/0xd0 [dlm]
tcp_connect_to_sock+0xf4/0x2d0 [dlm]
process_send_sockets+0x1d2/0x260 [dlm]
worker_thread+0x170/0x2a0
Signed-off-by: David Teigland <teigland@redhat.com>

27 Apr, 2012 1 commit

dlm: prevent connections during shutdown · 513ef596

Authored by David Teigland on Mar 30, 2012

During lowcomms shutdown, a new connection could possibly
be created, and attempt to use a workqueue that's been
destroyed.  Similarly, during startup, a new connection
could attempt to use a workqueue that's not been set up
yet.  Add a global variable to indicate when new connections
are allowed.

Based on patch by: Christine Caulfield <ccaulfie@redhat.com>
Reported-by: dann frazier <dann.frazier@canonical.com>
Reviewed-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: David Teigland <teigland@redhat.com>
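
The gating idea in 513ef596 can be sketched as a small user-space model. Everything named here is hypothetical (`allow_new_conn`, the fixed-size `pool`, `comms_start()`, `comms_stop()`, `new_conn()`), and the model is single-threaded; real code would also need whatever synchronization the surrounding subsystem requires.

```c
#include <stdbool.h>
#include <stddef.h>

static bool allow_new_conn;   /* hypothetical name for the global gate */

struct conn { int nodeid; };
static struct conn pool[4];
static size_t used;

void comms_start(void) { allow_new_conn = true; }   /* workqueues are ready */
void comms_stop(void)  { allow_new_conn = false; }  /* set before teardown */

/* Hand out a connection only while the gate is open. */
struct conn *new_conn(int nodeid)
{
    if (!allow_new_conn || used == sizeof(pool) / sizeof(pool[0]))
        return NULL;
    pool[used].nodeid = nodeid;
    return &pool[used++];
}
```

Because the gate starts closed and is closed again before shutdown, a connection can never be created while its workqueue does not exist.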

21 Mar, 2012 1 commit

dlm: last element of dlm_local_addr[] never used · 1b189b88

Authored by David Teigland on Mar 21, 2012

The last element of dlm_local_addr[DLM_MAX_ADDR_COUNT]
was not used because the loop ended at COUNT - 1.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Teigland <teigland@redhat.com>
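
The off-by-one in 1b189b88 is easy to reproduce in isolation. In the sketch below the array size of 3 and the helper names `fill_buggy()` and `fill_fixed()` are assumptions made for illustration; only the constant name DLM_MAX_ADDR_COUNT comes from the commit message.

```c
#define DLM_MAX_ADDR_COUNT 3   /* value chosen for the sketch only */

/* Buggy bound: stops at COUNT - 1, so the last slot is never filled. */
int fill_buggy(int *slots, int available)
{
    int i;

    for (i = 0; i < DLM_MAX_ADDR_COUNT - 1 && i < available; i++)
        slots[i] = 1;
    return i;   /* number of slots filled */
}

/* Fixed bound: every element of the array can be used. */
int fill_fixed(int *slots, int available)
{
    int i;

    for (i = 0; i < DLM_MAX_ADDR_COUNT && i < available; i++)
        slots[i] = 1;
    return i;
}
```

With three addresses available, the buggy loop fills two slots and the fixed loop fills all three.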

09 Mar, 2012 1 commit

dlm: Do not allocate a fd for peeloff · 2f2d76cc

Authored by Benjamin Poirier on Mar 08, 2012

Avoids allocating a fd that a) propagates to every kernel thread and
usermodehelper, and b) is not properly released.

References: http://article.gmane.org/gmane.linux.network.drbd/22529
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Nov, 2011 1 commit

net: remove ipv6_addr_copy() · 4e3fd7a0

Authored by Alexey Dobriyan on Nov 21, 2011

C assignment can handle struct in6_addr copying.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
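
The claim in 4e3fd7a0 holds for any C struct, and it can be checked in user space with the standard `struct in6_addr`. The helper names `copy_addr()` and `assignment_copies_address()` below are hypothetical and exist only for this illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Copy an IPv6 address with plain struct assignment. */
void copy_addr(struct in6_addr *dst, const struct in6_addr *src)
{
    *dst = *src;
}

/* Returns 1 if assignment produced an identical 16-byte address,
 * 0 if the text could not be parsed or the copy differs. */
int assignment_copies_address(const char *text)
{
    struct in6_addr a, b;

    if (inet_pton(AF_INET6, text, &a) != 1)
        return 0;
    memset(&b, 0, sizeof(b));
    copy_addr(&b, &a);
    return memcmp(&a, &b, sizeof(a)) == 0;
}
```

Since assignment copies all 16 bytes, a dedicated copy helper adds nothing beyond what the language already provides.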

07 Jul, 2011 1 commit

dlm: dump address of unknown node · bcaadf5c

Authored by Masatake YAMATO on Jul 04, 2011

When the dlm fails to make a network connection to another
node, include the address of the node in the error message.
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>

31 Mar, 2011 1 commit

Fix common misspellings · 25985edc

Authored by Lucas De Marchi on Mar 30, 2011

Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>

11 Mar, 2011 1 commit

dlm: use alloc_workqueue function · e43f055a

Authored by David Teigland on Mar 10, 2011

Replaces deprecated create_singlethread_workqueue().
Signed-off-by: David Teigland <teigland@redhat.com>

12 Feb, 2011 1 commit

dlm: use single thread workqueues · 6b155c8f

Authored by David Teigland on Feb 11, 2011

The recent commit to use cmwq for send and recv threads (dcce240e)
introduced problems, apparently due to multiple workqueue threads.
Single threads make the problems go away, so return to that until we
fully understand the concurrency issues with multiple threads.
Signed-off-by: David Teigland <teigland@redhat.com>

14 Dec, 2010 1 commit

dlm: sanitize work_start() in lowcomms.c · b9d41052

Authored by Namhyung Kim on Dec 13, 2010

create_workqueue() returns NULL on failure rather than an ERR_PTR().
Fix the error checking and remove the unnecessary variable 'error'.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: David Teigland <teigland@redhat.com>
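
Why the wrong kind of check goes unnoticed can be shown with a user-space imitation of the kernel's error-pointer convention. The helpers `is_err()`, `failed_wrong()` and `failed_right()` are hypothetical names for this sketch; the only assumption carried over from the kernel is that error pointers occupy the top 4095 values of the address space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Imitation of IS_ERR(): true only for pointers whose value lies in
 * the last MAX_ERRNO values of the address range. */
bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Wrong check for an allocator that reports failure with NULL:
 * NULL is not an error pointer, so the failure is missed. */
bool failed_wrong(const void *wq)
{
    return is_err(wq);
}

/* Correct check for a NULL-returning allocator. */
bool failed_right(const void *wq)
{
    return wq == NULL;
}
```

A NULL result passes the error-pointer test as if it were a valid workqueue, which is the situation the commit corrects.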

13 Nov, 2010 3 commits

dlm: reduce cond_resched during send · f92c8dd7

Authored by Bob Peterson on Nov 12, 2010

Calling cond_resched() after every send can unnecessarily
degrade performance.  Go back to an old method of scheduling
after 25 messages.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
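
The batching described in f92c8dd7 can be sketched in user space. This is an illustration under assumptions: the constant name `MAX_SEND_MSG_COUNT` and the function `send_all()` are hypothetical, and `sched_yield()` stands in for the kernel's cond_resched(); only the batch size of 25 comes from the commit message.

```c
#include <sched.h>

#define MAX_SEND_MSG_COUNT 25   /* batch size named in the commit text */

/* Sends `total` messages, yielding after every full batch.
 * Returns how many times the CPU was yielded. */
int send_all(int total)
{
    int count = 0, yields = 0;

    for (int i = 0; i < total; i++) {
        /* ... the actual send would happen here ... */
        if (++count >= MAX_SEND_MSG_COUNT) {
            sched_yield();      /* user-space stand-in for cond_resched() */
            count = 0;
            yields++;
        }
    }
    return yields;
}
```

Sending 100 messages yields four times in this model instead of one hundred, which is the overhead reduction the commit is after.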

dlm: use TCP_NODELAY · cb2d45da

Authored by David Teigland on Nov 12, 2010

Nagling doesn't help and can sometimes hurt dlm comms.
Signed-off-by: David Teigland <teigland@redhat.com>
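
For reference, this is what disabling Nagle's algorithm looks like from user space. It is an analogue only: the helper name `disable_nagle()` is hypothetical, and the kernel patch (not reproduced in this log) sets the option on an in-kernel socket rather than on a file descriptor.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn off Nagle's algorithm so small writes are sent immediately
 * instead of being held back to be coalesced with later data. */
int disable_nagle(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

Lock traffic consists of small, latency-sensitive messages, so delaying them to build larger segments works against the workload.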

dlm: Use cmwq for send and receive workqueues · dcce240e

Authored by Steven Whitehouse on Nov 12, 2010

So far as I can tell, there is no reason to use a single-threaded
send workqueue for dlm, since it may need to send to several sockets
concurrently. Both workqueues are set to WQ_MEM_RECLAIM to avoid
any possible deadlocks, WQ_HIGHPRI since locking traffic is highly
latency sensitive (and to avoid a priority inversion wrt GFS2's
glock_workqueue), and WQ_FREEZABLE just in case someone needs to freeze
them in the future (even though with the current cluster infrastructure
it doesn't make sense, as the node will most likely end up ejected from
the cluster).
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: David Teigland <teigland@redhat.com>

12 Nov, 2010 1 commit

dlm: Handle application limited situations properly. · b36930dd

Authored by David Miller on Nov 10, 2010

In the normal regime where an application uses non-blocking I/O
writes on a socket, they will handle -EAGAIN and use poll() to
wait for send space.

They don't actually sleep on the socket I/O write.

But kernel level RPC layers that do socket I/O operations directly
and key off of -EAGAIN on the write() to "try again later" don't
use poll(), they instead have their own sleeping mechanism and
rely upon ->sk_write_space() to trigger the wakeup.

So they do effectively sleep on the write(), but this mechanism
alone does not let the socket layers know what's going on.

Therefore they must emulate what would have happened, otherwise
TCP cannot possibly see that the connection is application window
size limited.

Handle this, therefore, like SUNRPC by setting SOCK_NOSPACE and
bumping ->sk_write_pending as needed when we hit the send buffer
limits.

This should make TCP send buffer size auto-tuning and the
->sk_write_space() callback invocations actually happen.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: David Teigland <teigland@redhat.com>
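
The bookkeeping this commit describes can be modelled in user space. The sketch is a loose illustration with hypothetical names throughout (`struct model_sock`, `note_send_result()`, `on_write_space()`, and the `app_limited` marker); it is not the lowcomms code. It shows a sender that, on -EAGAIN, records once that it is waiting for buffer space, and undoes that record when the write-space callback fires.

```c
#include <errno.h>
#include <stdbool.h>

struct model_sock {
    bool nospace;         /* stands in for the SOCK_NOSPACE flag */
    int  write_pending;   /* stands in for the socket's pending-writer count */
    bool app_limited;     /* hypothetical marker: this sender is waiting */
};

/* Called with the result of a non-blocking send. */
void note_send_result(struct model_sock *s, int ret)
{
    if (ret == -EAGAIN && !s->app_limited) {
        s->app_limited = true;   /* count this sender only once */
        s->nospace = true;
        s->write_pending++;
    }
}

/* Called from the write-space callback when buffer space returns.
 * Clearing `nospace` is left to the socket layer in this model. */
void on_write_space(struct model_sock *s)
{
    if (s->app_limited) {
        s->app_limited = false;
        s->write_pending--;
    }
}
```

The marker keeps repeated -EAGAIN results from inflating the count, so each wait is matched by exactly one decrement.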
