- 09 3月, 2021 2 次提交
-
-
由 Alexander Aring 提交于
This patch sets the CF_CONNECTED bit when dlm accepts a connection from another node. If we don't set this bit, next time if the connection socket gets writable it will assume an event that the connection is successfully connected. However that is only the case when the connection did a connect. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch fixes an deadlock issue when dlm_lowcomms_close() is called. When dlm_lowcomms_close() is called the clusters_root.subsys.su_mutex is held to remove configfs items. At this time we flushing (e.g. cancel_work_sync()) the workers of send and recv workqueue. Due the fact that we accessing configfs items (mark values), these workers will lock clusters_root.subsys.su_mutex as well which are already hold by dlm_lowcomms_close() and ends in a deadlock situation. [67170.703046] ====================================================== [67170.703965] WARNING: possible circular locking dependency detected [67170.704758] 5.11.0-rc4+ #22 Tainted: G W [67170.705433] ------------------------------------------------------ [67170.706228] dlm_controld/280 is trying to acquire lock: [67170.706915] ffff9f2f475a6948 ((wq_completion)dlm_recv){+.+.}-{0:0}, at: __flush_work+0x203/0x4c0 [67170.708026] but task is already holding lock: [67170.708758] ffffffffa132f878 (&clusters_root.subsys.su_mutex){+.+.}-{3:3}, at: configfs_rmdir+0x29b/0x310 [67170.710016] which lock already depends on the new lock. The new behaviour adds the mark value to the node address configuration which doesn't require to held the clusters_root.subsys.su_mutex by accessing mark values in a separate datastructure. However the mark values can be set now only after a node address was set which is the case when the user is using dlm_controld. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 11 11月, 2020 12 次提交
-
-
由 Alexander Aring 提交于
This patch checks if we add twice the same address to a per node address array. This should never be the case and we report -EEXIST to the user space. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch just constify some function parameter which should be have a read access only. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch will use the runtime array size dlm_local_count variable to check the actual size of the dlm_local_addr array. There exists currently a cleanup bug, because the tcp_listen_for_all() functionality might check on a dangled pointer. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch introduces a own connection structure for the listen socket handling instead of handling the listen socket as normal connection structure in the connection hash. We can remove some nodeid equals zero validation checks, because this nodeid should not exists anymore inside the node hash. This patch also removes the sock mutex in accept_from_sock() function because this function can't occur in another parallel context if it's scheduled on only one workqueue. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch refactors sctp_bind_addrs() to work with a socket parameter instead of a connection parameter. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch move the assignment for the shutdown action callback to the node creation functionality. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch moves the assignment for the connect callback to the node creation instead of assign some dummy functionality. The assignment which connect functionality will be used will be detected according to the configfs setting. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch will move the connection structure initialization into an own function. This avoids cases to update the othercon initialization. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
The manpage of connect shows that in non blocked mode a writeability indicates successful connection event. This patch is handling this event inside the writeability callback. In case of SCTP we use blocking connect functionality which indicates a successful connect when the function returns with a successful return value. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch ensures we also flush the othercon writequeue when a lowcomms close occurs. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch adds an error handling to the get buffer functionality if the user is requesting a buffer length which is more than possible of the internal buffer allocator. This should never happen because specific handling decided by compile time, but will warn if somebody forget about to handle this limitation right. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch will use call_srcu() instead of call_rcu() because the related datastructure resource are handled under srcu context. I assume the current code is fine anyway since free_conn() must be called when the related resource are not in use otherwise. However it will correct the overall handling in a srcu context. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 01 10月, 2020 1 次提交
-
-
由 Alexander Aring 提交于
This patch fixes a race in nodeid2con in cases that we parallel running a lookup and both will create a connection structure for the same nodeid. It's a rare case to create a new connection structure to keep reader lockless we just do a lookup inside the protection area again and drop previous work if this race happens. Fixes: a47666eb ("fs: dlm: make connection hash lockless") Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 30 9月, 2020 3 次提交
-
-
由 Alexander Aring 提交于
This patch reworks the current receive handling of dlm. As I tried to change the send handling to fix reorder issues I took a look into the receive handling and simplified it, it works as the following: Each connection has a preallocated receive buffer with a minimum length of 4096. On receive, the upper layer protocol will process all dlm message until there is not enough data anymore. If there exists "leftover" data at the end of the receive buffer because the dlm message wasn't fully received it will be copied to the begin of the preallocated receive buffer. Next receive more data will be appended to the previous "leftover" data and processing will begin again. This will remove a lot of code of the current mechanism. Inside the processing functionality we will ensure with a memmove() that the dlm message should be memory aligned. To have a dlm message always started at the beginning of the buffer will reduce some amount of memmove() calls because src and dest pointers are the same. The cluster attribute "buffer_size" becomes a new meaning, it's now the size of application layer receive buffer size. If this is changed during runtime the receive buffer will be reallocated. It's important that the receive buffer size has at minimum the size of the maximum possible dlm message size otherwise the received message cannot be placed inside the receive buffer size. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch fixes to set per nodeid mark configuration for accepted sockets as well. Before this patch only the listen socket mark value was used for all accepted connections. This patch will ensure that the cluster mark attribute value will be always used for all sockets, if a per nodeid mark value is specified dlm will use this value for the specific node. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
During my experiments to make dlm robust against tcpkill application I was able to run sometimes in a circular lock dependency warning between clusters_root.subsys.su_mutex and con->sock_mutex. We don't need to held the sock_mutex when getting the mark value which held the clusters_root.subsys.su_mutex. This patch moves the specific handling just before the sock_mutex will be held. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 28 8月, 2020 6 次提交
-
-
由 Alexander Aring 提交于
This patch use free_con() functionality to free the listen connection if listen fails. It also fixes an issue that a freed resource is still part of the connection_hash as hlist_del() is not called in this case. The only difference is that free_con() handles othercon as well, but this is never been set for the listen connection. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch adds free of possible other writequeue entries in othercon member of struct connection. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch just move the free of struct connection member writequeue into the functionality when struct connection will be freed instead of doing two iterations. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch fixes the following memory detected by kmemleak and umount gfs2 filesystem which removed the last lockspace: unreferenced object 0xffff9264f4f48f00 (size 128): comm "mount", pid 425, jiffies 4294690253 (age 48.159s) hex dump (first 32 bytes): 02 00 52 48 c0 a8 7a fb 00 00 00 00 00 00 00 00 ..RH..z......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000067a34940>] kmemdup+0x18/0x40 [<00000000c935f9ab>] init_local+0x4c/0xa0 [<00000000bbd286ef>] dlm_lowcomms_start+0x28/0x160 [<00000000a86625cb>] dlm_new_lockspace+0x7e/0xb80 [<000000008df6cd63>] gdlm_mount+0x1cc/0x5de [<00000000b67df8c7>] gfs2_lm_mount.constprop.0+0x1a3/0x1d3 [<000000006642ac5e>] gfs2_fill_super+0x717/0xba9 [<00000000d3ab7118>] get_tree_bdev+0x17f/0x280 [<000000001975926e>] gfs2_get_tree+0x21/0x90 [<00000000561ce1c4>] vfs_get_tree+0x28/0xc0 [<000000007fecaf63>] path_mount+0x434/0xc00 [<00000000636b9594>] __x64_sys_mount+0xe3/0x120 [<00000000cc478a33>] do_syscall_64+0x33/0x40 [<00000000ce9ccf01>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
There are some problems with the connections_lock. During my experiements I saw sometimes circular dependencies with sock_lock. The reason here might be code parts which runs nodeid2con() before or after sock_lock is acquired. Another issue are missing locks in for_conn() iteration. Maybe this works fine because for_conn() is running in a context where connection_hash cannot be manipulated by others anymore. However this patch changes the connection_hash to be protected by sleepable rcu. The hotpath function __find_con() is implemented lockless as it is only a reader of connection_hash and this hopefully fixes the circular locking dependencies. The iteration for_conn() will still call some sleepable functionality, that's why we use sleepable rcu in this case. This patch removes the kmemcache functionality as I think I need to make some free() functionality via call_rcu(). However allocation time isn't here an issue. The dlm_allow_con will not be protected by a lock anymore as I think it's enough to just set and flush workqueues afterwards. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch moves the dlm workqueue dlm synchronization before shutdown handling. The patch just flushes all pending work before starting to shutdown the connection. At least for the send_workqeue we should flush the workqueue to make sure there is no new connection handling going on as dlm_allow_conn switch is turned to false before. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 06 8月, 2020 5 次提交
-
-
由 Alexander Aring 提交于
During my code inspection I saw there is no implementation of a graceful shutdown for tcp. This patch will introduce a graceful shutdown for tcp connections. The shutdown is implemented synchronized as dlm_lowcomms_stop() is called to end all dlm communication. After shutdown is done, a lot of flush and closing functionality will be called. However I don't see a problem with that. The waitqueue for synchronize the shutdown has a timeout of 10 seconds, if timeout a force close will be exectued. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch changes the handling of reconnects. At first we only close the connection related to the communication failure. If we get a new connection for an already existing connection we close the existing connection and take the new one. This patch improves significantly the stability of tcp connections while running "tcpkill -9 -i $IFACE port 21064" while generating a lot of dlm messages e.g. on a gfs2 mount with many files. My test setup shows that a deadlock is "more" unlikely. Before this patch I wasn't able to get not a deadlock after 5 seconds. After this patch my observation is that it's more likely to survive after 5 seconds and more, but still a deadlock occurs after certain time. My guess is that there are still "segments" inside the tcp writequeue or retransmit queue which get dropped when receiving a tcp reset [1]. Hard to reproduce because the right message need to be inside these queues, which might even be in the 5 first seconds with this patch. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv4/tcp_input.c?h=v5.8-rc6#n4122Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch doesn't close sockets when there is an invalid dlm message received. The connection will probably reconnect anyway so. To not close the connection will reduce the number of possible failtures. As we don't have a different strategy to react on such scenario just keep going the connection and ignore the message. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch adds support to set the skb mark value for the DLM tcp and sctp socket per peer. The mark value will be offered as per comm value of configfs. At creation time of the peer socket it will be set as socket option. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
由 Alexander Aring 提交于
This patch adds support to set the skb mark value for the DLM listen tcp and sctp sockets. The mark value will be offered as cluster configuration. At creation time of the listen socket it will be set as socket option. Signed-off-by: NAlexander Aring <aahringo@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 30 5月, 2020 2 次提交
-
-
由 Christoph Hellwig 提交于
The SCTP protocol allows to bind multiple address to a socket. That feature is currently only exposed as a socket option. Add a bind_add method struct proto that allows to bind additional addresses, and switch the dlm code to use the method instead of going through the socket option from kernel space. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
Add a helper to directly set the SCTP_NODELAY sockopt from kernel space without going through a fake uaccess. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 29 5月, 2020 5 次提交
-
-
由 Christoph Hellwig 提交于
Add a helper to directly set the TCP_NODELAY sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NSagi Grimberg <sagi@grimberg.me> Acked-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
Add a helper to directly set the SO_RCVBUFFORCE sockopt from kernel space without going through a fake uaccess. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
Add a helper to directly set the SO_KEEPALIVE sockopt from kernel space without going through a fake uaccess. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
Add a helper to directly set the SO_SNDTIMEO_NEW sockopt from kernel space without going through a fake uaccess. The interface is simplified to only pass the seconds value, as that is the only thing needed at the moment. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Christoph Hellwig 提交于
Add a helper to directly set the SO_REUSEADDR sockopt from kernel space without going through a fake uaccess. For this the iscsi target now has to formally depend on inet to avoid a mostly theoretical compile failure. For actual operation it already did depend on having ipv4 or ipv6 support. Signed-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NSagi Grimberg <sagi@grimberg.me> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 28 5月, 2020 1 次提交
-
-
由 Christoph Hellwig 提交于
The only difference between a few missing fixes applied to the SCTP one is that TCP uses ->getpeername to get the remote address, while SCTP uses kernel_getsockopt(.. SCTP_PRIMARY_ADDR). But given that getpeername is defined to return the primary address for sctp, there doesn't seem to be any reason for the different way of quering the peername, or all the code duplication. Signed-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 19 12月, 2019 1 次提交
-
-
由 Arnd Bergmann 提交于
Eliminate one more use of 'struct timeval' from the kernel so we can eventually remove the definition as well. The kernel supports the new format with a 64-bit time_t version of timeval here, so use that instead of the old timeval. Acked-by: NDeepa Dinamani <deepa.kernel@gmail.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de>
-
- 12 7月, 2019 1 次提交
-
-
由 David Windsor 提交于
If the DLM lowcomms stack is shut down before any DLM traffic can be generated, flush_workqueue() and destroy_workqueue() can be called on empty send and/or recv workqueues. Insert guard conditionals to only call flush_workqueue() and destroy_workqueue() on workqueues that are not NULL. Signed-off-by: NDavid Windsor <dwindsor@redhat.com> Signed-off-by: NDavid Teigland <teigland@redhat.com>
-
- 31 5月, 2019 1 次提交
-
-
由 Thomas Gleixner 提交于
Based on 1 normalized pattern(s): this copyrighted material is made available to anyone wishing to use modify copy or redistribute it subject to the terms and conditions of the gnu general public license v 2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 45 file(s). Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Reviewed-by: NRichard Fontana <rfontana@redhat.com> Reviewed-by: NAllison Randal <allison@lohutok.net> Reviewed-by: NSteve Winslow <swinslow@gmail.com> Reviewed-by: NAlexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528170027.342746075@linutronix.deSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
-