- 29 8月, 2016 2 次提交
-
-
由 Tom Herbert 提交于
kcm and strparser need to work with any type of stream socket not just TCP. Eliminate references to TCP and call generic proto_ops functions of read_sock and peek_len. Also in strp_init check if the socket support the proto_ops read_sock and peek_len. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
In inet_stream_ops we set read_sock to tcp_read_sock and peek_len to tcp_peek_len (which is just a stub function that calls tcp_inq). Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 27 8月, 2016 11 次提交
-
-
由 Richard Alpe 提交于
When using replicast a UDP bearer can have an arbitrary amount of remote ip addresses associated with it. This means we cannot simply add all remote ip addresses to an existing bearer data message as it might fill the message, leaving us with a truncated message that we can't safely resume. To handle this we introduce the new netlink command TIPC_NL_UDP_GET_REMOTEIP. This command is intended to be called when the bearer data message has the TIPC_NLA_UDP_MULTI_REMOTEIP flag set, indicating there are more than one remote ip (replicast). Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Add UDP bearer options to netlink bearer get message. This is used by the tipc user space tool to display UDP options. The UDP bearer information is passed using either a sockaddr_in or sockaddr_in6 structs. This means the user space receiver should intermediately store the retrieved data in a large enough struct (sockaddr_strage) before casting to the proper IP version type. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Automatically learn UDP remote IP addresses of communicating peers by looking at the source IP address of incoming TIPC link configuration messages (neighbor discovery). This makes configuration slightly easier and removes the problematic scenario where a node receives directly addressed neighbor discovery messages sent using replicast which the node cannot "reply" to using mutlicast, leaving the link FSM in a limbo state. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
This patch introduces UDP replicast. A concept where we emulate multicast by sending multiple unicast messages to configured peers. The purpose of replicast is mainly to be able to use TIPC in cloud environments where IP multicast is disabled. Using replicas to unicast multicast messages is costly as we have to copy each skb and send the copies individually. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Add a function to check if a tipc UDP media address is a multicast address or not. This is a purely cosmetic change. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Split the UDP send function into two. One callback that prepares the skb and one transmit function that sends the skb. This will come in handy in later patches, when we introduce UDP replicast. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Richard Alpe 提交于
Split the UDP netlink parse function so that it only parses one netlink attribute at the time. This makes the parse function more generic and allow future UDP API functions to use it for parsing. Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com> Reviewed-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
switchdev_port_fwd_mark_set() is used to set the 'offload_fwd_mark' of port netdevs so that packets being flooded by the device won't be flooded twice. It works by assigning a unique identifier (the ifindex of the first bridge port) to bridge ports sharing the same parent ID. This prevents packets from being flooded twice by the same switch, but will flood packets through bridge ports belonging to a different switch. This method is problematic when stacked devices are taken into account, such as VLANs. In such cases, a physical port netdev can have upper devices being members in two different bridges, thus requiring two different 'offload_fwd_mark's to be configured on the port netdev, which is impossible. The main problem is that packet and netdev marking is performed at the physical netdev level, whereas flooding occurs between bridge ports, which are not necessarily port netdevs. Instead, packet and netdev marking should really be done in the bridge driver with the switch driver only telling it which packets it already forwarded. The bridge driver will mark such packets using the mark assigned to the ingress bridge port and will prevent the packet from being forwarded through any bridge port sharing the same mark (i.e. having the same parent ID). Remove the current switchdev 'offload_fwd_mark' implementation and instead implement the proposed method. In addition, make rocker - the sole user of the mark - use the proposed method. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ido Schimmel 提交于
switchdev_port_same_parent_id() currently expects port netdevs, but we need it to support stacked devices in the next patch, so drop the NO_RECURSE flag. Signed-off-by: NIdo Schimmel <idosch@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
Currently in process_backlog(), the process_queue dequeuing is performed with local IRQ disabled, to protect against flush_backlog(), which runs in hard IRQ context. This patch moves the flush operation to a work queue and runs the callback with bottom half disabled to protect the process_queue against dequeuing. Since process_queue is now always manipulated in bottom half context, the irq disable/enable pair around the dequeue operation are removed. To keep the flush time as low as possible, the flush works are scheduled on all online cpu simultaneously, using the high priority work-queue and statically allocated, per cpu, work structs. Overall this change increases the time required to destroy a device to improve slightly the packets reinjection performances. Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Nikolay Aleksandrov 提交于
When I added support to export the vlan entry flags via xstats I forgot to add support for the pvid since it is manually matched, so check if the entry matches the vlan_group's pvid and set the flag appropriately. Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 8月, 2016 2 次提交
-
-
由 Eric Dumazet 提交于
Adds SNMP counter for drops caused by MD5 mismatches. The current syslog might help, but a counter is more precise and helps monitoring. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
TCP MD5 mismatches do increment sk_drops counter in all states but SYN_RECV. This is very unlikely to happen in the real world, but worth adding to help diagnostics. We increase the parent (listener) sk_drops. Signed-off-by: NEric Dumazet <edumazet@google.com> Acked-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 8月, 2016 3 次提交
-
-
由 Lorenzo Colitti 提交于
This allows a privileged process to filter by socket mark when dumping sockets via INET_DIAG_BY_FAMILY. This is useful on systems that use mark-based routing such as Android. The ability to filter socket marks requires CAP_NET_ADMIN, which is consistent with other privileged operations allowed by the SOCK_DIAG interface such as the ability to destroy sockets and the ability to inspect BPF filters attached to packet sockets. Tested: https://android-review.googlesource.com/261350Signed-off-by: NLorenzo Colitti <lorenzo@google.com> Acked-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Lorenzo Colitti 提交于
This simplifies the code a bit and also allows inet_diag_bc_audit to send to userspace an error that isn't EINVAL. Signed-off-by: NLorenzo Colitti <lorenzo@google.com> Acked-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Vivien Didelot 提交于
Now that the dsa_switch_driver structure contains only function pointers as it is supposed to, rename it to the more appropriate dsa_switch_ops, uniformly to any other operations structure in the kernel. No functional changes here, basically just the result of something like: s/dsa_switch_driver *drv/dsa_switch_ops *ops/g However keep the {un,}register_switch_driver functions and their dsa_switch_drivers list as is, since they represent the -- likely to be deprecated soon -- legacy DSA registration framework. In the meantime, also fix the following checks from checkpatch.pl to make it happy with this patch: CHECK: Comparison to NULL could be written "!ops" #403: FILE: net/dsa/dsa.c:470: + if (ops == NULL) { CHECK: Comparison to NULL could be written "ds->ops->get_strings" #773: FILE: net/dsa/slave.c:697: + if (ds->ops->get_strings != NULL) CHECK: Comparison to NULL could be written "ds->ops->get_ethtool_stats" #824: FILE: net/dsa/slave.c:785: + if (ds->ops->get_ethtool_stats != NULL) CHECK: Comparison to NULL could be written "ds->ops->get_sset_count" #835: FILE: net/dsa/slave.c:798: + if (ds->ops->get_sset_count != NULL) total: 0 errors, 0 warnings, 4 checks, 784 lines checked Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 8月, 2016 15 次提交
-
-
由 David Howells 提交于
Improve the management and caching of client rxrpc connection objects. From this point, client connections will be managed separately from service connections because AF_RXRPC controls the creation and re-use of client connections but doesn't have that luxury with service connections. Further, there will be limits on the numbers of client connections that may be live on a machine. No direct restriction will be placed on the number of client calls, excepting that each client connection can support a maximum of four concurrent calls. Note that, for a number of reasons, we don't want to simply discard a client connection as soon as the last call is apparently finished: (1) Security is negotiated per-connection and the context is then shared between all calls on that connection. The context can be negotiated again if the connection lapses, but that involves holding up calls whilst at least two packets are exchanged and various crypto bits are performed - so we'd ideally like to cache it for a little while at least. (2) If a packet goes astray, we will need to retransmit a final ACK or ABORT packet. To make this work, we need to keep around the connection details for a little while. (3) The locally held structures represent some amount of setup time, to be weighed against their occupation of memory when idle. To this end, the client connection cache is managed by a state machine on each connection. There are five states: (1) INACTIVE - The connection is not held in any list and may not have been exposed to the world. If it has been previously exposed, it was discarded from the idle list after expiring. (2) WAITING - The connection is waiting for the number of client conns to drop below the maximum capacity. Calls may be in progress upon it from when it was active and got culled. The connection is on the rxrpc_waiting_client_conns list which is kept in to-be-granted order. Culled conns with waiters go to the back of the queue just like new conns. (3) ACTIVE - The connection has at least one call in progress upon it, it may freely grant available channels to new calls and calls may be waiting on it for channels to become available. The connection is on the rxrpc_active_client_conns list which is kept in activation order for culling purposes. (4) CULLED - The connection got summarily culled to try and free up capacity. Calls currently in progress on the connection are allowed to continue, but new calls will have to wait. There can be no waiters in this state - the conn would have to go to the WAITING state instead. (5) IDLE - The connection has no calls in progress upon it and must have been exposed to the world (ie. the EXPOSED flag must be set). When it expires, the EXPOSED flag is cleared and the connection transitions to the INACTIVE state. The connection is on the rxrpc_idle_client_conns list which is kept in order of how soon they'll expire. A connection in the ACTIVE or CULLED state must have at least one active call upon it; if in the WAITING state it may have active calls upon it; other states may not have active calls. As long as a connection remains active and doesn't get culled, it may continue to process calls - even if there are connections on the wait queue. This simplifies things a bit and reduces the amount of checking we need do. There are a couple flags of relevance to the cache: (1) EXPOSED - The connection ID got exposed to the world. If this flag is set, an extra ref is added to the connection preventing it from being reaped when it has no calls outstanding. This flag is cleared and the ref dropped when a conn is discarded from the idle list. (2) DONT_REUSE - The connection should be discarded as soon as possible and should not be reused. This commit also provides a number of new settings: (*) /proc/net/rxrpc/max_client_conns The maximum number of live client connections. Above this number, new connections get added to the wait list and must wait for an active conn to be culled. Culled connections can be reused, but they will go to the back of the wait list and have to wait. (*) /proc/net/rxrpc/reap_client_conns If the number of desired connections exceeds the maximum above, the active connection list will be culled until there are only this many left in it. (*) /proc/net/rxrpc/idle_conn_expiry The normal expiry time for a client connection, provided there are fewer than reap_client_conns of them around. (*) /proc/net/rxrpc/idle_conn_fast_expiry The expedited expiry time, used when there are more than reap_client_conns of them around. Note that I combined the Tx wait queue with the channel grant wait queue to save space as only one of these should be in use at once. Note also that, for the moment, the service connection cache still uses the old connection management code. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
The main connection list is used for two independent purposes: primarily it is used to find connections to reap and secondarily it is used to list connections in procfs. Split the procfs list out from the reap list. This allows us to stop using the reap list for client connections when they acquire a separate management strategy from service collections. The client connections will not be on a management single list, and sometimes won't be on a management list at all. This doesn't leave them floating, however, as they will also be on an rb-tree rooted on the socket so that the socket can find them to dispatch calls. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Make /proc/net/rxrpc_calls safer by stashing a copy of the peer pointer in the rxrpc_call struct and checking in the show routine that the peer pointer, the socket pointer and the local pointer obtained from the socket pointer aren't NULL before we use them. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
If a duplicate packet comes in for a call that has just completed on a connection's channel then there will be an oops in the data_ready handler because it tries to examine the connection struct via a call struct (which we don't have - the pointer is unset). Since the connection struct pointer is available to us, go direct instead. Also, the ACK packet to be retransmitted needs three octets of padding between the soft ack list and the ackinfo. Fixes: 18bfeba5 ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Eric Dumazet 提交于
We no longer use this handler, we can delete it. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Now RCU lookups of IPv6 TCP sockets no longer dereference pinet6, we do not need tcp_v6_clear_sk() anymore. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Since we no longer use SLAB_DESTROY_BY_RCU for UDP, we do not need sk_prot_clear_portaddr_nulls() helper. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Now RCU lookups of ipv6 udp sockets no longer dereference pinet6 field, we can get rid of udp_v6_clear_sk() helper. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
This implements SOCK_DESTROY for UDP sockets similar to what was done for TCP with commit c1e64e29 ("net: diag: Support destroying TCP sockets.") A process with a UDP socket targeted for destroy is awakened and recvmsg fails with ECONNABORTED. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wei Yongjun 提交于
Use kfree_skb() instead of kfree() to free sk_buff. Fixes: 0d051bf9 ("tipc: make bearer packet filtering generic") Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Phil Sutter 提交于
Since the features bit field has bits for internal only use as well, it may happen that the kernel exports RTAX_FEATURES attribute with zero value which is pointless. Fix this by making sure the attribute is added only if the exported value is non-zero. Signed-off-by: NPhil Sutter <phil@nwl.cc> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yuchung Cheng 提交于
TFO_SERVER_WO_SOCKOPT2 was intended for debugging purposes during Fast Open development. Remove this config option and also update/clean-up the documentation of the Fast Open sysctl. Reported-by: NPiotr Jurkiewicz <piotr.jerzy.jurkiewicz@gmail.com> Signed-off-by: NYuchung Cheng <ycheng@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NNeal Cardwell <ncardwell@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gao Feng 提交于
Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff, 0x03, and 2 separately. Signed-off-by: NGao Feng <fgao@ikuai8.com> Acked-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
Lock the lower socket in kcm_unattach. Release during call to strp_done since that function cancels the RX timers and work queue with sync. Also added some status information in psock reporting. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Tom Herbert 提交于
When the upper layer unpauses a stream parser connection we need to queue rx_work to make sure no events are missed. Signed-off-by: NTom Herbert <tom@herbertland.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 8月, 2016 7 次提交
-
-
由 David Howells 提交于
Perform terminal call ACK/ABORT retransmission in the connection processor rather than in the call processor. With this change, once last_call is set, no more incoming packets will be routed to the corresponding call or any earlier calls on that channel (call IDs must only increase on a channel on a connection). Further, if a packet's callNumber is before the last_call ID or a packet is aimed at successfully completed service call then that packet is discarded and ignored. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Calculate the serial number skew in the data_ready handler when a packet has been received and a connection looked up. The skew is cached in the sk_buff's priority field. The connection highest received serial number is updated at this time also. This can be done without locks or atomic instructions because, at this point, the code is serialised by the socket. This generates more accurate skew data because if the packet is offloaded to a work queue before this is determined, more packets may come in, bumping the highest serial number and thereby increasing the apparent skew. This also removes some unnecessary atomic ops. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Set the connection expiry time when a connection becomes idle rather than doing this in rxrpc_put_connection(). This makes the put path more efficient (it is likely to be called occasionally whilst a connection has outstanding calls because active workqueue items needs to be given a ref). The time is also preset in the connection allocator in case the connection never gets used. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Use a tracepoint to log various skb accounting points to help in debugging refcounting errors. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Drop the channel number (channel) field from the rxrpc_call struct to reduce the size of the call struct. The field is redundant: if the call is attached to a connection, the channel can be obtained from there by AND'ing with RXRPC_CHANNELMASK. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
When clearing a socket, we should clear the securing-in-progress list first, then the accept queue and last the main call tree because that's the order in which a call progresses. Not that a call should move from the accept queue to the main tree whilst we're shutting down a socket, but it a call could possibly move from sequreq to acceptq whilst we're clearing up. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Do a little tidying of the rxrpc_call struct: (1) in_clientflag is no longer compared against the value that's in the packet, so keeping it in this form isn't necessary. Use a flag in flags instead and provide a pair of wrapper functions. (2) We don't read the epoch value, so that can go. (3) Move what remains of the data that were used for hashing up in the struct to be with the channel number. (4) Get rid of the local pointer. We can get at this via the socket struct and we only use this in the procfs viewer. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-