- 10 1月, 2017 24 次提交
-
-
由 Ursula Braun 提交于
copy data to kernel send buffer, and trigger RDMA write Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
send and receive CDC messages (via IB message send and CQE) Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
send and receive LLC messages CONFIRM_LINK (via IB message send and CQE) Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
Prepare the link for RDMA transport: Create a queue pair (QP) and move it into the state Ready-To-Receive (RTR). Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
The base containers for RDMA transport are work requests and completion queue entries processed through Infiniband verbs: * allocate and initialize these areas * map these areas to DMA * implement the basic communication consisting of work request posting and receival of completion queue events Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
* allocate data RMB memory for sending and receiving * size depends on the maximum socket send and receive buffers * allocated RMBs are kept during life time of the owning link group * map the allocated RMBs to DMA Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
* create smc_connection for SMC-sockets * determine suitable link group for a connection * create a new link group if necessary Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
* CLC (Connection Layer Control) handshake Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Thomas Richter 提交于
Connection creation with SMC-R starts through an internal TCP-connection. The Ethernet interface for this TCP-connection is not restricted to the Ethernet interface of a RoCE device. Any existing Ethernet interface belonging to the same physical net can be used, as long as there is a defined relation between the Ethernet interface and some RoCE devices. This relation is defined with the help of an identification string called "Physical Net ID" or short "pnet ID". Information about defined pnet IDs and their related Ethernet interfaces and RoCE devices is stored in the SMC-R pnet table. A pnet table entry consists of the identifying pnet ID and the associated network and IB device. This patch adds pnet table configuration support using the generic netlink message interface referring to network and IB device by their names. Commands exist to add, delete, and display pnet table entries, and to flush or display the entire pnet table. There are cross-checks to verify whether the ethernet interfaces or infiniband devices really exist in the system. If either device is not available, the pnet ID entry is not created. Loss of network devices and IB devices is also monitored; a pnet ID entry is removed when an associated network or IB device is removed. Signed-off-by: NThomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
* create a list of SMC IB-devices Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
* enable smc module loading and unloading * register new socket family * basic smc socket creation and deletion * use backing TCP socket to run CLC (Connection Layer Control) handshake of SMC protocol * Setup for infiniband traffic is implemented in follow-on patches. For now fallback to TCP socket is always used. Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: NUtz Bacher <utz.bacher@de.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ursula Braun 提交于
Direct call of tcp_set_keepalive() function from protocol-agnostic sock_setsockopt() function in net/core/sock.c violates network layering. And newly introduced protocol (SMC-R) will need its own keepalive function. Therefore, add "keepalive" function pointer to "struct proto", and call it from sock_setsockopt() via this pointer. Signed-off-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: NUtz Bacher <utz.bacher@de.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
It is possible to avoid the atomic operation in icmp{v6,}_xmit_lock, by checking the sysctl_icmp_msgs_per_sec ratelimit before these calls, as pointed out by Eric Dumazet, but the BH disabled state must be correct. The icmp_global_allow() call states it must be called with BH disabled. This protection was given by the calls icmp_xmit_lock and icmpv6_xmit_lock. Thus, split out local_bh_disable/enable from these functions and maintain it explicitly at callers. Suggested-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
This patch split the global and per (inet)peer ICMP-reply limiter code, and moves the global limit check to earlier in the packet processing path. Thus, avoid spending cycles on ICMP replies that gets limited/suppressed anyhow. The global ICMP rate limiter icmp_global_allow() is a good solution, it just happens too late in the process. The kernel goes through the full route lookup (return path) for the ICMP message, before taking the rate limit decision of not sending the ICMP reply. Details: The kernels global rate limiter for ICMP messages got added in commit 4cdf507d ("icmp: add a global rate limitation"). It is a token bucket limiter with a global lock. It brilliantly avoids locking congestion by only updating when 20ms (HZ/50) were elapsed. It can then avoids taking lock when credit is exhausted (when under pressure) and time constraint for refill is not yet meet. Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jesper Dangaard Brouer 提交于
This reverts commit 9a99d4a5 ("icmp: avoid allocating large struct on stack"), because struct icmp_bxm no really a large struct, and allocating and free of this small 112 bytes hurts performance. Fixes: 9a99d4a5 ("icmp: avoid allocating large struct on stack") Signed-off-by: NJesper Dangaard Brouer <brouer@redhat.com> Acked-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
Now that we have properly encapsulated and made drivers utilize exported functions, we can switch dsa_switch_ops to be a annotated with const. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Florian Fainelli 提交于
In preparation for making struct dsa_switch_ops const, encapsulate it within a dsa_switch_driver which has a list pointer and a pointer to dsa_switch_ops. This allows us to take the list_head pointer out of dsa_switch_ops, which is written to by {un,}register_switch_driver. Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Davide Caratti 提交于
modify act_csum to compute crc32c on IPv4/IPv6 packets having SCTP in their payload, and extend UAPI definitions accordingly. Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Davide Caratti 提交于
LIBCRC32C is needed to compute crc32c on SCTP packets. Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexandru Moise 提交于
This struct member is already initialized to zero upon root_ht's allocation via kzalloc(). Signed-off-by: NAlexandru Moise <00moses.alexander00@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason A. Donenfeld 提交于
SHA1 is slower and less secure than SipHash, and so replacing syncookie generation with SipHash makes natural sense. Some BSDs have been doing this for several years in fact. The speedup should be similar -- and even more impressive -- to the speedup from the sequence number fix in this series. Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jason A. Donenfeld 提交于
This gives a clear speed and security improvement. Siphash is both faster and is more solid crypto than the aging MD5. Rather than manually filling MD5 buffers, for IPv6, we simply create a layout by a simple anonymous struct, for which gcc generates rather efficient code. For IPv4, we pass the values directly to the short input convenience functions. 64-bit x86_64: [ 1.683628] secure_tcpv6_sequence_number_md5# cycles: 99563527 [ 1.717350] secure_tcp_sequence_number_md5# cycles: 92890502 [ 1.741968] secure_tcpv6_sequence_number_siphash# cycles: 67825362 [ 1.762048] secure_tcp_sequence_number_siphash# cycles: 67485526 32-bit x86: [ 1.600012] secure_tcpv6_sequence_number_md5# cycles: 103227892 [ 1.634219] secure_tcp_sequence_number_md5# cycles: 94732544 [ 1.669102] secure_tcpv6_sequence_number_siphash# cycles: 96299384 [ 1.700165] secure_tcp_sequence_number_siphash# cycles: 86015473 Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Miller <davem@davemloft.net> Cc: David Laight <David.Laight@aculab.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
Nothing about the route lookup requires bottom half to be disabled. Remove the local_bh_disable ... local_bh_enable around ip_route_input. This appears to be a vestige of days gone by as it has been there since the beginning of git time. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 yuan linyu 提交于
sock_init() call it but not check it's return value, so change it to void return and add an internal BUG_ON() check. Signed-off-by: Nyuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 1月, 2017 9 次提交
-
-
由 David Howells 提交于
Allow listen() with a backlog of 0 to be used to disable listening on an AF_RXRPC socket. This also releases any preallocation, thereby making it easier for a kernel service to account for all allocated call structures when shutting down the service. The socket cannot thereafter have listening reenabled, but must rather be closed and reopened. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Willem de Bruijn 提交于
The tc_from field fulfills two roles. It encodes whether a packet was redirected by an act_mirred device and, if so, whether act_mirred was called on ingress or egress. Split it into separate fields. The information is needed by the special IFB loop, where packets are taken out of the normal path by act_mirred, forwarded to IFB, then reinjected at their original location (ingress or egress) by IFB. The IFB device cannot use skb->tc_at_ingress, because that may have been overwritten as the packet travels from act_mirred to ifb_xmit, when it passes through tc_classify on the IFB egress path. Cache this value in skb->tc_from_ingress. That field is valid only if a packet arriving at ifb_xmit came from act_mirred. Other packets can be crafted to reach ifb_xmit. These must be dropped. Set tc_redirected on redirection and drop all packets that do not have this bit set. Both fields are set only on cloned skbs in tc actions, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Field tc_at is used only within tc actions to distinguish ingress from egress processing. A single bit is sufficient for this purpose. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Extract the remaining two fields from tc_verd and remove the __u16 completely. TC_AT and TC_FROM are converted to equivalent two-bit integer fields tc_at and tc_from. Where possible, use existing helper skb_at_tc_ingress when reading tc_at. Introduce helper skb_reset_tc to clear fields. Not documenting tc_from and tc_at, because they will be replaced with single bit fields in follow-on patches. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
Packets sent by the IFB device skip subsequent tc classification. A single bit governs this state. Move it out of tc_verd in anticipation of removing that __u16 completely. The new bitfield tc_skip_classify temporarily uses one bit of a hole, until tc_verd is removed completely in a follow-up patch. Remove the bit hole comment. It could be 2, 3, 4 or 5 bits long. With that many options, little value in documenting it. Introduce a helper function to deduplicate the logic in the two sites that check this bit. The field tc_skip_classify is set only in IFB on skbs cloned in act_mirred, so original packet sources do not have to clear the bit when reusing packets (notably, pktgen and octeon). Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Willem de Bruijn 提交于
This field is no longer kept in tc_verd. Remove it from the global definition of that struct. Signed-off-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 stephen hemminger 提交于
The network device operation for reading statistics is only called in one place, and it ignores the return value. Having a structure return value is potentially confusing because some future driver could incorrectly assume that the return value was used. Fix all drivers with ndo_get_stats64 to have a void function. Signed-off-by: NStephen Hemminger <sthemmin@microsoft.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
fl4 arg is not used; remove it. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
ipmr_get_route has 1 caller and the nowait arg is 0. Remove the arg and simplify ipmr_get_route accordingly. Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 08 1月, 2017 1 次提交
-
-
由 Vivien Didelot 提交于
Isolate the HWMON support in DSA in its own file. Currently only the legacy DSA code is concerned. Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: NAndrew Lunn <andrew@lunn.ch> Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 07 1月, 2017 6 次提交
-
-
由 Paul Moore 提交于
When we added CALIPSO support in Linux v4.8 we forgot to add it to the list of supported protocols with display at boot. Signed-off-by: NPaul Moore <paul@paul-moore.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guillaume Nault 提交于
Split conditions, so that each test becomes clearer. Also, for l2tp_ip, check if "laddr" is 0. This prevents a socket from binding to the unspecified address when other sockets are already bound using the same device (if any), connection ID and namespace. Same thing for l2tp_ip6: add ipv6_addr_any(laddr) and ipv6_addr_any(raddr) tests to ensure that an IPv6 unspecified address passed as parameter is properly treated a wildcard. Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guillaume Nault 提交于
If "l2tp" was NULL, that'd mean "sk" is NULL too. This can't happen since "sk" is returned by sk_for_each_bound(). Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guillaume Nault 提交于
Add const qualifier wherever possible for __l2tp_ip_bind_lookup() and __l2tp_ip6_bind_lookup(). Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Guillaume Nault 提交于
addr_len's value has already been verified at this point. Signed-off-by: NGuillaume Nault <g.nault@alphalink.fr> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Larger than supported value can lead to array read/write overflow. Reported-by: NColin Ian King <colin.king@canonical.com> Signed-off-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-