- 27 6月, 2017 1 次提交
-
-
由 Matthias Schiffer 提交于
Add support for extended error reporting. Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net> Acked-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 26 6月, 2017 4 次提交
-
-
由 Marcelo Ricardo Leitner 提交于
RFC 4960 Errata 3.27 identifies that ssthresh should be adjusted to cwnd because otherwise it could cause the transport to lock into congestion avoidance phase specially if ssthresh was previously reduced by some packet drop, leading to poor performance. The Errata says to adjust ssthresh to cwnd only once, though the same goal is achieved by updating it every time we update cwnd too. The caveat is that we could take longer to get back up to speed but that should be compensated by the fact that we don't adjust on RTO basis (as RFC says) but based on Heartbeats, which are usually way longer. See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.27Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcelo Ricardo Leitner 提交于
RFC4960 Errata 3.26 identified that at the same time RFC4960 states that cwnd should never grow more than 1*MTU per RTT, Section 7.2.2 was underspecified and as described could allow increasing cwnd more than that. This patch updates it so partial_bytes_acked is maxed to cwnd if flight_size doesn't reach cwnd, protecting it from such case. See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.26Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcelo Ricardo Leitner 提交于
As per RFC4960 Errata 3.22, this condition is not needed anymore as it could cause the partial_bytes_acked to not consider the TSNs acked in the Gap Ack Blocks although they were received by the peer successfully. This patch thus drops the check for new Cumulative TSN Ack Point, leaving just the flight_size < cwnd one. See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.22Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Marcelo Ricardo Leitner 提交于
RFC4960 Errata 3.12 says RFC4960 is unclear about the order of adjustments applied to partial_bytes_acked and cwnd in the congestion avoidance phase, and that the actual order should be: partial_bytes_acked is reset to (partial_bytes_acked - cwnd). Next, cwnd is increased by MTU. We were first increasing cwnd, and then subtracting the new value pba, which leads to a different result as pba is smaller than what it should and could cause cwnd to not grow as much. See-also: https://tools.ietf.org/html/draft-ietf-tsvwg-rfc4960-errata-01#section-3.12Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 25 6月, 2017 3 次提交
-
-
由 Mateusz Jurczyk 提交于
Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in bind() and connect() handlers of the AF_IUCV socket. Since neither syscall enforces a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing .sa_family. Fixes: 52a82e23 ("af_iucv: Validate socket address length in iucv_sock_bind()") Signed-off-by: NMateusz Jurczyk <mjurczyk@google.com> [jwi: removed unneeded null-check for addr] Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Hans Wippel 提交于
Use proper endianness conversion for an skb protocol assignment. Given that IUCV is only available on big endian systems (s390), this simply avoids an endianness warning reported by sparse. Signed-off-by: NHans Wippel <hwippel@linux.vnet.ibm.com> Reviewed-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: NUrsula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: NJulian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Switches and modern SR-IOV enabled NICs may multiplex traffic from Port representators and control messages over single set of hardware queues. Control messages and muxed traffic may need ordered delivery. Those requirements make it hard to comfortably use TC infrastructure today unless we have a way of attaching metadata to skbs at the upper device. Because single set of queues is used for many netdevs stopping TC/sched queues of all of them reliably is impossible and lower device has to retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on the fastpath. This patch attempts to enable port/representative devs to attach metadata to skbs which carry port id. This way representatives can be queueless and all queuing can be performed at the lower netdev in the usual way. Traffic arriving on the port/representative interfaces will be have metadata attached and will subsequently be queued to the lower device for transmission. The lower device should recognize the metadata and translate it to HW specific format which is most likely either a special header inserted before the network headers or descriptor/metadata fields. Metadata is associated with the lower device by storing the netdev pointer along with port id so that if TC decides to redirect or mirror the new netdev will not try to interpret it. This is mostly for SR-IOV devices since switches don't have lower netdevs today. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: NSridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: NSimon Horman <horms@verge.net.au> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 24 6月, 2017 6 次提交
-
-
由 Dan Carpenter 提交于
The copy_to_user() function returns the number of bytes remaining but we want to return -EFAULT here. Fixes: 3c4d7559 ("tls: kernel TLS support") Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com> Acked-by: NDave Watson <davejwatson@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
KASAN reports out-of-bound access in proc_dostring() coming from proc_tcp_available_ulp() because in case TCP ULP list is empty the buffer allocated for the response will not have anything printed into it. Set the first byte to zero to avoid strlen() going out-of-bounds. Fixes: 734942cc ("tcp: ULP infrastructure") Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Yonghong Song 提交于
Commit 31fd8581 ("bpf: permits narrower load from bpf program context fields") permits narrower load for certain ctx fields. The commit however will already generate a masking even if the prog-specific ctx conversion produces the result with narrower size. For example, for __sk_buff->protocol, the ctx conversion loads the data into register with 2-byte load. A narrower 2-byte load should not generate masking. For __sk_buff->vlan_present, the conversion function set the result as either 0 or 1, essentially a byte. The narrower 2-byte or 1-byte load should not generate masking. To avoid unnecessary masking, prog-specific *_is_valid_access now passes converted_op_size back to verifier, which indicates the valid data width after perceived future conversion. Based on this information, verifier is able to avoid unnecessary marking. Since we want more information back from prog-specific *_is_valid_access checking, all of them are packed into one data structure for more clarity. Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NYonghong Song <yhs@fb.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Extend the XDP_ATTACHED_* values to include offloaded mode. Let drivers report whether program is installed in the driver or the HW by changing the prog_attached field from bool to u8 (type of the netlink attribute). Exploit the fact that the value of XDP_ATTACHED_DRV is 1, therefore since all drivers currently assign the mode with double negation: mode = !!xdp_prog; no drivers have to be modified. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Add an installation-time flag for requesting that the program be installed only if it can be offloaded to HW. Internally new command for ndo_xdp is added, this way we avoid putting checks into drivers since they all return -EINVAL on an unknown command. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jakub Kicinski 提交于
Pass XDP flags to the xdp ndo. This will allow drivers to look at the mode flags and make decisions about offload. Signed-off-by: NJakub Kicinski <jakub.kicinski@netronome.com> Acked-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 23 6月, 2017 2 次提交
-
-
由 Paolo Abeni 提交于
Michael reported an UDP breakage caused by the commit b65ac446 ("udp: try to avoid 2 cache miss on dequeue"). The function __first_packet_length() can update the checksum bits of the pending skb, making the scratched area out-of-sync, and setting skb->csum, if the skb was previously in need of checksum validation. On later recvmsg() for such skb, checksum validation will be invoked again - due to the wrong udp_skb_csum_unnecessary() value - and will fail, causing the valid skb to be dropped. This change addresses the issue refreshing the scratch area in __first_packet_length() after the possible checksum update. Fixes: b65ac446 ("udp: try to avoid 2 cache miss on dequeue") Reported-by: NMichael Ellerman <mpe@ellerman.id.au> Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
very similar to commit dd99e425 ("udp: prefetch rmem_alloc in udp_queue_rcv_skb()"), this allows saving a cache miss when the BH is bottle-neck for UDP over ipv6 packet processing, e.g. for small packets when a single RX NIC ingress queue is in use. Performances under flood when multiple NIC RX queues used are unaffected, but when a single NIC rx queue is in use, this gives ~8% performance improvement. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 22 6月, 2017 4 次提交
-
-
由 Sowmini Varadhan 提交于
If we are unloading the rds_tcp module, we can set linger to 1 and drop pending packets to accelerate reconnect. The peer will end up resetting the connection based on new generation numbers of the new incarnation, so hanging on to unsent TCP packets via linger is mostly pointless in this case. Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: NJenny Xu <jenny.x.xu@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sowmini Varadhan 提交于
The RDS handshake ping probe added by commit 5916e2c1 ("RDS: TCP: Enable multipath RDS for TCP") is sent from rds_sendmsg() before the first data packet is sent to a peer. If the conversation is not bidirectional (i.e., one side is always passive and never invokes rds_sendmsg()) and the passive side restarts its rds_tcp module, a new HS ping probe needs to be sent, so that the number of paths can be re-established. This patch achieves that by sending a HS ping probe from rds_tcp_accept_one() when c_npaths is 0 (i.e., we have not done a handshake probe with this peer yet). Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: NJenny Xu <jenny.x.xu@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Chenbo Feng 提交于
Currently in both ipv4 and ipv6 code path, the ack packet received when sk at TCP_NEW_SYN_RECV state is not filtered by socket filter or cgroup filter since it is handled from tcp_child_process and never reaches the tcp_filter inside tcp_v4_rcv or tcp_v6_rcv. Adding a tcp_filter hooks here can make sure all the ingress tcp packet can be correctly filtered. Signed-off-by: NChenbo Feng <fengc@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Daniel Borkmann 提交于
In order to be able to retrieve the attached programs from cls_bpf and act_bpf, we need to expose the prog ids via netlink so that an application can later on get an fd based on the id through the BPF_PROG_GET_FD_BY_ID command, and dump related prog info via BPF_OBJ_GET_INFO_BY_FD command for bpf(2). Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 6月, 2017 18 次提交
-
-
由 David Herrmann 提交于
This adds the new getsockopt(2) option SO_PEERGROUPS on SOL_SOCKET to retrieve the auxiliary groups of the remote peer. It is designed to naturally extend SO_PEERCRED. That is, the underlying data is from the same credentials. Regarding its syntax, it is based on SO_PEERSEC. That is, if the provided buffer is too small, ERANGE is returned and @optlen is updated. Otherwise, the information is copied, @optlen is set to the actual size, and 0 is returned. While SO_PEERCRED (and thus `struct ucred') already returns the primary group, it lacks the auxiliary group vector. However, nearly all access controls (including kernel side VFS and SYSVIPC, but also user-space polkit, DBus, ...) consider the entire set of groups, rather than just the primary group. But this is currently not possible with pure SO_PEERCRED. Instead, user-space has to work around this and query the system database for the auxiliary groups of a UID retrieved via SO_PEERCRED. Unfortunately, there is no race-free way to query the auxiliary groups of the PID/UID retrieved via SO_PEERCRED. Hence, the current user-space solution is to use getgrouplist(3p), which itself falls back to NSS and whatever is configured in nsswitch.conf(3). This effectively checks which groups we *would* assign to the user if it logged in *now*. On normal systems it is as easy as reading /etc/group, but with NSS it can resort to quering network databases (eg., LDAP), using IPC or network communication. Long story short: Whenever we want to use auxiliary groups for access checks on IPC, we need further IPC to talk to the user/group databases, rather than just relying on SO_PEERCRED and the incoming socket. This is unfortunate, and might even result in dead-locks if the database query uses the same IPC as the original request. So far, those recursions / dead-locks have been avoided by using primitive IPC for all crucial NSS modules. However, we want to avoid re-inventing the wheel for each NSS module that might be involved in user/group queries. Hence, we would preferably make DBus (and other IPC that supports access-management based on groups) work without resorting to the user/group database. This new SO_PEERGROUPS ioctl would allow us to make dbus-daemon work without ever calling into NSS. Cc: Michal Sekletar <msekleta@redhat.com> Cc: Simon McVittie <simon.mcvittie@collabora.co.uk> Reviewed-by: NTom Gundersen <teg@jklm.no> Signed-off-by: NDavid Herrmann <dh.herrmann@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Paolo Abeni 提交于
On UDP packets processing, if the BH is the bottle-neck, it always sees a cache miss while updating rmem_alloc; try to avoid it prefetching the value as soon as we have the socket available. Performances under flood with multiple NIC rx queues used are unaffected, but when a single NIC rx queue is in use, this gives ~10% performance improvement. Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julien Gomes 提交于
Add Netlink notifications on cache reports in ip6mr, in addition to the existing mrt6msg sent to mroute6_sk. Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV6_MROUTE_R. MSGTYPE, MIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the same data as their equivalent fields in the mrt6msg header. PKT attribute is the packet sent to mroute6_sk, without the added mrt6msg header. Suggested-by: NRyan Halbrook <halbrook@arista.com> Signed-off-by: NJulien Gomes <julien@arista.com> Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julien Gomes 提交于
Add Netlink notifications on cache reports in ipmr, in addition to the existing igmpmsg sent to mroute_sk. Send RTM_NEWCACHEREPORT notifications to RTNLGRP_IPV4_MROUTE_R. MSGTYPE, VIF_ID, SRC_ADDR and DST_ADDR Netlink attributes contain the same data as their equivalent fields in the igmpmsg header. PKT attribute is the packet sent to mroute_sk, without the added igmpmsg header. Suggested-by: NRyan Halbrook <halbrook@arista.com> Signed-off-by: NJulien Gomes <julien@arista.com> Reviewed-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Julien Gomes 提交于
Add RTNLGRP_{IPV4,IPV6}_MROUTE_R as two new restricted groups for the NETLINK_ROUTE family. Binding to these groups specifically requires CAP_NET_ADMIN to allow multicast of sensitive messages (e.g. mroute cache reports). Suggested-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NJulien Gomes <julien@arista.com> Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arnd Bergmann 提交于
Changing from a memcpy to per-member comparison left the size variable unused: net/ipv4/tcp_ipv4.c: In function 'tcp_md5_do_lookup': net/ipv4/tcp_ipv4.c:910:15: error: unused variable 'size' [-Werror=unused-variable] This does not show up when CONFIG_IPV6 is enabled, but the variable can be removed either way, along with the now unused assignment. Fixes: 6797318e ("tcp: md5: add an address prefix for key lookup") Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 WANG Cong 提交于
Andrey reported a lockdep warning on non-initialized spinlock: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x292/0x395 lib/dump_stack.c:52 register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755 ? 0xffffffffa0000000 __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255 lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135 _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175 spin_lock_bh ./include/linux/spinlock.h:304 ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076 igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194 ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736 We miss a spin_lock_init() in igmpv3_add_delrec(), probably because previously we never use it on this code path. Since we already unlink it from the global mc_tomb list, it is probably safe not to acquire this spinlock here. It does not harm to have it although, to avoid conditional locking. Fixes: c38b7d32 ("igmp: acquire pmc lock for ip_mc_clear_src()") Reported-by: NAndrey Konovalov <andreyknvl@google.com> Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Serhey Popovych 提交于
Network interface groups support added while ago, however there is no IFLA_GROUP attribute description in policy and netlink message size calculations until now. Add IFLA_GROUP attribute to the policy. Fixes: cbda10fa ("net_device: add support for network device groups") Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Serhey Popovych 提交于
While commit 73ba57bf ("ipv6: fix backtracking for throw routes") does good job on error propagation to the fib_rules_lookup() in fib rules core framework that also corrects throw routes handling, it does not solve route reference leakage problem happened when we return -EAGAIN to the fib_rules_lookup() and leave routing table entry referenced in arg->result. If rule with matched throw route isn't last matched in the list we overwrite arg->result losing reference on throw route stored previously forever. We also partially revert commit ab997ad4 ("ipv6: fix the incorrect return value of throw route") since we never return routing table entry with dst.error == -EAGAIN when CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point to check for RTF_REJECT flag since it is always set throw route. Fixes: 73ba57bf ("ipv6: fix backtracking for throw routes") Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
It's a bad thing not to handle errors when updating asoc. The memory allocation failure in any of the functions called in sctp_assoc_update() would cause sctp to work unexpectedly. This patch is to fix it by aborting the asoc and reporting the error when any of these functions fails. Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
local_cork is used to decide if it should uncork asoc outq after processing some cmds, and it is set when replying or sending msgs. local_cork should always have the same value with current asoc q->cork in some way. The thing is when changing to a new asoc by cmd SET_ASOC, local_cork may not be consistent with the current asoc any more. The cmd seqs can be: SCTP_CMD_UPDATE_ASSOC (asoc) SCTP_CMD_REPLY (asoc) SCTP_CMD_SET_ASOC (new_asoc) SCTP_CMD_DELETE_TCB (new_asoc) SCTP_CMD_SET_ASOC (asoc) SCTP_CMD_REPLY (asoc) The 1st REPLY makes OLD asoc q->cork and local_cork both are 1, and the cmd DELETE_TCB clears NEW asoc q->cork and local_cork. After asoc goes back to OLD asoc, q->cork is still 1 while local_cork is 0. The 2nd REPLY will not set local_cork because q->cork is already set and it can't be uncorked and sent out because of this. To keep local_cork consistent with the current asoc q->cork, this patch is to uncork the old asoc if local_cork is set before changing to the new one. Note that the above cmd seqs will be used in the next patch when updating asoc and handling errors in it. Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Patch "call inet_add_protocol after register_pernet_subsys in dccp_v4_init" fixed a null pointer dereference issue for dccp_ipv4 module. The same fix is needed for dccp_ipv6 module. Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Now dccp_ipv4 works as a kernel module. During loading this module, if one dccp packet is being recieved after inet_add_protocol but before register_pernet_subsys in which v4_ctl_sk is initialized, a null pointer dereference may be triggered because of init_net.dccp.v4_ctl_sk is 0x0. Jianlin found this issue when the following call trace occurred: [ 171.950177] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110 [ 171.951007] IP: [<ffffffffc0558364>] dccp_v4_ctl_send_reset+0xc4/0x220 [dccp_ipv4] [...] [ 171.984629] Call Trace: [ 171.984859] <IRQ> [ 171.985061] [ 171.985213] [<ffffffffc0559a53>] dccp_v4_rcv+0x383/0x3f9 [dccp_ipv4] [ 171.985711] [<ffffffff815ca054>] ip_local_deliver_finish+0xb4/0x1f0 [ 171.986309] [<ffffffff815ca339>] ip_local_deliver+0x59/0xd0 [ 171.986852] [<ffffffff810cd7a4>] ? update_curr+0x104/0x190 [ 171.986956] [<ffffffff815c9cda>] ip_rcv_finish+0x8a/0x350 [ 171.986956] [<ffffffff815ca666>] ip_rcv+0x2b6/0x410 [ 171.986956] [<ffffffff810c83b4>] ? task_cputime+0x44/0x80 [ 171.986956] [<ffffffff81586f22>] __netif_receive_skb_core+0x572/0x7c0 [ 171.986956] [<ffffffff810d2c51>] ? trigger_load_balance+0x61/0x1e0 [ 171.986956] [<ffffffff81587188>] __netif_receive_skb+0x18/0x60 [ 171.986956] [<ffffffff8158841e>] process_backlog+0xae/0x180 [ 171.986956] [<ffffffff8158799d>] net_rx_action+0x16d/0x380 [ 171.986956] [<ffffffff81090b7f>] __do_softirq+0xef/0x280 [ 171.986956] [<ffffffff816b6a1c>] call_softirq+0x1c/0x30 This patch is to move inet_add_protocol after register_pernet_subsys in dccp_v4_init, so that v4_ctl_sk is initialized before any incoming dccp packets are processed. Reported-by: NJianlin Shi <jishi@redhat.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Matthias Schiffer 提交于
There is no good reason to keep the flags twice in vxlan_dev and vxlan_config. Signed-off-by: NMatthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 yuan linyu 提交于
Signed-off-by: Nyuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 yuan linyu 提交于
follow Johannes Berg, semantic patch file as below, @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = __skb_put(skb, len); +p = __skb_put_zero(skb, len); | -p = (t)__skb_put(skb, len); +p = __skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); | -memset(p, 0, len); ) @@ identifier p; expression len; expression skb; type t; @@ ( -t p = __skb_put(skb, len); +t p = __skb_put_zero(skb, len); ) ... when != p ( -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t *p; ... ( -p = __skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); | -p = (t *)__skb_put(skb, sizeof(t)); +p = __skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(*p)); | -memset(p, 0, sizeof(*p)); ) @@ expression skb, len; @@ -memset(__skb_put(skb, len), 0, len); +__skb_put_zero(skb, len); @@ expression skb, len, data; @@ -memcpy(__skb_put(skb, len), data, len); +__skb_put_data(skb, data, len); @@ expression SKB, C, S; typedef u8; identifier fn = {__skb_put}; fresh identifier fn2 = fn ## "_u8"; @@ - *(u8 *)fn(SKB, S) = C; + fn2(SKB, C); Signed-off-by: Nyuan linyu <Linyu.Yuan@alcatel-sbell.com.cn> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sebastian Siewior 提交于
Since commit 217f6974 ("net: busy-poll: allow preemption in sk_busy_loop()") there is an explicit do_softirq() invocation after local_bh_enable() has been invoked. I don't understand why we need this because local_bh_enable() will invoke do_softirq() once the softirq counter reached zero and we have softirq-related work pending. Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Serhey Popovych 提交于
We should avoid marking goto rules unresolved when their target is actually reachable after rule deletion. Consolder following sample scenario: # ip -4 ru sh 0: from all lookup local 32000: from all goto 32100 32100: from all lookup main 32100: from all lookup default 32766: from all lookup main 32767: from all lookup default # ip -4 ru del pref 32100 table main # ip -4 ru sh 0: from all lookup local 32000: from all goto 32100 [unresolved] 32100: from all lookup default 32766: from all lookup main 32767: from all lookup default After removal of first rule with preference 32100 we mark all goto rules as unreachable, even when rule with same preference as removed one still present. Check if next rule with same preference is available and make all rules with goto action pointing to it. Signed-off-by: NSerhey Popovych <serhe.popovych@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 6月, 2017 2 次提交
-
-
由 Xin Long 提交于
Now before dumping a sock in sctp_diag, it only holds the sock while the ep may be already destroyed. It can cause a use-after-free panic when accessing ep->asocs. This patch is to set sctp_sk(sk)->ep NULL in sctp_endpoint_destroy, and check if this ep is already destroyed before dumping this ep. Suggested-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdrver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gao Feng 提交于
The register_vlan_device would invoke free_netdev directly, when register_vlan_dev failed. It would trigger the BUG_ON in free_netdev if the dev was already registered. In this case, the netdev would be freed in netdev_run_todo later. So add one condition check now. Only when dev is not registered, then free it directly. The following is the part coredump when netdev_upper_dev_link failed in register_vlan_dev. I removed the lines which are too long. [ 411.237457] ------------[ cut here ]------------ [ 411.237458] kernel BUG at net/core/dev.c:7998! [ 411.237484] invalid opcode: 0000 [#1] SMP [ 411.237705] [last unloaded: 8021q] [ 411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G E 4.12.0-rc5+ #6 [ 411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000 [ 411.237782] RIP: 0010:free_netdev+0x116/0x120 [ 411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297 [ 411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878 [ 411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000 [ 411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801 [ 411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000 [ 411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000 [ 411.239518] FS: 00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000 [ 411.239949] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0 [ 411.240936] Call Trace: [ 411.241462] vlan_ioctl_handler+0x3f1/0x400 [8021q] [ 411.241910] sock_ioctl+0x18b/0x2c0 [ 411.242394] do_vfs_ioctl+0xa1/0x5d0 [ 411.242853] ? sock_alloc_file+0xa6/0x130 [ 411.243465] SyS_ioctl+0x79/0x90 [ 411.243900] entry_SYSCALL_64_fastpath+0x1e/0xa9 [ 411.244425] RIP: 0033:0x7fb69089a357 [ 411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357 [ 411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003 [ 411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999 [ 411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004 [ 411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001 [ 411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0 Signed-off-by: NGao Feng <gfree.wind@vip.163.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-