- 16 10月, 2018 6 次提交
-
-
由 David Howells 提交于
Fix an uninitialised variable introduced by the last patch. This can cause a crash when a new call comes in to a local service, such as when an AFS fileserver calls back to the local cache manager. Fixes: c1e15b49 ("rxrpc: Fix the packet reception routine") Signed-off-by: NDavid Howells <dhowells@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Maloy 提交于
In the commit referred to below we added link tolerance as an additional criteria for declaring broadcast transmission "stale" and resetting the unicast links to the affected node. Unfortunately, this 'improvement' introduced two bugs, which each and one alone cause only limited problems, but combined lead to seemingly stochastic unicast link resets, depending on the amount of broadcast traffic transmitted. The first issue, a missing initialization of the 'tolerance' field of the receiver broadcast link, was recently fixed by commit 047491ea ("tipc: set link tolerance correctly in broadcast link"). Ths second issue, where we omit to reset the 'stale_cnt' field of the same link after a 'stale' period is over, leads to this counter accumulating over time, and in the absence of the 'tolerance' criteria leads to the above described symptoms. This commit adds the missing initialization. Fixes: a4dc70d4 ("tipc: extend link reset criteria for stale packet retransmission") Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Cong Wang 提交于
WHen an llc sock is added into the sk_laddr_hash of an llc_sap, it is not marked with SOCK_RCU_FREE. This causes that the sock could be freed while it is still being read by __llc_lookup_established() with RCU read lock. sock is refcounted, but with RCU read lock, nothing prevents the readers getting a zero refcnt. Fix it by setting SOCK_RCU_FREE in llc_sap_add_socket(). Reported-by: syzbot+11e05f04c15e03be5254@syzkaller.appspotmail.com Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Davide Caratti 提交于
Similarly to what has been done in 8b4c3cdd ("net: sched: Add policy validation for tc attributes"), fix classifier code to add validation of TCA_CHAIN and TCA_KIND netlink attributes. tested with: # ./tdc.py -c filter v2: Let sch_api and cls_api share nla_policy they have in common, thanks to David Ahern. v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done by TC modules, thanks to Cong Wang. While at it, restore the 'Delete / get qdisc' comment to its orginal position, just above tc_get_qdisc() function prototype. Fixes: 5bc17018 ("net: sched: introduce multichain support for filters") Signed-off-by: NDavide Caratti <dcaratti@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wenwen Wang 提交于
In dev_ethtool(), the eth command 'ethcmd' is firstly copied from the use-space buffer 'useraddr' and checked to see whether it is ETHTOOL_PERQUEUE. If yes, the sub-command 'sub_cmd' is further copied from the user space. Otherwise, 'sub_cmd' is the same as 'ethcmd'. Next, according to 'sub_cmd', a permission check is enforced through the function ns_capable(). For example, the permission check is required if 'sub_cmd' is ETHTOOL_SCOALESCE, but it is not necessary if 'sub_cmd' is ETHTOOL_GCOALESCE, as suggested in the comment "Allow some commands to be done by anyone". The following execution invokes different handlers according to 'ethcmd'. Specifically, if 'ethcmd' is ETHTOOL_PERQUEUE, ethtool_set_per_queue() is called. In ethtool_set_per_queue(), the kernel object 'per_queue_opt' is copied again from the user-space buffer 'useraddr' and 'per_queue_opt.sub_command' is used to determine which operation should be performed. Given that the buffer 'useraddr' is in the user space, a malicious user can race to change the sub-command between the two copies. In particular, the attacker can supply ETHTOOL_PERQUEUE and ETHTOOL_GCOALESCE to bypass the permission check in dev_ethtool(). Then before ethtool_set_per_queue() is called, the attacker changes ETHTOOL_GCOALESCE to ETHTOOL_SCOALESCE. In this way, the attacker can bypass the permission check and execute ETHTOOL_SCOALESCE. This patch enforces a check in ethtool_set_per_queue() after the second copy from 'useraddr'. If the sub-command is different from the one obtained in the first copy in dev_ethtool(), an error code EINVAL will be returned. Fixes: f38d138a ("net/ethtool: support set coalesce per queue") Signed-off-by: NWenwen Wang <wang6495@umn.edu> Reviewed-by: NMichal Kubecek <mkubecek@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Wenwen Wang 提交于
In ethtool_get_rxnfc(), the eth command 'cmd' is compared against 'ETHTOOL_GRXFH' to see whether it is necessary to adjust the variable 'info_size'. Then the whole structure of 'info' is copied from the user-space buffer 'useraddr' with 'info_size' bytes. In the following execution, 'info' may be copied again from the buffer 'useraddr' depending on the 'cmd' and the 'info.flow_type'. However, after these two copies, there is no check between 'cmd' and 'info.cmd'. In fact, 'cmd' is also copied from the buffer 'useraddr' in dev_ethtool(), which is the caller function of ethtool_get_rxnfc(). Given that 'useraddr' is in the user space, a malicious user can race to change the eth command in the buffer between these copies. By doing so, the attacker can supply inconsistent data and cause undefined behavior because in the following execution 'info' will be passed to ops->get_rxnfc(). This patch adds a necessary check on 'info.cmd' and 'cmd' to confirm that they are still same after the two copies in ethtool_get_rxnfc(). Otherwise, an error code EINVAL will be returned. Signed-off-by: NWenwen Wang <wang6495@umn.edu> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 12 10月, 2018 1 次提交
-
-
由 Ying Xue 提交于
When booting kernel with LOCKDEP option, below warning info was found: WARNING: possible recursive locking detected 4.19.0-rc7+ #14 Not tainted -------------------------------------------- swapper/0/1 is trying to acquire lock: 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline] 00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850 but task is already holding lock: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline] 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&list->lock)->rlock#4); lock(&(&list->lock)->rlock#4); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by swapper/0/1: #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at: register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051 #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline] #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849 stack backtrace: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1af/0x295 lib/dump_stack.c:113 print_deadlock_bug kernel/locking/lockdep.c:1759 [inline] check_deadlock kernel/locking/lockdep.c:1803 [inline] validate_chain kernel/locking/lockdep.c:2399 [inline] __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411 lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline] _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168 spin_lock_bh include/linux/spinlock.h:334 [inline] tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850 tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526 tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521 tipc_init_net+0x472/0x610 net/tipc/core.c:82 ops_init+0xf7/0x520 net/core/net_namespace.c:129 __register_pernet_operations net/core/net_namespace.c:940 [inline] register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011 register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052 tipc_init+0x83/0x104 net/tipc/core.c:140 do_one_initcall+0x109/0x70a init/main.c:885 do_initcall_level init/main.c:953 [inline] do_initcalls init/main.c:961 [inline] do_basic_setup init/main.c:979 [inline] kernel_init_freeable+0x4bd/0x57f init/main.c:1144 kernel_init+0x13/0x180 init/main.c:1063 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413 The reason why the noise above was complained by LOCKDEP is because we nested to hold l->wakeupq.lock and l->inputq->lock in tipc_link_reset function. In fact it's unnecessary to move skb buffer from l->wakeupq queue to l->inputq queue while holding the two locks at the same time. Instead, we can move skb buffers in l->wakeupq queue to a temporary list first and then move the buffers of the temporary list to l->inputq queue, which is also safe for us. Fixes: 3f32d0be ("tipc: lock wakeup & inputq at tipc_link_reset()") Reported-by: NDmitry Vyukov <dvyukov@google.com> Signed-off-by: NYing Xue <ying.xue@windriver.com> Acked-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 11 10月, 2018 11 次提交
-
-
由 Björn Töpel 提交于
The XSKMAP update and delete functions called synchronize_net(), which can sleep. It is not allowed to sleep during an RCU read section. Instead we need to make sure that the sock sk_destruct (xsk_destruct) function is asynchronously called after an RCU grace period. Setting the SOCK_RCU_FREE flag for XDP sockets takes care of this. Fixes: fbfc504a ("bpf: introduce new bpf AF_XDP map type BPF_MAP_TYPE_XSKMAP") Reported-by: NEric Dumazet <eric.dumazet@gmail.com> Signed-off-by: NBjörn Töpel <bjorn.topel@intel.com> Acked-by: NSong Liu <songliubraving@fb.com> Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
-
由 Parthasarathy Bhuvaragan 提交于
In tipc_sk_filter_rcv(), when we detect protocol messages with error we call tipc_sk_conn_proto_rcv() and let it reset the connection and notify the socket by calling sk->sk_state_change(). However, tipc_sk_filter_rcv() may have been called from the function tipc_backlog_rcv(), in which case the socket lock is held and the socket already awake. This means that the sk_state_change() call is ignored and the error notification lost. Now the receive queue will remain empty and the socket sleeps forever. In this commit, we convert the protocol message into a connection abort message and enqueue it into the socket's receive queue. By this addition to the above state change we cover all conditions. Acked-by: NYing Xue <ying.xue@windriver.com> Signed-off-by: NParthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Maloy 提交于
In the patch referred to below we added link tolerance as an additional criteria for declaring broadcast transmission "stale" and resetting the affected links. However, the 'tolerance' field of the broadcast link is never set, and remains at zero. This renders the whole commit without the intended improving effect, but luckily also with no negative effect. In this commit we add the missing initialization. Fixes: a4dc70d4 ("tipc: extend link reset criteria for stale packet retransmission") Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
When an MTU update with PMTU smaller than net.ipv4.route.min_pmtu is received, we must clamp its value. However, we can receive a PMTU exception with PMTU < old_mtu < ip_rt_min_pmtu, which would lead to an increase in PMTU. To fix this, take the smallest of the old MTU and ip_rt_min_pmtu. Before this patch, in case of an update, the exception's MTU would always change. Now, an exception can have only its lock flag updated, but not the MTU, so we need to add a check on locking to the following "is this exception getting updated, or close to expiring?" test. Fixes: d52e5a7e ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Reviewed-by: NStefano Brivio <sbrivio@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Sabrina Dubroca 提交于
Since commit 5aad1de5 ("ipv4: use separate genid for next hop exceptions"), exceptions get deprecated separately from cached routes. In particular, administrative changes don't clear PMTU anymore. As Stefano described in commit e9fa1495 ("ipv6: Reflect MTU changes on PMTU of exceptions for MTU-less routes"), the PMTU discovered before the local MTU change can become stale: - if the local MTU is now lower than the PMTU, that PMTU is now incorrect - if the local MTU was the lowest value in the path, and is increased, we might discover a higher PMTU Similarly to what commit e9fa1495 did for IPv6, update PMTU in those cases. If the exception was locked, the discovered PMTU was smaller than the minimal accepted PMTU. In that case, if the new local MTU is smaller than the current PMTU, let PMTU discovery figure out if locking of the exception is still needed. To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU notifier. By the time the notifier is called, dev->mtu has been changed. This patch adds the old MTU as additional information in the notifier structure, and a new call_netdevice_notifiers_u32() function. Fixes: 5aad1de5 ("ipv4: use separate genid for next hop exceptions") Signed-off-by: NSabrina Dubroca <sd@queasysnail.net> Reviewed-by: NStefano Brivio <sbrivio@redhat.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Mike Rapoport 提交于
The fib6_info_alloc() function allocates percpu memory to hold per CPU pointers to rt6_info, but this memory is never freed. Fix it. Fixes: a64efe14 ("net/ipv6: introduce fib6_info struct and helpers") Signed-off-by: NMike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Ka-Cheong Poon 提交于
In rds_send_mprds_hash(), if the calculated hash value is non-zero and the MPRDS connections are not yet up, it will wait. But it should not wait if the send is non-blocking. In this case, it should just use the base c_path for sending the message. Signed-off-by: NKa-Cheong Poon <ka-cheong.poon@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
syzbot managed to crash in skb_checksum_help() [1] : BUG_ON(offset + sizeof(__sum16) > skb_headlen(skb)); Root cause is the following check in skb_partial_csum_set() if (unlikely(start > skb_headlen(skb)) || unlikely((int)start + off > skb_headlen(skb) - 2)) return false; If skb_headlen(skb) is 1, then (skb_headlen(skb) - 2) becomes 0xffffffff and the check fails to detect that ((int)start + off) is off the limit, since the compare is unsigned. When we fix that, then the first condition (start > skb_headlen(skb)) becomes obsolete. Then we should also check that (skb_headroom(skb) + start) wont overflow 16bit field. [1] kernel BUG at net/core/dev.c:2880! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 7330 Comm: syz-executor4 Not tainted 4.19.0-rc6+ #253 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_checksum_help+0x9e3/0xbb0 net/core/dev.c:2880 Code: 85 00 ff ff ff 48 c1 e8 03 42 80 3c 28 00 0f 84 09 fb ff ff 48 8b bd 00 ff ff ff e8 97 a8 b9 fb e9 f8 fa ff ff e8 2d 09 76 fb <0f> 0b 48 8b bd 28 ff ff ff e8 1f a8 b9 fb e9 b1 f6 ff ff 48 89 cf RSP: 0018:ffff8801d83a6f60 EFLAGS: 00010293 RAX: ffff8801b9834380 RBX: ffff8801b9f8d8c0 RCX: ffffffff8608c6d7 RDX: 0000000000000000 RSI: ffffffff8608cc63 RDI: 0000000000000006 RBP: ffff8801d83a7068 R08: ffff8801b9834380 R09: 0000000000000000 R10: ffff8801d83a76d8 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000010001 R14: 000000000000ffff R15: 00000000000000a8 FS: 00007f1a66db5700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f7d77f091b0 CR3: 00000001ba252000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: skb_csum_hwoffload_help+0x8f/0xe0 net/core/dev.c:3269 validate_xmit_skb+0xa2a/0xf30 net/core/dev.c:3312 __dev_queue_xmit+0xc2f/0x3950 net/core/dev.c:3797 dev_queue_xmit+0x17/0x20 net/core/dev.c:3838 packet_snd net/packet/af_packet.c:2928 [inline] packet_sendmsg+0x422d/0x64c0 net/packet/af_packet.c:2953 Fixes: 5ff8dda3 ("net: Ensure partial checksum offset is inside the skb head") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Devlink string param buffer is allocated at the size of DEVLINK_PARAM_MAX_STRING_VALUE. Add helper function which makes sure this size is not exceeded. Renamed DEVLINK_PARAM_MAX_STRING_VALUE to __DEVLINK_PARAM_MAX_STRING_VALUE to emphasize that it should be used by devlink only. The driver should use the helper function instead to verify it doesn't exceed the allowed length. Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
Driverinit configuration mode value is held by devlink to enable the driver fetch the value after reload command. In case the param type is string devlink should copy the value from driver string buffer to devlink string buffer on devlink_param_driverinit_value_set() and vice-versa on devlink_param_driverinit_value_get(). Fixes: ec01aeb1 ("devlink: Add support for get/set driverinit value") Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Acked-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Moshe Shemesh 提交于
In case devlink param type is string, it needs to copy the string value it got from the input to devlink_param_value. Fixes: e3b7ca18 ("devlink: Add param set command") Signed-off-by: NMoshe Shemesh <moshe@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 09 10月, 2018 3 次提交
-
-
由 David Howells 提交于
The rxrpc_input_packet() function and its call tree was built around the assumption that data_ready() handler called from UDP to inform a kernel service that there is data to be had was non-reentrant. This means that certain locking could be dispensed with. This, however, turns out not to be the case with a multi-queue network card that can deliver packets to multiple cpus simultaneously. Each of those cpus can be in the rxrpc_input_packet() function at the same time. Fix by adding or changing some structure members: (1) Add peer->rtt_input_lock to serialise access to the RTT buffer. (2) Make conn->service_id into a 32-bit variable so that it can be cmpxchg'd on all arches. (3) Add call->input_lock to serialise access to the Rx/Tx state. Note that although the Rx and Tx states are (almost) entirely separate, there's no point completing the separation and having separate locks since it's a bi-phasal RPC protocol rather than a bi-direction streaming protocol. Data transmission and data reception do not take place simultaneously on any particular call. and making the following functional changes: (1) In rxrpc_input_data(), hold call->input_lock around the core to prevent simultaneous producing of packets into the Rx ring and updating of tracking state for a particular call. (2) In rxrpc_input_ping_response(), only read call->ping_serial once, and check it before checking RXRPC_CALL_PINGING as that's a cheaper test. The bit test and bit clear can then be combined. No further locking is needed here. (3) In rxrpc_input_ack(), take call->input_lock after we've parsed much of the ACK packet. The superseded ACK check is then done both before and after the lock is taken. The handing of ackinfo data is split, parsing before the lock is taken and processing with it held. This is keyed on rxMTU being non-zero. Congestion management is also done within the locked section. (4) In rxrpc_input_ackall(), take call->input_lock around the Tx window rotation. The ACKALL packet carries no information and is only really useful after all packets have been transmitted since it's imprecise. (5) In rxrpc_input_implicit_end_call(), we use rx->incoming_lock to prevent calls being simultaneously implicitly ended on two cpus and also to prevent any races with incoming call setup. (6) In rxrpc_input_packet(), use cmpxchg() to effect the service upgrade on a connection. It is only permitted to happen once for a connection. (7) In rxrpc_new_incoming_call(), we have to recheck the routing inside rx->incoming_lock to see if someone else set up the call, connection or peer whilst we were getting there. We can't trust the values from the earlier routing check unless we pin refs on them - which we want to avoid. Further, we need to allow for an incoming call to have its state changed on another CPU between us making it live and us adjusting it because the conn is now in the RXRPC_CONN_SERVICE state. (8) In rxrpc_peer_add_rtt(), take peer->rtt_input_lock around the access to the RTT buffer. Don't need to lock around setting peer->rtt. For reference, the inventory of state-accessing or state-altering functions used by the packet input procedure is: > rxrpc_input_packet() * PACKET CHECKING * ROUTING > rxrpc_post_packet_to_local() > rxrpc_find_connection_rcu() - uses RCU > rxrpc_lookup_peer_rcu() - uses RCU > rxrpc_find_service_conn_rcu() - uses RCU > idr_find() - uses RCU * CONNECTION-LEVEL PROCESSING - Service upgrade - Can only happen once per conn ! Changed to use cmpxchg > rxrpc_post_packet_to_conn() - Setting conn->hi_serial - Probably safe not using locks - Maybe use cmpxchg * CALL-LEVEL PROCESSING > Old-call checking > rxrpc_input_implicit_end_call() > rxrpc_call_completed() > rxrpc_queue_call() ! Need to take rx->incoming_lock > __rxrpc_disconnect_call() > rxrpc_notify_socket() > rxrpc_new_incoming_call() - Uses rx->incoming_lock for the entire process - Might be able to drop this earlier in favour of the call lock > rxrpc_incoming_call() ! Conflicts with rxrpc_input_implicit_end_call() > rxrpc_send_ping() - Don't need locks to check rtt state > rxrpc_propose_ACK * PACKET DISTRIBUTION > rxrpc_input_call_packet() > rxrpc_input_data() * QUEUE DATA PACKET ON CALL > rxrpc_reduce_call_timer() - Uses timer_reduce() ! Needs call->input_lock() > rxrpc_receiving_reply() ! Needs locking around ack state > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_proto_abort() > rxrpc_input_dup_data() - Fills the Rx buffer - rxrpc_propose_ACK() - rxrpc_notify_socket() > rxrpc_input_ack() * APPLY ACK PACKET TO CALL AND DISCARD PACKET > rxrpc_input_ping_response() - Probably doesn't need any extra locking ! Need READ_ONCE() on call->ping_serial > rxrpc_input_check_for_lost_ack() - Takes call->lock to consult Tx buffer > rxrpc_peer_add_rtt() ! Needs to take a lock (peer->rtt_input_lock) ! Could perhaps manage with cmpxchg() and xadd() instead > rxrpc_input_requested_ack - Consults Tx buffer ! Probably needs a lock > rxrpc_peer_add_rtt() > rxrpc_propose_ack() > rxrpc_input_ackinfo() - Changes call->tx_winsize ! Use cmpxchg to handle change ! Should perhaps track serial number - Uses peer->lock to record MTU specification changes > rxrpc_proto_abort() ! Need to take call->input_lock > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_input_soft_acks() - Consults the Tx buffer > rxrpc_congestion_management() - Modifies the Tx annotations ! Needs call->input_lock() > rxrpc_queue_call() > rxrpc_input_abort() * APPLY ABORT PACKET TO CALL AND DISCARD PACKET > rxrpc_set_call_completion() > rxrpc_notify_socket() > rxrpc_input_ackall() * APPLY ACKALL PACKET TO CALL AND DISCARD PACKET ! Need to take call->input_lock > rxrpc_rotate_tx_window() > rxrpc_end_tx_phase() > rxrpc_reject_packet() There are some functions used by the above that queue the packet, after which the procedure is terminated: - rxrpc_post_packet_to_local() - local->event_queue is an sk_buff_head - local->processor is a work_struct - rxrpc_post_packet_to_conn() - conn->rx_queue is an sk_buff_head - conn->processor is a work_struct - rxrpc_reject_packet() - local->reject_queue is an sk_buff_head - local->processor is a work_struct And some that offload processing to process context: - rxrpc_notify_socket() - Uses RCU lock - Uses call->notify_lock to call call->notify_rx - Uses call->recvmsg_lock to queue recvmsg side - rxrpc_queue_call() - call->processor is a work_struct - rxrpc_propose_ACK() - Uses call->lock to wrap __rxrpc_propose_ACK() And a bunch that complete a call, all of which use call->state_lock to protect the call state: - rxrpc_call_completed() - rxrpc_set_call_completion() - rxrpc_abort_call() - rxrpc_proto_abort() - Also uses rxrpc_queue_call() Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Fix connection-level abort handling to cache the abort and error codes properly so that a new incoming call can be properly aborted if it races with the parent connection being aborted by another CPU. The abort_code and error parameters can then be dropped from rxrpc_abort_calls(). Fixes: f5c17aae ("rxrpc: Calls should only have one terminal state") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Move the out-of-order and duplicate ACK packet check to before the call to rxrpc_input_ackinfo() so that the receive window size and MTU size are only checked in the latest ACK packet and don't regress. Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
- 08 10月, 2018 6 次提交
-
-
由 David Howells 提交于
Carry the call state out of the locked section in rxrpc_rotate_tx_window() rather than sampling it afterwards. This is only used to select tracepoint data, but could have changed by the time we do the tracepoint. Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
We should only call the function to end a call's Tx phase if we rotated the marked-last packet out of the transmission buffer. Make rxrpc_rotate_tx_window() return an indication of whether it just rotated the packet marked as the last out of the transmit buffer, carrying the information out of the locked section in that function. We can then check the return value instead of examining RXRPC_CALL_TX_LAST. Fixes: 70790dbe ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
We don't need to take the RCU read lock in the rxrpc packet receive function because it's held further up the stack in the IP input routine around the UDP receive routines. Fix this by dropping the RCU read lock calls from rxrpc_input_packet(). This simplifies the code. Fixes: 70790dbe ("rxrpc: Pass the last Tx packet marker in the annotation buffer") Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 David Howells 提交于
Use the UDP encap_rcv hook to cut the bit out of the rxrpc packet reception in which a packet is placed onto the UDP receive queue and then immediately removed again by rxrpc. Going via the queue in this manner seems like it should be unnecessary. This does, however, require the invention of a value to place in encap_type as that's one of the conditions to switch packets out to the encap_rcv hook. Possibly the value doesn't actually matter for anything other than sockopts on the UDP socket, which aren't accessible outside of rxrpc anyway. This seems to cut a bit of time out of the time elapsed between each sk_buff being timestamped and turning up in rxrpc (the final number in the following trace excerpts). I measured this by making the rxrpc_rx_packet trace point print the time elapsed between the skb being timestamped and the current time (in ns), e.g.: ... 424.278721: rxrpc_rx_packet: ... ACK 25026 So doing a 512MiB DIO read from my test server, with an unmodified kernel: N min max sum mean stddev 27605 2626 7581 7.83992e+07 2840.04 181.029 and with the patch applied: N min max sum mean stddev 27547 1895 12165 6.77461e+07 2459.29 255.02 Signed-off-by: NDavid Howells <dhowells@redhat.com>
-
由 Al Viro 提交于
cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references via ->hlist and via ->tp_root together. u32_destroy() drops the former and, in case when there had been links, leaves the sucker on the list. As the result, there's nothing to protect it from getting freed once links are dropped. That also makes the "is it busy" check incapable of catching the root hnode - it *is* busy (there's a reference from tp), but we don't see it as something separate. "Is it our root?" check partially covers that, but the problem exists for others' roots as well. AFAICS, the minimal fix preserving the existing behaviour (where it doesn't include oopsen, that is) would be this: * count tp->root and tp_c->hlist as separate references. I.e. have u32_init() set refcount to 2, not 1. * in u32_destroy() we always drop the former; in u32_destroy_hnode() - the latter. That way we have *all* references contributing to refcount. List removal happens in u32_destroy_hnode() (called only when ->refcnt is 1) an in u32_destroy() in case of tc_u_common going away, along with everything reachable from it. IOW, that way we know that u32_destroy_key() won't free something still on the list (or pointed to by someone's ->root). Reproducer: tc qdisc add dev eth0 ingress tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: \ u32 divisor 1 tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: \ u32 divisor 1 tc filter add dev eth0 parent ffff: protocol ip prio 100 \ handle 1:0:11 u32 ht 1: link 801: offset at 0 mask 0f00 shift 6 \ plus 0 eat match ip protocol 6 ff tc filter delete dev eth0 parent ffff: protocol ip prio 200 tc filter change dev eth0 parent ffff: protocol ip prio 100 \ handle 1:0:11 u32 ht 1: link 0: offset at 0 mask 0f00 shift 6 plus 0 \ eat match ip protocol 6 ff tc filter delete dev eth0 parent ffff: protocol ip prio 100 Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk> Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jiri Kosina 提交于
Commit 2276f58a ("udp: use a separate rx queue for packet reception") turned static inline __skb_recv_udp() from being a trivial helper around __skb_recv_datagram() into a UDP specific implementaion, making it EXPORT_SYMBOL_GPL() at the same time. There are external modules that got broken by __skb_recv_udp() not being visible to them. Let's unbreak them by making __skb_recv_udp EXPORT_SYMBOL(). Rationale (one of those) why this is actually "technically correct" thing to do: __skb_recv_udp() used to be an inline wrapper around __skb_recv_datagram(), which itself (still, and correctly so, I believe) is EXPORT_SYMBOL(). Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Fixes: 2276f58a ("udp: use a separate rx queue for packet reception") Signed-off-by: NJiri Kosina <jkosina@suse.cz> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 06 10月, 2018 5 次提交
-
-
由 Kees Cook 提交于
As done treewide earlier, this catches several more open-coded allocation size calculations that were added to the kernel during the merge window. This performs the following mechanical transformations using Coccinelle: kvmalloc(a * b, ...) -> kvmalloc_array(a, b, ...) kvzalloc(a * b, ...) -> kvcalloc(a, b, ...) devm_kzalloc(..., a * b, ...) -> devm_kcalloc(..., a, b, ...) Signed-off-by: NKees Cook <keescook@chromium.org>
-
由 Wei Wang 提交于
In rawv6_send_hdrinc(), in order to avoid an extra dst_hold(), we directly assign the dst to skb and set passed in dst to NULL to avoid double free. However, in error case, we free skb and then do stats update with the dst pointer passed in. This causes use-after-free on the dst. Fix it by taking rcu read lock right before dst could get released to make sure dst does not get freed until the stats update is done. Note: we don't have this issue in ipv4 cause dst is not used for stats update in v4. Syzkaller reported following crash: BUG: KASAN: use-after-free in rawv6_send_hdrinc net/ipv6/raw.c:692 [inline] BUG: KASAN: use-after-free in rawv6_sendmsg+0x4421/0x4630 net/ipv6/raw.c:921 Read of size 8 at addr ffff8801d95ba730 by task syz-executor0/32088 CPU: 1 PID: 32088 Comm: syz-executor0 Not tainted 4.19.0-rc2+ #93 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 rawv6_send_hdrinc net/ipv6/raw.c:692 [inline] rawv6_sendmsg+0x4421/0x4630 net/ipv6/raw.c:921 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114 __sys_sendmsg+0x11d/0x280 net/socket.c:2152 __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg net/socket.c:2159 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2159 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457099 Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f83756edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f83756ee6d4 RCX: 0000000000457099 RDX: 0000000000000000 RSI: 0000000020003840 RDI: 0000000000000004 RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 00000000004d4b30 R14: 00000000004c90b1 R15: 0000000000000000 Allocated by task 32088: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x730 mm/slab.c:3554 dst_alloc+0xbb/0x1d0 net/core/dst.c:105 ip6_dst_alloc+0x35/0xa0 net/ipv6/route.c:353 ip6_rt_cache_alloc+0x247/0x7b0 net/ipv6/route.c:1186 ip6_pol_route+0x8f8/0xd90 net/ipv6/route.c:1895 ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2093 fib6_rule_lookup+0x277/0x860 net/ipv6/fib6_rules.c:122 ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2121 ip6_route_output include/net/ip6_route.h:88 [inline] ip6_dst_lookup_tail+0xe27/0x1d60 net/ipv6/ip6_output.c:951 ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079 rawv6_sendmsg+0x12d9/0x4630 net/ipv6/raw.c:905 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2114 __sys_sendmsg+0x11d/0x280 net/socket.c:2152 __do_sys_sendmsg net/socket.c:2161 [inline] __se_sys_sendmsg net/socket.c:2159 [inline] __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2159 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5356: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x83/0x290 mm/slab.c:3756 dst_destroy+0x267/0x3c0 net/core/dst.c:141 dst_destroy_rcu+0x16/0x19 net/core/dst.c:154 __rcu_reclaim kernel/rcu/rcu.h:236 [inline] rcu_do_batch kernel/rcu/tree.c:2576 [inline] invoke_rcu_callbacks kernel/rcu/tree.c:2880 [inline] __rcu_process_callbacks kernel/rcu/tree.c:2847 [inline] rcu_process_callbacks+0xf23/0x2670 kernel/rcu/tree.c:2864 __do_softirq+0x30b/0xad8 kernel/softirq.c:292 Fixes: 1789a640 ("raw: avoid two atomics in xmit") Signed-off-by: NWei Wang <weiwan@google.com> Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Ahern 提交于
A number of TC attributes are processed without proper validation (e.g., length checks). Add a tca policy for all input attributes and use when invoking nlmsg_parse. The 2 Fixes tags below cover the latest additions. The other attributes are a string (KIND), nested attribute (OPTIONS which does seem to have validation in most cases), for dumps only or a flag. Fixes: 5bc17018 ("net: sched: introduce multichain support for filters") Fixes: d47a6b0e ("net: sched: introduce ingress/egress block index attributes for qdisc") Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
Currently, rtnl_fdb_dump() assumes the family header is 'struct ifinfomsg', which is not always true -- 'struct ndmsg' is used by iproute2 ('ip neigh'). The problem is, the function bails out early if nlmsg_parse() fails, which does occur for iproute2 usage of 'struct ndmsg' because the payload length is shorter than the family header alone (as 'struct ifinfomsg' is assumed). This breaks backward compatibility with userspace -- nothing is sent back. Some examples with iproute2 and netlink library for go [1]: 1) $ bridge fdb show 33:33:00:00:00:01 dev ens3 self permanent 01:00:5e:00:00:01 dev ens3 self permanent 33:33:ff:15:98:30 dev ens3 self permanent This one works, as it uses 'struct ifinfomsg'. fdb_show() @ iproute2/bridge/fdb.c """ .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg)), ... if (rtnl_dump_request(&rth, RTM_GETNEIGH, [...] """ 2) $ ip --family bridge neigh RTNETLINK answers: Invalid argument Dump terminated This one fails, as it uses 'struct ndmsg'. do_show_or_flush() @ iproute2/ip/ipneigh.c """ .n.nlmsg_type = RTM_GETNEIGH, .n.nlmsg_len = NLMSG_LENGTH(sizeof(struct ndmsg)), """ 3) $ ./neighlist < no output > This one fails, as it uses 'struct ndmsg'-based. neighList() @ netlink/neigh_linux.go """ req := h.newNetlinkRequest(unix.RTM_GETNEIGH, [...] msg := Ndmsg{ """ The actual breakage was introduced by commit 0ff50e83 ("net: rtnetlink: bail out from rtnl_fdb_dump() on parse error"), because nlmsg_parse() fails if the payload length (with the _actual_ family header) is less than the family header length alone (which is assumed, in parameter 'hdrlen'). This is true in the examples above with struct ndmsg, with size and payload length shorter than struct ifinfomsg. However, that commit just intends to fix something under the assumption the family header is indeed an 'struct ifinfomsg' - by preventing access to the payload as such (via 'ifm' pointer) if the payload length is not sufficient to actually contain it. The assumption was introduced by commit 5e6d2435 ("bridge: netlink dump interface at par with brctl"), to support iproute2's 'bridge fdb' command (not 'ip neigh') which indeed uses 'struct ifinfomsg', thus is not broken. So, in order to unbreak the 'struct ndmsg' family headers and still allow 'struct ifinfomsg' to continue to work, check for the known message sizes used with 'struct ndmsg' in iproute2 (with zero or one attribute which is not used in this function anyway) then do not parse the data as ifinfomsg. Same examples with this patch applied (or revert/before the original fix): $ bridge fdb show 33:33:00:00:00:01 dev ens3 self permanent 01:00:5e:00:00:01 dev ens3 self permanent 33:33:ff:15:98:30 dev ens3 self permanent $ ip --family bridge neigh dev ens3 lladdr 33:33:00:00:00:01 PERMANENT dev ens3 lladdr 01:00:5e:00:00:01 PERMANENT dev ens3 lladdr 33:33:ff:15:98:30 PERMANENT $ ./neighlist netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0x0, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0} netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x1, 0x0, 0x5e, 0x0, 0x0, 0x1}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0} netlink.Neigh{LinkIndex:2, Family:7, State:128, Type:0, Flags:2, IP:net.IP(nil), HardwareAddr:net.HardwareAddr{0x33, 0x33, 0xff, 0x15, 0x98, 0x30}, LLIPAddr:net.IP(nil), Vlan:0, VNI:0} Tested on mainline (v4.19-rc6) and net-next (3bd09b05). References: [1] netlink library for go (test-case) https://github.com/vishvananda/netlink $ cat ~/go/src/neighlist/main.go package main import ("fmt"; "syscall"; "github.com/vishvananda/netlink") func main() { neighs, _ := netlink.NeighList(0, syscall.AF_BRIDGE) for _, neigh := range neighs { fmt.Printf("%#v\n", neigh) } } $ export GOPATH=~/go $ go get github.com/vishvananda/netlink $ go build neighlist $ ~/go/src/neighlist/neighlist Thanks to David Ahern for suggestions to improve this patch. Fixes: 0ff50e83 ("net: rtnetlink: bail out from rtnl_fdb_dump() on parse error") Fixes: 5e6d2435 ("bridge: netlink dump interface at par with brctl") Reported-by: NAidan Obley <aobley@pivotal.io> Signed-off-by: NMauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Shanthosh RK 提交于
Fixes the following Sparse warnings: net/bpfilter/bpfilter_kern.c:62:21: warning: cast removes address space of expression net/bpfilter/bpfilter_kern.c:101:49: warning: Using plain integer as NULL pointer Signed-off-by: NShanthosh RK <shanthosh.rk@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 05 10月, 2018 4 次提交
-
-
由 David Howells 提交于
Fix the rxrpc_data_ready() function to pick up all packets and to not miss any. There are two problems: (1) The sk_data_ready pointer on the UDP socket is set *after* it is bound. This means that it's open for business before we're ready to dequeue packets and there's a tiny window exists in which a packet can sneak onto the receive queue, but we never know about it. Fix this by setting the pointers on the socket prior to binding it. (2) skb_recv_udp() will return an error (such as ENETUNREACH) if there was an error on the transmission side, even though we set the sk_error_report hook. Because rxrpc_data_ready() returns immediately in such a case, it never actually removes its packet from the receive queue. Fix this by abstracting out the UDP dequeuing and checksumming into a separate function that keeps hammering on skb_recv_udp() until it returns -EAGAIN, passing the packets extracted to the remainder of the function. and two potential problems: (3) It might be possible in some circumstances or in the future for packets to be being added to the UDP receive queue whilst rxrpc is running consuming them, so the data_ready() handler might get called less often than once per packet. Allow for this by fully draining the queue on each call as (2). (4) If a packet fails the checksum check, the code currently returns after discarding the packet without checking for more. Allow for this by fully draining the queue on each call as (2). Fixes: 17926a79 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NPaolo Abeni <pabeni@redhat.com>
-
由 David Howells 提交于
Fix some refs to init_net that should've been changed to the appropriate network namespace. Fixes: 2baec2c3 ("rxrpc: Support network namespacing") Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NPaolo Abeni <pabeni@redhat.com>
-
由 Jianfeng Tan 提交于
When we use raw socket as the vhost backend, a packet from virito with gso offloading information, cannot be sent out in later validaton at xmit path, as we did not set correct skb->protocol which is further used for looking up the gso function. To fix this, we set this field according to virito hdr information. Fixes: e858fae2 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Signed-off-by: NJianfeng Tan <jianfeng.tan@linux.alibaba.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Flavio Leitner 提交于
Load the respective NAT helper module if the flow uses it. Signed-off-by: NFlavio Leitner <fbl@redhat.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 04 10月, 2018 1 次提交
-
-
由 Gustavo A. R. Silva 提交于
Replace "fallthru" with a proper "fall through" annotation. This fix is part of the ongoing efforts to enabling -Wimplicit-fallthrough Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 03 10月, 2018 3 次提交
-
-
由 Eric Dumazet 提交于
Caching ip_hdr(skb) before a call to pskb_may_pull() is buggy, do not do it. Fixes: 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull") Signed-off-by: NEric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Acked-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We have an impressive number of syzkaller bugs that are linked to the fact that syzbot was able to create a networking device with millions of TX (or RX) queues. Let's limit the number of RX/TX queues to 4096, this really should cover all known cases. A separate patch will add various cond_resched() in the loops handling sysfs entries at device creation and dismantle. Tested: lpaa6:~# ip link add gre-4097 numtxqueues 4097 numrxqueues 4097 type ip6gretap RTNETLINK answers: Invalid argument lpaa6:~# time ip link add gre-4096 numtxqueues 4096 numrxqueues 4096 type ip6gretap real 0m0.180s user 0m0.000s sys 0m0.107s Fixes: 76ff5cc9 ("rtnl: allow to specify number of rx and tx queues on device creation") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: Nsyzbot <syzkaller@googlegroups.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Timer handlers do not imply rcu_read_lock(), so my recent fix triggered a LOCKDEP warning when SYNACK is retransmit. Lets add rcu_read_lock()/rcu_read_unlock() pairs around ireq->ireq_opt usages instead of guessing what is done by callers, since it is not worth the pain. Get rid of ireq_opt_deref() helper since it hides the logic without real benefit, since it is now a standard rcu_dereference(). Fixes: 1ad98e9d ("tcp/dccp: fix lockdep issue when SYN is backlogged") Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NWillem de Bruijn <willemb@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-