- 22 2月, 2018 8 次提交
-
-
由 Eric Dumazet 提交于
Since all skbs in write/rtx queues have CHECKSUM_PARTIAL, we can remove dead code. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We no longer have skbs with skb->ip_summed == CHECKSUM_NONE in TCP write queues. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
We no longer have skbs with skb->ip_summed == CHECKSUM_NONE in TCP write queues. We can remove dead code in tcp_sendmsg(). Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Since TCP relies on GSO, we do not need this helper anymore. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
After previous commit, sk_can_gso() is always true. Signed-off-by: NEric Dumazet <edumazet@google.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eric Dumazet 提交于
Oleksandr Natalenko reported performance issues with BBR without FQ packet scheduler that were root caused to lack of SG and GSO/TSO on his configuration. In this mode, TCP internal pacing has to setup a high resolution timer for each MSS sent. We could implement in TCP a strategy similar to the one adopted in commit fefa569a ("net_sched: sch_fq: account for schedule/timers drifts") or decide to finally switch TCP stack to a GSO only mode. This has many benefits : 1) Most TCP developments are done with TSO in mind. 2) Less high-resolution timers needs to be armed for TCP-pacing 3) GSO can benefit of xmit_more hint 4) Receiver GRO is more effective (as if TSO was used for real on sender) -> Lower ACK traffic 5) Write queues have less overhead (one skb holds about 64KB of payload) 6) SACK coalescing just works. 7) rtx rb-tree contains less packets, SACK is cheaper. This patch implements the minimum patch, but we can remove some legacy code as follow ups. Tested: On 40Gbit link, one netperf -t TCP_STREAM BBR+fq: sg on: 26 Gbits/sec sg off: 15.7 Gbits/sec (was 2.3 Gbit before patch) BBR+pfifo_fast: sg on: 24.2 Gbits/sec sg off: 14.9 Gbits/sec (was 0.66 Gbit before patch !!! ) BBR+fq_codel: sg on: 24.4 Gbits/sec sg off: 15 Gbits/sec (was 0.66 Gbit before patch !!! ) Signed-off-by: NEric Dumazet <edumazet@google.com> Reported-by: NOleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Gustavo A. R. Silva 提交于
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1465362 ("Missing break in switch") Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Eyal Birger 提交于
The commit a new tc ematch for using netfilter xtable matches. This allows early classification as well as mirroning/redirecting traffic based on logic implemented in netfilter extensions. Current supported use case is classification based on the incoming IPSec state used during decpsulation using the 'policy' iptables extension (xt_policy). The module dynamically fetches the netfilter match module and calls it using a fake xt_action_param structure based on validated userspace provided parameters. As the xt_policy match does not access skb->data, no skb modifications are needed on match. Signed-off-by: NEyal Birger <eyal.birger@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 21 2月, 2018 5 次提交
-
-
由 Antonio Cardace 提交于
Use %*ph format to print small buffer as hex string. Suggested-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: NAntonio Cardace <anto.cardace@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Arkadi Sharshevsky 提交于
Currently the size validation is done via a cb, which is unneeded. The size validation can be moved to core. The next patch will perform cleanup. Signed-off-by: NArkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: NJiri Pirko <jiri@mellanox.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
When llist_add() returns false, cleanup_net() hasn't made its llist_del_all(), while the work has already been scheduled by the first queuer. So, we may skip queue_work() in this case. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
This simplifies cleanup queueing and makes cleanup lists to use llist primitives. Since llist has its own cmpxchg() ordering, cleanup_list_lock is not more need. Also, struct llist_node is smaller, than struct list_head, so we save some bytes in struct net with this patch. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
We take net_mutex, when there are !async pernet_operations registered, and read locking of net_sem is not enough. But we may get rid of taking the mutex, and just change the logic to write lock net_sem in such cases. This obviously reduces the number of lock operations, we do. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 20 2月, 2018 18 次提交
-
-
由 Paolo Abeni 提交于
syzbot reported a scheduling while atomic issue at netns destruction time: BUG: sleeping function called from invalid context at net/core/sock.c:2769 in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3 5 locks held by kworker/u4:3/85: #0: ((wq_completion)"%s""netns"){+.+.}, at: [<00000000c9792deb>] process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084 #1: (net_cleanup_work){+.+.}, at: [<00000000adc12e2a>] process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088 #2: (net_sem){++++}, at: [<000000009ccb5669>] cleanup_net+0x23f/0xd20 net/core/net_namespace.c:494 #3: (net_mutex){+.+.}, at: [<00000000a92767d9>] cleanup_net+0xa7d/0xd20 net/core/net_namespace.c:496 #4: (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>] spin_lock_bh include/linux/spinlock.h:315 [inline] #4: (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>] tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685 CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128 __might_sleep+0x95/0x190 kernel/sched/core.c:6081 lock_sock_nested+0x37/0x110 net/core/sock.c:2769 lock_sock include/net/sock.h:1463 [inline] tipc_release+0x103/0xff0 net/tipc/socket.c:572 sock_release+0x8d/0x1e0 net/socket.c:594 tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696 tipc_exit_net+0x15/0x40 net/tipc/core.c:96 ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148 cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529 process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113 worker_thread+0x223/0x1990 kernel/workqueue.c:2247 kthread+0x33c/0x400 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429 This is caused by tipc_topsrv_stop() releasing the listener socket with the idr lock held. This changeset addresses the issue moving the release operation outside such lock. Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com Fixes: 0ef897be ("tipc: separate topology server listener socket from subcsriber sockets") Signed-off-by: NPaolo Abeni <pabeni@redhat.com> Acked-by: ///jon Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Jon Maloy 提交于
In commit cc1ea9ffadf7 ("tipc: eliminate struct tipc_subscriber") we re-introduced an old bug on the error path in the function tipc_topsrv_kern_subscr(). We now re-introduce the correction too. Reported-by: syzbot+f62e0f2a0ef578703946@syzkaller.appspotmail.com Signed-off-by: NJon Maloy <jon.maloy@ericsson.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations register and unregister net::ipv4.iptable_filter table. Since there are no packets in-flight at the time of exit method is working, iptables rules should not be touched. Also, pernet_operations should not send ipv4 packets each other. So, it's safe to mark them async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
ip_tables_net_ops and udplite6_net_ops create and destroy /proc entries. xt_net_ops does nothing. So, we are able to mark them async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
Exit methods calls inet_frags_exit_net() with global ip6_frags as argument. So, after we make the pernet_operations async, a pair of exit methods may be called to iterate this hash table. Since there is inet_frag_worker(), which already may work in parallel with inet_frags_exit_net(), and it can make the same cleanup, that inet_frags_exit_net() does, it's safe. So we may mark these pernet_operations as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations register and unregister tables and lists for packets forwarding. All of the entities are per-net. Init methods makes simple initializations, and since net is not visible for foreigners at the time it is working, it can't race with anything. Exit method is executed when there are only local devices, and there mustn't be packets in-flight. Also, it looks like no one pernet_operations want to send ipv6 packets to foreign net. The same reasons are for ipv6_addr_label_ops and ip6_segments_ops. So, we are able to mark all them as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create sysctl tables and initialize net::xfrm.xfrm6_dst_ops used for routing. It doesn't look like another pernet_operations send ipv6 packets to foreign net namespaces, so it should be safe to mark the pernet_operations as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create and destroy /proc entries. ip6_fl_purge() makes almost the same actions as timer ip6_fl_gc_timer does, and as it can be executed in parallel with ip6_fl_purge(), two parallel ip6_fl_purge() may be executed. So, we can mark it async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations only register and unregister /proc entries, so it's possible to mark them async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create and destroy sysctl tables. They are not touched by another net pernet_operations. So, it's possible to execute them in parallel with others. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create and destroy net::ipv6.tcp_sk socket, which is used in tcp_v6_send_response() only. It looks like foreign pernet_operations don't want to set ipv6 connection inside destroyed net, so this socket may be created in destroyed in parallel with anything else. inet_twsk_purge() is also safe for that, as described in patch for tcp_sk_ops. So, it's possible to mark them as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations register and unregister net::ipv6.fib6_rules_ops, which are used for routing. It looks like there are no pernet_operations, which send ipv6 packages to another net, so we are able to mark them as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
net->ipv6.peers is dereferenced in three places via inet_getpeer_v6(), and it's used to handle skb. All the users of inet_getpeer_v6() do not look like be able to be called from foreign net pernet_operations, so we may mark them as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
net: Convert raw6_net_ops, udplite6_net_ops, ipv6_proc_ops, if6_proc_net_ops and ip6_route_net_late_ops These pernet_operations create and destroy /proc entries and safely may be converted and safely may be mark as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create and destroy net::ipv6.icmp_sk socket, used to send ICMP or error reply. Nobody can dereference the socket to handle a packet before net is initialized, as there is no routing; nobody can do that in parallel with exit, as all of devices are moved to init_net or destroyed and there are no packets it-flight. So, it's possible to mark these pernet_operations as async. The same for ndisc_net_ops and for igmp6_net_ops. The last one also creates and destroys /proc entries. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
These pernet_operations create and destroy /proc entries, populate and depopulate net::rules_ops and multiroute table. All the structures are pernet, and they are not touched by foreign net pernet_operations. So, it's possible to mark them async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
This patch finishes converting pernet_operations registered in net/wireless directory. These pernet_operations have only exit method, which moves devices to init_net. This action is not pernet_operations-specific, and function cfg80211_switch_netns() may be called all time during the system life. All necessary protection against concurrent cfg80211_pernet_exit() is made by rtnl_lock(). So, cfg80211_pernet_ops is able to be marked as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Kirill Tkhai 提交于
init method initializes sysctl defaults, allocates percpu arrays and creates /proc entries. exit method reverts the above. There are no pernet_operations, which are interested in the above entities of foreign net namespace, so inet6_net_ops are able to be marked as async. Signed-off-by: NKirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
- 17 2月, 2018 9 次提交
-
-
由 David Ahern 提交于
Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings if the index is actually set in the message. Signed-off-by: NDavid Ahern <dsahern@gmail.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Xin Long 提交于
Commit fb234035 ("sctp: remove the useless check in sctp_renege_events") forgot to remove another check for chunk in sctp_renege_events. Dan found this when doing a static check. This patch is to remove that check, and also to merge two checks into one 'if statement'. Fixes: fb234035 ("sctp: remove the useless check in sctp_renege_events") Reported-by: NDan Carpenter <dan.carpenter@oracle.com> Signed-off-by: NXin Long <lucien.xin@gmail.com> Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: NNeil Horman <nhorman@tuxdriver.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 David Howells 提交于
Due to a check recently added to copy_to_user(), it's now not permitted to copy from slab-held data to userspace unless the slab is whitelisted. This affects rxrpc_recvmsg() when it attempts to place an RXRPC_USER_CALL_ID control message in the userspace control message buffer. A warning is generated by usercopy_warn() because the source is the copy of the user_call_ID retained in the rxrpc_call struct. Work around the issue by copying the user_call_ID to a variable on the stack and passing that to put_cmsg(). The warning generated looks like: Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'dmaengine-unmap-128' (offset 680, size 8)! WARNING: CPU: 0 PID: 1401 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0 ... RIP: 0010:usercopy_warn+0x7e/0xa0 ... Call Trace: __check_object_size+0x9c/0x1a0 put_cmsg+0x98/0x120 rxrpc_recvmsg+0x6fc/0x1010 [rxrpc] ? finish_wait+0x80/0x80 ___sys_recvmsg+0xf8/0x240 ? __clear_rsb+0x25/0x3d ? __clear_rsb+0x15/0x3d ? __clear_rsb+0x25/0x3d ? __clear_rsb+0x15/0x3d ? __clear_rsb+0x25/0x3d ? __clear_rsb+0x15/0x3d ? __clear_rsb+0x25/0x3d ? __clear_rsb+0x15/0x3d ? finish_task_switch+0xa6/0x2b0 ? trace_hardirqs_on_caller+0xed/0x180 ? _raw_spin_unlock_irq+0x29/0x40 ? __sys_recvmsg+0x4e/0x90 __sys_recvmsg+0x4e/0x90 do_syscall_64+0x7a/0x220 entry_SYSCALL_64_after_hwframe+0x26/0x9b Reported-by: NJonathan Billings <jsbillings@jsbillings.org> Signed-off-by: NDavid Howells <dhowells@redhat.com> Acked-by: NKees Cook <keescook@chromium.org> Tested-by: NJonathan Billings <jsbillings@jsbillings.org> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for TC mirred action. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack handling for a common used TC act function "tcf_generic_walker()" to add an extack message on failures. The tcf_generic_walker() function can fail if get a invalid command different than DEL and GET. The naming "action" here is wrong, the correct naming would be command. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for act walker callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for act lookup callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for act init callback api. This prepares to handle extack support inside each specific act implementation. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-
由 Alexander Aring 提交于
This patch adds extack support for generic act handling. The extack will be set deeper to each called function which is not part of netdev core api. Based on work by David Ahern <dsahern@gmail.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: NAlexander Aring <aring@mojatatu.com> Signed-off-by: NDavid S. Miller <davem@davemloft.net>
-