Commits · 1d870162418a826905161d2276c912986d3b9d9a · openeuler / Kernel

27 Feb, 2018 2 commits

mac80211: support fast-rx with incompatible PS capabilities when PS is disabled · 1d870162

Committed by Felix Fietkau on Feb 23, 2018

When powersave is disabled for the interface, we can do fast-rx anyway.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
[fixed indentation on one line]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: support AP 4-addr mode fast-rx · 59cae5b9

Committed by Felix Fietkau on Feb 23, 2018

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

24 Feb, 2018 3 commits

net: fib_rules: Add new attribute to set protocol · 1b71af60

Committed by Donald Sharp on Feb 23, 2018

For ages iproute2 has used `struct rtmsg` as the ancillary header for
FIB rules and in the process set the protocol value to RTPROT_BOOT.
Until cac56209a66 ("net: Allow a rule to track originating protocol")
the kernel rules code ignored the protocol value sent from userspace
and always returned 0 in notifications. To avoid incompatibility with
existing iproute2, send the protocol as a new attribute.

Fixes: cac56209 ("net: Allow a rule to track originating protocol")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: gen_estimator: fix broken estimators based on percpu stats · a5f7add3

Committed by Eric Dumazet on Feb 22, 2018

pfifo_fast got percpu stats lately, uncovering a bug I introduced last
year in linux-4.10.

I missed the fact that we have to clear our temporary storage
before calling __gnet_stats_copy_basic() in the case of percpu stats.

Without this fix, rate estimators (tc qd replace dev xxx root est 1sec
4sec pfifo_fast) are utterly broken.

Fixes: 1c0d32fd ("net_sched: gen_estimator: complete rewrite of rate estimators")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
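
The bug class described in this commit can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `struct basic_stats`, `NCPUS`, `copy_basic_percpu()` and `fetch_counters()` are hypothetical names, and the sketch assumes the per-CPU copy accumulates with `+=`, which is why the temporary has to be cleared first.

```c
#include <stdint.h>
#include <string.h>

#define NCPUS 4 /* hypothetical CPU count for the sketch */

struct basic_stats {
	uint64_t bytes;
	uint64_t packets;
};

/* Accumulates per-CPU counters into *dst. Note the "+=": dst must start at zero. */
static void copy_basic_percpu(struct basic_stats *dst,
			      const struct basic_stats percpu[NCPUS])
{
	for (int i = 0; i < NCPUS; i++) {
		dst->bytes += percpu[i].bytes;
		dst->packets += percpu[i].packets;
	}
}

/* Estimator-style fetch: clear the temporary before accumulating into it. */
static struct basic_stats fetch_counters(const struct basic_stats percpu[NCPUS])
{
	struct basic_stats tmp;

	memset(&tmp, 0, sizeof(tmp)); /* without this, tmp holds stack garbage */
	copy_basic_percpu(&tmp, percpu);
	return tmp;
}
```

Dropping the `memset()` leaves the sums offset by whatever happened to be in `tmp`, which is consistent with the "utterly broken" rate estimates the message reports.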

rds: rds_msg_zcopy should return error of null rm->data.op_mmp_znotifier · 79a5b972

Committed by Sowmini Varadhan on Feb 22, 2018

If either or both of MSG_ZEROCOPY and SOCK_ZEROCOPY have not been
specified, the rm->data.op_mmp_znotifier allocation will be skipped.
In this case, it is invalid to pass down a cmsghdr with
RDS_CMSG_ZCOPY_COOKIE, so return EINVAL from rds_msg_zcopy for this
case.

Reported-by: syzbot+f893ae7bb2f6456dfbc3@syzkaller.appspotmail.com
Fixes: 0cebacce ("rds: zerocopy Tx support.")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Feb, 2018 8 commits

ipv6 sit: work around bogus gcc-8 -Wrestrict warning · ca79bec2

Committed by Arnd Bergmann on Feb 22, 2018

gcc-8 has a new warning that detects overlapping input and output arguments
in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
which is actually correct:

net/ipv6/sit.c: In function 'sit_init_net':
net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]

The problem here is that the logic detecting the memcpy() arguments finds them
to be the same, but the conditional that tests for the input and output of
ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.

We know that netdev_priv(t->dev) is the same as t for a tunnel device,
and comparing "dev" directly here lets the compiler figure out as well
that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
it no longer warns.

This code is old, so Cc stable to make sure that we don't get the warning
for older kernels built with new gcc.

Cc: Martin Sebor <msebor@gmail.com>
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Fix send in rxrpc_send_data_packet() · 93c62c45

Committed by David Howells on Feb 22, 2018

All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
both parts of the iov[] buffer, but one of them does not.  Fix it so that
it does.

Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
header included, but no payload.

Fixes: 5a924b89 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: Call mgd_prep_tx before transmitting deauthentication · 94ba9271

Committed by Ilan Peer on Feb 19, 2018

In multi channel scenarios, when disassociating from the AP before a
beacon was heard from the AP, it is not guaranteed that the virtual
interface is granted air time for the transmission of the
deauthentication frame. This in turn can lead to various issues as
the AP might never get the deauthentication frame.

To mitigate such possible issues, add a HW flag indicating that the
driver requires mac80211 to call the mgd_prep_tx() driver callback
to make sure that the virtual interface is granted immediate airtime
to be able to transmit the frame, in case that no beacon was heard
from the AP.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: add get TID helper · a1f2ba04

Committed by Sara Sharon on Feb 19, 2018

Extracting the TID from the QoS header is common enough
to justify a helper.

Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
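
As a rough idea of what such a helper computes, here is a userspace sketch. It assumes the TID occupies the low four bits of the two-byte, little-endian QoS Control field; `get_tid()` and `QOS_CTL_TID_MASK` are illustrative names, and the actual helper's name and signature are not shown in this log.

```c
#include <stdint.h>

#define QOS_CTL_TID_MASK 0x0f /* assumed: TID is the low four bits */

/* qos_ctl points at the two-byte, little-endian QoS Control field. */
static inline uint8_t get_tid(const uint8_t *qos_ctl)
{
	return qos_ctl[0] & QOS_CTL_TID_MASK;
}
```

Centralizing the mask in one place means callers no longer repeat the pointer arithmetic and bit masking at every use.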

mac80211: support reporting A-MPDU EOF bit value/known · 7299d6f7

Committed by Johannes Berg on Feb 19, 2018

Support getting the EOF bit value reported from hardware
and writing it out to radiotap.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>

net: ipv4: Set addr_type in hash_keys for forwarded case · 1fe4b118

Committed by David Ahern on Feb 21, 2018

The result of the skb flow dissect is copied from keys to hash_keys to
ensure only the intended data is hashed. The original L4 hash patch
overlooked setting the addr_type for this case; add it.

Fixes: bf4e0a3d ("net: ipv4: add support for ECMP hash policy choice")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp_bbr: better deal with suboptimal GSO · 350c9f48

Committed by Eric Dumazet on Feb 21, 2018

BBR uses tcp_tso_autosize() in an attempt to probe what would be the
burst sizes and to adjust cwnd in bbr_target_cwnd() with following
gold formula :

/* Allow enough full-sized skbs in flight to utilize end systems. */
cwnd += 3 * bbr->tso_segs_goal;

But GSO can be lacking or be constrained to very small
units (ip link set dev ... gso_max_segs 2)

What we really want is to have enough packets in flight so that both
GSO and GRO are efficient.

So in the case GSO is off or downgraded, we still want to have the same
number of packets in flight as if GSO/TSO was fully operational, so
that GRO can hopefully be working efficiently.

To fix this issue, we make tcp_tso_autosize() unaware of
sk->sk_gso_max_segs

Only tcp_tso_segs() has to enforce the gso_max_segs limit.

Tested:

ethtool -K eth0 tso off gso off
tc qd replace dev eth0 root pfifo_fast

Before patch:
for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
    691  (ss -temoi shows cwnd is stuck around 6 )
    667
    651
    631
    517

After patch :
# for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
   1733 (ss -temoi shows cwnd is around 386 )
   1778
   1746
   1781
   1718

Fixes: 0f8782ea ("tcp_bbr: add BBR congestion control")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
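
The split of responsibilities described in this commit might look roughly like the following userspace sketch. All names and parameters (`autosize_segs()`, `tso_segs()`, `pacing_bytes`) are simplified, hypothetical stand-ins; the real tcp_tso_autosize() and tcp_tso_segs() take a socket and consult more state than is modeled here.

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Burst size the pacing budget would allow, ignoring any gso_max_segs limit. */
static uint32_t autosize_segs(uint32_t pacing_bytes, uint32_t mss,
			      uint32_t min_segs)
{
	return max_u32(pacing_bytes / mss, min_segs);
}

/* Only the transmit-side helper clamps to the device limit. */
static uint32_t tso_segs(uint32_t pacing_bytes, uint32_t mss,
			 uint32_t min_segs, uint32_t gso_max_segs)
{
	return min_u32(autosize_segs(pacing_bytes, mss, min_segs), gso_max_segs);
}
```

With a device limited to two segments, a congestion control module that sizes cwnd from the first function still sees the larger goal, while the second function keeps transmitted skbs within the device limit.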

netlink: put module reference if dump start fails · b87b6194

Committed by Jason A. Donenfeld on Feb 21, 2018

Before, if cb->start() failed, the module reference would never be put,
because cb->cb_running is intentionally false at this point. Users are
generally annoyed by this because they can no longer unload modules that
leak references. Also, it may be possible to tediously wrap a reference
counter back to zero, especially since module.c still uses atomic_inc
instead of refcount_inc.

This patch expands the error path to simply call module_put if
cb->start() fails.

Fixes: 41c87425 ("netlink: do not set cb_running if dump's start() errs")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
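
The error path described in this commit can be illustrated with a toy reference counter. This is a userspace sketch, not the netlink code: `struct module`, `struct dump_cb`, `dump_start()` and `failing_start()` are hypothetical stand-ins chosen to mirror the wording of the message.

```c
struct module {
	int refcnt; /* toy reference counter */
};

struct dump_cb {
	struct module *module;
	int (*start)(struct dump_cb *cb);
	int running; /* mirrors cb_running: set only after a successful start */
};

static void module_get(struct module *m) { m->refcnt++; }
static void module_put(struct module *m) { m->refcnt--; }

static int dump_start(struct dump_cb *cb)
{
	int ret = 0;

	module_get(cb->module);
	if (cb->start)
		ret = cb->start(cb);
	if (ret) {
		/* running stays 0, so no later cleanup would drop the reference */
		module_put(cb->module);
		return ret;
	}
	cb->running = 1;
	return 0;
}

/* A start callback that always fails, used to exercise the error path. */
static int failing_start(struct dump_cb *cb)
{
	(void)cb;
	return -1;
}
```

Without the `module_put()` in the error branch, every failed start would leave the counter one higher, which is the leak the message describes.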

22 Feb, 2018 10 commits

bpf: clean up unused-variable warning · a7dcdf6e

Committed by Arnd Bergmann on Feb 20, 2018

The only user of this variable is inside of an #ifdef, causing
a warning without CONFIG_INET:

net/core/filter.c: In function '____bpf_sock_ops_cb_flags_set':
net/core/filter.c:3382:6: error: unused variable 'val' [-Werror=unused-variable]
  int val = argval & BPF_SOCK_OPS_ALL_CB_FLAGS;

This replaces the #ifdef with a nicer IS_ENABLED() check that
makes the code more readable and avoids the warning.

Fixes: b13d8807 ("bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
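
Why a compile-time condition in plain C avoids the warning can be shown with a small userspace sketch. `SKETCH_IS_ENABLED()` is a deliberately simplified stand-in for the kernel's IS_ENABLED() (it requires the option to be defined as 0 or 1), and `SKETCH_CONFIG_INET`, `SOCK_OPS_ALL_CB_FLAGS` and `cb_flags_set()` are hypothetical names.

```c
#define SKETCH_CONFIG_INET 0           /* pretend the option is disabled */
#define SKETCH_IS_ENABLED(opt) (opt)   /* simplified stand-in, not the kernel macro */

#define SOCK_OPS_ALL_CB_FLAGS 0x7      /* hypothetical flag mask */

static int cb_flags_set(int *stored_flags, int argval)
{
	int val = argval & SOCK_OPS_ALL_CB_FLAGS;

	/*
	 * The condition is a compile-time constant, so the compiler can drop
	 * the dead branch, yet 'val' is still referenced in ordinary C code
	 * and is therefore not reported as unused in either configuration.
	 */
	if (!SKETCH_IS_ENABLED(SKETCH_CONFIG_INET))
		return -1;

	*stored_flags = val;
	return argval & ~SOCK_OPS_ALL_CB_FLAGS;
}
```

With an `#ifdef` around the assignment instead, the preprocessor removes the only use of `val` before the compiler ever sees it, which is what triggers the warning quoted above.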

net: Allow a rule to track originating protocol · cac56209

Committed by Donald Sharp on Feb 20, 2018

Allow a rule that is being added/deleted/modified or
dumped to contain the originating protocol's id.

The protocol is handled just like a route's originating
protocol is.  This is especially useful because there
is starting to be a plethora of different user space
programs adding rules.

Allow the vrf device to specify that the kernel is the originator
of the rule created for this device.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove dead code after CHECKSUM_PARTIAL adoption · 98be9b12

Committed by Eric Dumazet on Feb 19, 2018

Since all skbs in write/rtx queues have CHECKSUM_PARTIAL,
we can remove dead code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove dead code from tcp_set_skb_tso_segs() · 4a64fd6c

Committed by Eric Dumazet on Feb 19, 2018

We no longer have skbs with skb->ip_summed == CHECKSUM_NONE
in TCP write queues.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: tcp_sendmsg() only deals with CHECKSUM_PARTIAL · 65ec6097

Committed by Eric Dumazet on Feb 19, 2018

We no longer have skbs with skb->ip_summed == CHECKSUM_NONE
in TCP write queues.

We can remove dead code in tcp_sendmsg().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove sk_check_csum_caps() · dead7cdb

Committed by Eric Dumazet on Feb 19, 2018

Since TCP relies on GSO, we do not need this helper anymore.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: remove sk_can_gso() use · 74d4a8f8

Committed by Eric Dumazet on Feb 19, 2018

After previous commit, sk_can_gso() is always true.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: switch to GSO being always on · 0a6b2a1d

Committed by Eric Dumazet on Feb 19, 2018

Oleksandr Natalenko reported performance issues with BBR without FQ
packet scheduler that were root caused to lack of SG and GSO/TSO on
his configuration.

In this mode, TCP internal pacing has to setup a high resolution timer
for each MSS sent.

We could implement in TCP a strategy similar to the one adopted
in commit fefa569a ("net_sched: sch_fq: account for schedule/timers drifts")
or decide to finally switch TCP stack to a GSO only mode.

This has many benefits:

1) Most TCP developments are done with TSO in mind.
2) Fewer high-resolution timers need to be armed for TCP pacing.
3) GSO can benefit from the xmit_more hint.
4) Receiver GRO is more effective (as if TSO was used for real on sender)
   -> Lower ACK traffic
5) Write queues have less overhead (one skb holds about 64KB of payload)
6) SACK coalescing just works.
7) rtx rb-tree contains fewer packets, SACK is cheaper.

This patch implements the minimum patch, but we can remove some legacy
code as follow-ups.

Tested:

On 40Gbit link, one netperf -t TCP_STREAM

BBR+fq:
sg on:  26 Gbits/sec
sg off: 15.7 Gbits/sec   (was 2.3 Gbit before patch)

BBR+pfifo_fast:
sg on:  24.2 Gbits/sec
sg off: 14.9 Gbits/sec  (was 0.66 Gbit before patch !!! )

BBR+fq_codel:
sg on:  24.4 Gbits/sec
sg off: 15 Gbits/sec  (was 0.66 Gbit before patch !!! )

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: David S. Miller <davem@davemloft.net>

rds: send: mark expected switch fall-through in rds_rm_size · f9053113

Committed by Gustavo A. R. Silva on Feb 19, 2018

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 1465362 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
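
For readers unfamiliar with the convention, an intentional fall-through can be marked with a comment that GCC recognizes when -Wimplicit-fallthrough is enabled. The sketch below is generic; `enum cmsg_kind` and `extra_size()` are made-up names, and the exact marker used in rds_rm_size() is not shown in this log.

```c
enum cmsg_kind { CMSG_A, CMSG_B, CMSG_C };

/* Returns a size that depends on the kind; CMSG_A deliberately includes CMSG_B's share. */
static int extra_size(enum cmsg_kind kind)
{
	int size = 0;

	switch (kind) {
	case CMSG_A:
		size += 8;
		/* fall through */
	case CMSG_B:
		size += 4;
		break;
	case CMSG_C:
		break;
	}
	return size;
}
```

The comment changes nothing at run time; it only tells the compiler and static analyzers that the missing `break` is deliberate.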

net: sched: add em_ipt ematch for calling xtables matches · ccc007e4

Committed by Eyal Birger on Feb 15, 2018

This commit adds a new tc ematch for using netfilter xtable matches.

This allows early classification as well as mirroring/redirecting traffic
based on logic implemented in netfilter extensions.

Current supported use case is classification based on the incoming IPSec
state used during decapsulation using the 'policy' iptables extension
(xt_policy).

The module dynamically fetches the netfilter match module and calls
it using a fake xt_action_param structure based on validated userspace
provided parameters.

As the xt_policy match does not access skb->data, no skb modifications
are needed on match.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Feb, 2018 6 commits

net: sched: report if filter is too large to dump · 5ae437ad

Committed by Roman Kapl on Feb 19, 2018

So far, if the filter was too large to fit in the allocated skb, the
kernel did not return any error and stopped dumping. Modify the dumper
so that it returns -EMSGSIZE when a filter fails to dump and it is the
first filter in the skb. If we are not first, we will get a next chance
with more room.

I understand this is pretty near to being an API change, but the
original design (silent truncation) can be considered a bug.

Note: The error case can happen pretty easily if you create a filter
with 32 actions and have 4kb pages. Also recent versions of iproute try
to be clever with their buffer allocation size, which in turn leads to

Signed-off-by: Roman Kapl <code@rkapl.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
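
The rule described in this commit (fail only when the first item of a fresh buffer does not fit) can be sketched as a generic dump loop. This is an illustration with made-up names (`dump_items()`, `sizes`, `room`), not the tc filter dumper itself.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Dump items starting at index *pos into a buffer with 'room' bytes.
 * Returns the number of bytes used, or -EMSGSIZE if the very first
 * item of this call cannot fit even into the empty buffer.
 */
static int dump_items(const size_t *sizes, size_t n, size_t *pos, size_t room)
{
	size_t used = 0;
	size_t start = *pos;

	while (*pos < n) {
		if (sizes[*pos] > room - used) {
			if (*pos == start)
				return -EMSGSIZE; /* retrying with the same room cannot help */
			break; /* a later item gets another chance in the next buffer */
		}
		used += sizes[*pos];
		(*pos)++;
	}
	return (int)used;
}
```

Returning an error only for the first item keeps multi-part dumps working as before, while an oversized item is reported instead of silently ending the dump.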

x25: use %*ph to print small buffer · 3fef2b62

Committed by Antonio Cardace on Feb 19, 2018

Use %*ph format to print small buffer as hex string.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Antonio Cardace <anto.cardace@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Move size validation to core · cc944ead

Committed by Arkadi Sharshevsky on Feb 20, 2018

Currently the size validation is done via a cb, which is unneeded. The
size validation can be moved to core. The next patch will perform cleanup.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Queue net_cleanup_work only if there is first net added · 8349efd9

Committed by Kirill Tkhai on Feb 19, 2018

When llist_add() returns false, cleanup_net() hasn't made its
llist_del_all(), while the work has already been scheduled
by the first queuer. So, we may skip queue_work() in this case.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
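
The "only the first adder queues the work" idea can be sketched with C11 atomics. This is a userspace approximation, not the kernel llist implementation: `struct node`, `struct lhead`, `list_add_first()` and `schedule_cleanup()` are hypothetical names, and `works_queued` stands in for calls to queue_work().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct lhead {
	_Atomic(struct node *) first;
};

/* Lock-free push; returns true if the list was empty before this add. */
static bool list_add_first(struct node *n, struct lhead *h)
{
	struct node *first = atomic_load(&h->first);

	do {
		n->next = first;
	} while (!atomic_compare_exchange_weak(&h->first, &first, n));

	return first == NULL;
}

static int works_queued; /* counts simulated queue_work() calls */

static void schedule_cleanup(struct node *n, struct lhead *h)
{
	/* Only the adder that found the list empty needs to queue the work. */
	if (list_add_first(n, h))
		works_queued++;
}
```

Any later adder sees a non-empty list, which means a worker is already pending and will pick up the new entry when it drains the list.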

net: Make cleanup_list and net::cleanup_list of llist type · 65b7b5b9

Committed by Kirill Tkhai on Feb 19, 2018

This simplifies cleanup queueing and makes cleanup lists
use llist primitives. Since llist has its own cmpxchg()
ordering, cleanup_list_lock is no longer needed.

Also, struct llist_node is smaller than struct list_head,
so we save some bytes in struct net with this patch.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Kill net_mutex · 19efbd93

Committed by Kirill Tkhai on Feb 19, 2018

We take net_mutex when there are !async pernet_operations
registered, and read locking of net_sem is not enough. But
we may get rid of taking the mutex, and just change the logic
to write lock net_sem in such cases. This obviously reduces
the number of lock operations we do.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Feb, 2018 11 commits

tipc: don't call sock_release() in atomic context · 26736a08

Committed by Paolo Abeni on Feb 19, 2018

syzbot reported a scheduling while atomic issue at netns
destruction time:

BUG: sleeping function called from invalid context at net/core/sock.c:2769
in_atomic(): 1, irqs_disabled(): 0, pid: 85, name: kworker/u4:3
5 locks held by kworker/u4:3/85:
  #0:  ((wq_completion)"%s""netns"){+.+.}, at: [<00000000c9792deb>]
process_one_work+0xaaf/0x1af0 kernel/workqueue.c:2084
  #1:  (net_cleanup_work){+.+.}, at: [<00000000adc12e2a>]
process_one_work+0xb01/0x1af0 kernel/workqueue.c:2088
  #2:  (net_sem){++++}, at: [<000000009ccb5669>] cleanup_net+0x23f/0xd20
net/core/net_namespace.c:494
  #3:  (net_mutex){+.+.}, at: [<00000000a92767d9>] cleanup_net+0xa7d/0xd20
net/core/net_namespace.c:496
  #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
spin_lock_bh include/linux/spinlock.h:315 [inline]
  #4:  (&(&srv->idr_lock)->rlock){+...}, at: [<000000001343e568>]
tipc_topsrv_stop+0x231/0x610 net/tipc/topsrv.c:685
CPU: 0 PID: 85 Comm: kworker/u4:3 Not tainted 4.16.0-rc1+ #230
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x257 lib/dump_stack.c:53
  ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6128
  __might_sleep+0x95/0x190 kernel/sched/core.c:6081
  lock_sock_nested+0x37/0x110 net/core/sock.c:2769
  lock_sock include/net/sock.h:1463 [inline]
  tipc_release+0x103/0xff0 net/tipc/socket.c:572
  sock_release+0x8d/0x1e0 net/socket.c:594
  tipc_topsrv_stop+0x3c0/0x610 net/tipc/topsrv.c:696
  tipc_exit_net+0x15/0x40 net/tipc/core.c:96
  ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:148
  cleanup_net+0x6ba/0xd20 net/core/net_namespace.c:529
  process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429

This is caused by tipc_topsrv_stop() releasing the listener socket
with the idr lock held. This changeset addresses the issue moving
the release operation outside such lock.

Reported-and-tested-by: syzbot+749d9d87c294c00ca856@syzkaller.appspotmail.com
Fixes: 0ef897be ("tipc: separate topology server listener socket from subcsriber sockets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by:  ///jon
Signed-off-by: David S. Miller <davem@davemloft.net>
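
The shape of the fix (detach the object under the lock, release it after unlocking) can be shown with a small simulation. Nothing here is TIPC code: `struct server`, `srv_lock()`, `release_listener()` and `server_stop()` are hypothetical, and the `lock_held` flag merely models a spinlock so the ordering can be checked.

```c
#include <stddef.h>

struct server {
	int lock_held;  /* stands in for a spinlock held in atomic context */
	void *listener; /* stands in for the listener socket */
};

static int releases;            /* how many times the listener was released */
static int released_under_lock; /* set if a release happened with the lock held */

static void srv_lock(struct server *s)   { s->lock_held = 1; }
static void srv_unlock(struct server *s) { s->lock_held = 0; }

/* May sleep in the real code, so it must not run while the lock is held. */
static void release_listener(struct server *s, void *sock)
{
	(void)sock;
	if (s->lock_held)
		released_under_lock = 1;
	releases++;
}

static void server_stop(struct server *s)
{
	void *lsock;

	srv_lock(s);
	lsock = s->listener; /* detach the socket while protected by the lock */
	s->listener = NULL;
	srv_unlock(s);

	if (lsock)
		release_listener(s, lsock); /* release only after dropping the lock */
}
```

Clearing the pointer under the lock keeps other users from seeing a socket that is about to go away, while the potentially sleeping release runs in a context where sleeping is allowed.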

tipc: fix bug on error path in tipc_topsrv_kern_subscr() · 96c252bf

Committed by Jon Maloy on Feb 19, 2018

In commit cc1ea9ffadf7 ("tipc: eliminate struct tipc_subscriber") we
re-introduced an old bug on the error path in the function
tipc_topsrv_kern_subscr(). We now re-introduce the correction too.

Reported-by: syzbot+f62e0f2a0ef578703946@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert iptable_filter_net_ops · da349fad

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations register and unregister
net::ipv4.iptable_filter table. Since there are
no packets in-flight at the time the exit method
is working, iptables rules should not be touched.
Also, pernet_operations should not send ipv4
packets to each other. So, it's safe to mark them
async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert ip_tables_net_ops, udplite6_net_ops and xt_net_ops · 4d6b8076

Committed by Kirill Tkhai on Feb 19, 2018

ip_tables_net_ops and udplite6_net_ops create and destroy /proc entries.
xt_net_ops does nothing.

So, we are able to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert ip6_frags_ops · 5fc094f5

Committed by Kirill Tkhai on Feb 19, 2018

Exit methods call inet_frags_exit_net() with global ip6_frags
as argument. So, after we make the pernet_operations async,
a pair of exit methods may be called to iterate this hash table.
Since there is inet_frag_worker(), which already may work
in parallel with inet_frags_exit_net(), and it can make the same
cleanup that inet_frags_exit_net() does, it's safe. So we may
mark these pernet_operations as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert fib6_net_ops, ipv6_addr_label_ops and ip6_segments_ops · d16784d9

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations register and unregister tables
and lists for packets forwarding. All of the entities
are per-net. Init methods make simple initializations,
and since net is not visible for foreigners at the time
it is working, it can't race with anything. Exit method
is executed when there are only local devices, and there
mustn't be packets in-flight. Also, it looks like no
pernet_operations want to send ipv6 packets to foreign
net. The same reasons are for ipv6_addr_label_ops and
ip6_segments_ops. So, we are able to mark them all as
async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert xfrm6_net_ops · b4891413

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations create sysctl tables and
initialize net::xfrm.xfrm6_dst_ops used for routing.
It doesn't look like another pernet_operations send
ipv6 packets to foreign net namespaces, so it should
be safe to mark the pernet_operations as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert ip6_flowlabel_net_ops · a7852a76

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations create and destroy /proc entries.
ip6_fl_purge() makes almost the same actions as timer
ip6_fl_gc_timer does, and as it can be executed in parallel
with ip6_fl_purge(), two parallel ip6_fl_purge() may be
executed. So, we can mark it async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert ping_v6_net_ops · ac34cb6c

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations only register and unregister /proc
entries, so it's possible to mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert ipv6_sysctl_net_ops · 58708cae

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations create and destroy sysctl tables.
They are not touched by another net pernet_operations.
So, it's possible to execute them in parallel with others.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Convert tcpv6_net_ops · fef65a2c

Committed by Kirill Tkhai on Feb 19, 2018

These pernet_operations create and destroy net::ipv6.tcp_sk
socket, which is used in tcp_v6_send_response() only. It looks
like foreign pernet_operations don't want to set ipv6 connection
inside destroyed net, so this socket may be created and destroyed
in parallel with anything else. inet_twsk_purge() is also safe
for that, as described in patch for tcp_sk_ops. So, it's possible
to mark them as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
