Commits · 504a40113cc4f329dd75bbf6e4b060603224d814 · openeuler / Kernel

31 Mar, 2021 1 commit

ipv6: add ipv6_dev_find to stubs · 504a4011

Authored by Andreas Roeseler on Mar 29, 2021

Add ipv6_dev_find to ipv6_stub to allow lookup of net_devices by IPV6
address in net/ipv4/icmp.c.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Mar, 2021 1 commit

mld: convert ifmcaddr6 to RCU · 88e2ca30

Authored by Taehee Yoo on Mar 25, 2021

The ifmcaddr6 has been protected by inet6_dev->lock (an rwlock), so
the critical section runs in atomic context. To change this context,
the locking must change. The ifmcaddr6 is actually already protected by
RTNL, so if it is converted to use RCU, its control path context can be
switched to sleepable.

Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Feb, 2021 1 commit

net: remove cmsg restriction from io_uring based send/recvmsg calls · e5493796

Authored by Jens Axboe on Feb 17, 2021

No need to restrict these anymore, as the worker threads are direct
clones of the original task. Hence we know for a fact that we can
support anything that the regular task can.

Since the only user of proto_ops->flags was to flag PROTO_CMSG_DATA_ONLY,
kill the member and the flag definition too.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

03 Feb, 2021 1 commit

net: ipv6: Emit notification when fib hardware flags are changed · 907eea48

Authored by Amit Cohen on Feb 01, 2021

After installing a route to the kernel, user space receives an
acknowledgment, which means the route was installed in the kernel,
but not necessarily in hardware.

The asynchronous nature of route installation in hardware can lead
to a routing daemon advertising a route before it was actually installed in
hardware. This can result in packet loss or mis-routed packets until the
route is installed in hardware.

It is also possible for a route already installed in hardware to change
its action and therefore its flags. For example, a host route that is
trapping packets can be "promoted" to perform decapsulation following
the installation of an IPinIP/VXLAN tunnel.

Emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags
are changed. The aim is to provide an indication to user-space
(e.g., routing daemons) about the state of the route in hardware.

Introduce a sysctl that controls this behavior.

Keep the default value at 0 (i.e., do not emit notifications) for several
reasons:
- Multiple RTM_NEWROUTE notifications per route might confuse existing
  routing daemons.
- Convergence reasons in routing daemons.
- The extra notifications will negatively impact the insertion rate.
- Not all users are interested in these notifications.

Move fib6_info_hw_flags_set() to C file because it is no longer a short
function.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

28 Jan, 2021 1 commit

bpf: Allow rewriting to ports under ip_unprivileged_port_start · 77241217

Authored by Stanislav Fomichev on Jan 27, 2021

At the moment, BPF_CGROUP_INET{4,6}_BIND hooks can rewrite user_port
to the privileged ones (< ip_unprivileged_port_start), but it will
be rejected later on in the __inet_bind or __inet6_bind.

Let's add another return value to indicate that CAP_NET_BIND_SERVICE
check should be ignored. Use the same idea as we currently use
in cgroup/egress where bit #1 indicates CN. Instead, for
cgroup/bind{4,6}, bit #1 indicates that CAP_NET_BIND_SERVICE should
be bypassed.
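
The return-value convention described above can be modelled in a few lines. This is a minimal Python sketch rather than kernel code: the constant and function names are hypothetical, and it assumes that bit #0 means "allow" (as with other cgroup hooks) while bit #1 requests the CAP_NET_BIND_SERVICE bypass.

```python
# Illustrative model only; these names are hypothetical, not kernel symbols.
RET_ALLOW = 1 << 0            # assumed: bit #0 lets the bind() proceed
RET_BYPASS_BIND_CAP = 1 << 1  # bit #1 skips the CAP_NET_BIND_SERVICE check


def bind_allowed(prog_ret, port, unprivileged_port_start=1024, has_cap=False):
    """Return True if a bind to `port` would be permitted in this model."""
    if not prog_ret & RET_ALLOW:
        return False  # the hook rejected the bind outright
    if port >= unprivileged_port_start:
        return True   # unprivileged port, no capability needed
    # Privileged port: needs the capability unless the program asked to bypass it.
    return has_cap or bool(prog_ret & RET_BYPASS_BIND_CAP)
```

With this model, a program returning 3 (both bits set) can bind to port 80 without the capability, while a program returning 1 cannot.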

v5:
- rename flags to be less confusing (Andrey Ignatov)
- rework BPF_PROG_CGROUP_INET_EGRESS_RUN_ARRAY to work on flags
  and accept BPF_RET_SET_CN (no behavioral changes)

v4:
- Add missing IPv6 support (Martin KaFai Lau)

v3:
- Update description (Martin KaFai Lau)
- Fix capability restore in selftest (Martin KaFai Lau)

v2:
- Switch to explicit return code (Martin KaFai Lau)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20210127193140.3170382-1-sdf@google.com

21 Jan, 2021 1 commit

bpf: Split cgroup_bpf_enabled per attach type · a9ed15da

Authored by Stanislav Fomichev on Jan 15, 2021

When we attach any cgroup hook, the rest (even if unused/unattached) start
to contribute small overhead. In particular, the one we want to avoid is
__cgroup_bpf_run_filter_skb which does two redirections to get to
the cgroup and pushes/pulls skb.

Let's split cgroup_bpf_enabled to be per-attach to make sure
only used attach types trigger.

I've dropped some existing high-level cgroup_bpf_enabled in some
places because BPF_PROG_CGROUP_XXX_RUN macros usually have another
cgroup_bpf_enabled check.

I also had to copy-paste BPF_CGROUP_RUN_SA_PROG_LOCK for
GETPEERNAME/GETSOCKNAME because type for cgroup_bpf_enabled[type]
has to be constant and known at compile time.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210115163501.805133-4-sdf@google.com

03 Dec, 2020 1 commit

bpf: Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks · 427167c0

Authored by Stanislav Fomichev on Dec 02, 2020

I have to now lock/unlock socket for the bind hook execution.
That shouldn't cause any overhead because the socket is unbound
and shouldn't receive any traffic.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20201202172516.3483656-3-sdf@google.com

24 Nov, 2020 1 commit

lsm,selinux: pass flowi_common instead of flowi to the LSM hooks · 3df98d79

Authored by Paul Moore on Sep 27, 2020

As pointed out by Herbert in a recent related patch, the LSM hooks do
not have the necessary address family information to use the flowi
struct safely.  As none of the LSMs currently use any of the protocol
specific flowi information, replace the flowi pointers with pointers
to the address family independent flowi_common struct.

Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

01 Sep, 2020 1 commit

ipv6: add ipv6_fragment hook in ipv6_stub · 1d97898b

Authored by wenxu on Aug 28, 2020

Add ipv6_fragment to ipv6_stub to avoid calling netfilter when
accessing ip6_fragment.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Aug, 2020 1 commit

io_uring: allow tcp ancillary data for __sys_recvmsg_sock() · 583bbf06

Authored by Luke Hsiao on Aug 21, 2020

For TCP tx zero-copy, the kernel notifies the process of completions by
queuing completion notifications on the socket error queue. This patch
allows reading these notifications via recvmsg to support TCP tx
zero-copy.

Ancillary data was originally disallowed due to privilege escalation
via io_uring's offloading of sendmsg() onto a kernel thread with kernel
credentials (https://crbug.com/project-zero/1975). So, we must ensure
that the socket type is one where the ancillary data types that are
delivered on recvmsg are plain data (no file descriptors or values that
are translated based on the identity of the calling process).

This was tested by using io_uring to call recvmsg on the MSG_ERRQUEUE
with tx zero-copy enabled. Before this patch, we received -EINVAL from
this specific code path. After this patch, we could read tcp tx
zero-copy completion notifications from the MSG_ERRQUEUE.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Arjun Roy <arjunroy@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Jul, 2020 1 commit

net: remove compat_sock_common_{get,set}sockopt · 8c918ffb

Authored by Christoph Hellwig on Jul 17, 2020

Add the compat handling to sock_common_{get,set}sockopt instead,
keyed off in_compat_syscall().  This allows removing the now unused
->compat_{get,set}sockopt methods from struct proto_ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 May, 2020 1 commit

bpf: Add get{peer, sock}name attach types for sock_addr · 1b66d253

Authored by Daniel Borkmann on May 19, 2020

As stated in 983695fa ("bpf: fix unconnected udp hooks"), the objective
for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
transparent to applications. In Cilium we make use of these hooks [0] in
order to enable E-W load balancing for existing Kubernetes service types
for all Cilium managed nodes in the cluster. Those backends can be local
or remote. The main advantage of this approach is that it operates as close
as possible to the socket, and therefore avoids packet-based NAT, given that
in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.

This also allows exposing NodePort services on loopback addresses in the
host namespace, for example. As another advantage, this also efficiently
blocks bind requests for applications in the host namespace for exposed
ports. However, one missing item is that we also need to perform reverse
xlation for inet{,6}_getname() hooks such that we can return the service
IP/port tuple back to the application instead of the remote peer address.

The vast majority of applications do not care about getpeername(), but
on a few occasions we've seen breakage when validating the peer's address,
since it unexpectedly returns the backend tuple instead of the service one.
Therefore, this trivial patch allows customising this and adds a
getpeername() as well as a getsockname() BPF cgroup hook for both IPv4 and
IPv6 in order to address this situation.

Simple example:

  # ./cilium/cilium service list
  ID   Frontend     Service Type   Backend
  1    1.2.3.4:80   ClusterIP      1 => 10.0.0.10:80

Before; curl's verbose output example, no getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

After; with getpeername() reverse xlation:

  # curl --verbose 1.2.3.4
  * Rebuilt URL to: 1.2.3.4/
  *   Trying 1.2.3.4...
  * TCP_NODELAY set
  * Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
  > GET / HTTP/1.1
  > Host: 1.2.3.4
  > User-Agent: curl/7.58.0
  > Accept: */*
  [...]

Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
peer to the context similar as in inet{,6}_getname() fashion, but API-wise
this is suboptimal as it always enforces programs having to test for ctx->peer
which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split.
Similarly, the checked return code is on tnum_range(1, 1), but if a use case
comes up in future, it can easily be changed to return an error code instead.
Helper and ctx member access is the same as with connect/sendmsg/etc hooks.

  [0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net

19 May, 2020 2 commits

ipv6: move SIOCADDRT and SIOCDELRT handling into ->compat_ioctl · 3986912f

Authored by Christoph Hellwig on May 18, 2020

To prepare for removing the global routing_ioctl hack, start lifting the
code into a newly added ipv6 ->compat_ioctl handler.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: lift copy_from_user out of ipv6_route_ioctl · 7c1552da

Authored by Christoph Hellwig on May 18, 2020

Prepare for better compat ioctl handling by moving the user copy out
of ipv6_route_ioctl.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 May, 2020 2 commits

bpf: Allow any port in bpf_bind helper · 8086fbaf

Authored by Stanislav Fomichev on May 08, 2020

We want to have a tighter control on what ports we bind to in
the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means
connect() becomes slightly more expensive. The expensive part
comes from the fact that we now need to call inet_csk_get_port()
that verifies that the port is not used and allocates an entry
in the hash table for it.

Since we can't rely on "snum || !bind_address_no_port" to prevent
us from calling POST_BIND hook anymore, let's add another bind flag
to indicate that the call site is a BPF program.

v5:
* fix wrong AF_INET (should be AF_INET6) in the bpf program for v6

v3:
* More bpf_bind documentation refinements (Martin KaFai Lau)
* Add UDP tests as well (Martin KaFai Lau)
* Don't start the thread, just do socket+bind+listen (Martin KaFai Lau)

v2:
* Update documentation (Andrey Ignatov)
* Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov)

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-5-sdf@google.com

net: Refactor arguments of inet{,6}_bind · cb0721c7

Authored by Stanislav Fomichev on May 08, 2020

The intent is to add an additional bind parameter in the next commit.
Instead of adding another argument, let's convert all existing
flag arguments into an extendable bit field.

No functional changes.
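
The idea of folding separate boolean arguments into one extendable bit field can be sketched as follows. This is a Python illustration, not the kernel's C code. `FORCE_ADDRESS_NO_PORT` mirrors the `BIND_FORCE_ADDRESS_NO_PORT` name mentioned in this log; `WITH_LOCK` and `FROM_BPF` are assumed names for the other flags, and `describe_bind` is a hypothetical helper.

```python
from enum import IntFlag


class BindFlag(IntFlag):
    """One bit per former boolean argument; new flags just take the next bit."""
    FORCE_ADDRESS_NO_PORT = 1 << 0
    WITH_LOCK = 1 << 1   # assumed name
    FROM_BPF = 1 << 2    # assumed name for a flag a follow-up commit could add


def describe_bind(flags):
    """Decode the bit field back into the individual booleans."""
    return {
        "force_bind_address_no_port": bool(flags & BindFlag.FORCE_ADDRESS_NO_PORT),
        "with_lock": bool(flags & BindFlag.WITH_LOCK),
        "from_bpf": bool(flags & BindFlag.FROM_BPF),
    }
```

A caller would pass something like `BindFlag.WITH_LOCK | BindFlag.FROM_BPF`; adding a further flag later does not change the function signature.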

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200508174611.228805-4-sdf@google.com

06 May, 2020 1 commit

xfrm: expose local_rxpmtu via ipv6_stubs · 3e50ddd8

Authored by Florian Westphal on May 04, 2020

We cannot call this function from the core kernel unless we would force
CONFIG_IPV6=y.

Therefore expose this via ipv6_stubs so we can call it from net/xfrm
in the followup patch.

Since the call is expected to be unlikely, no extra code for the IPV6=y
case is added and we will always eat the indirection cost.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

28 Apr, 2020 2 commits

xfrm: add IPv6 support for espintcp · 26333c37

Authored by Sabrina Dubroca on Apr 27, 2020

This extends espintcp to support IPv6, building on the existing code
and the new UDPv6 encapsulation support. Most of the code is either
reused directly (stream parser, ULP) or very similar to the IPv4
variant (net/ipv6/esp6.c changes).

The separation of config options for IPv4 and IPv6 espintcp requires a
bit of Kconfig gymnastics to enable the core code.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

xfrm: add support for UDPv6 encapsulation of ESP · 0146dca7

Authored by Sabrina Dubroca on Apr 27, 2020

This patch adds support for encapsulation of ESP over UDPv6. The code
is very similar to the IPv4 encapsulation implementation, and makes it
easy to add espintcp on IPv6 as a follow-up.

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

30 Mar, 2020 1 commit

net: ipv6: add rpl sr tunnel · a7a29f9c

Authored by Alexander Aring on Mar 27, 2020

This patch adds the ability to configure routes for RPL source routing.
No IPIP functionality is implemented yet; it can be added later, once
the cases in which to use IPv6 encapsulation become clearer.

Signed-off-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Dec, 2019 2 commits

net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookup · 6c8991f4

Authored by Sabrina Dubroca on Dec 04, 2019

ipv6_stub uses the ip6_dst_lookup function to allow other modules to
perform IPv6 lookups. However, this function skips the XFRM layer
entirely.

All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the
ip_route_output_key and ip_route_output helpers) for their IPv4 lookups,
which calls xfrm_lookup_route(). This patch fixes this inconsistent
behavior by switching the stub to ip6_dst_lookup_flow, which also calls
xfrm_lookup_route().

This requires some changes in all the callers, as these two functions
take different arguments and have different return types.

Fixes: 5f81bd2e ("ipv6: export a stub for IPv6 symbols used by vxlan")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipv6: add net argument to ip6_dst_lookup_flow · c4e85f73

Authored by Sabrina Dubroca on Dec 04, 2019

This will be used in the conversion of ipv6_stub to ip6_dst_lookup_flow,
as some modules currently pass a net argument without a socket to
ip6_dst_lookup. This is equivalent to commit 343d60aa ("ipv6: change
ipv6_stub_impl.ipv6_dst_lookup to take net argument").

Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Nov, 2019 1 commit

net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port) · 82f31ebf

Authored by Maciej Żenczykowski on Nov 25, 2019

Note that the sysctl write accessor functions guarantee that:
  net->ipv4.sysctl_ip_prot_sock <= net->ipv4.ip_local_ports.range[0]
invariant is maintained, and as such the max() in selinux hooks is actually spurious.

i.e., even though
  if (snum < max(inet_prot_sock(sock_net(sk)), low) || snum > high) {
per logic is the same as
  if ((snum < inet_prot_sock(sock_net(sk)) && snum < low) || snum > high) {
it is actually functionally equivalent to:
  if (snum < low || snum > high) {
which is equivalent to:
  if (snum < inet_prot_sock(sock_net(sk)) || snum < low || snum > high) {
even though the first clause is spurious.

But we want to hold on to it in case we ever want to change what
inet_port_requires_bind_service() means (for example by changing
it from a, by default, [0..1024) range to some sort of set).
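
The claim that the max() form, the `snum < low` form, and the form with the spurious first clause all agree under the sysctl invariant can be checked by brute force. The following Python sketch is only a model of the three conditions quoted above: the function names are hypothetical, `prot_sock` stands in for `inet_prot_sock(sock_net(sk))`, and the search range is kept small on purpose.

```python
from itertools import product


def cond_max(snum, prot_sock, low, high):
    return snum < max(prot_sock, low) or snum > high


def cond_low_only(snum, prot_sock, low, high):
    return snum < low or snum > high


def cond_with_spurious_clause(snum, prot_sock, low, high):
    return snum < prot_sock or snum < low or snum > high


def forms_agree(limit=12):
    """Compare the three forms for every small input with prot_sock <= low <= high."""
    for prot_sock, low, high in product(range(limit), repeat=3):
        if prot_sock > low or low > high:
            continue  # outside the invariant the sysctl accessors maintain
        for snum in range(limit + 1):
            args = (snum, prot_sock, low, high)
            results = {cond_max(*args), cond_low_only(*args),
                       cond_with_spurious_clause(*args)}
            if len(results) != 1:
                return False
    return True
```

If the invariant is violated (for example `prot_sock=5, low=3`), `cond_max` and `cond_low_only` can disagree, which is why the invariant matters for the simplification.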

Test: builds, git 'grep inet_prot_sock' finds no other references
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04 Jul, 2019 2 commits

ipv6: use indirect call wrappers for {tcp, udpv6}_{recv, send}msg() · 164c51fe

Authored by Paolo Abeni on Jul 03, 2019

This avoids an indirect call per syscall for common ipv6 transports.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: provide and use ipv6 specific version for {recv, send}msg · 68ab5d14

Authored by Paolo Abeni on Jul 03, 2019

This will simplify indirect call wrapper invocation in the following
patch.

No functional change intended, any - out-of-tree - IPv6 user of
inet_{recv,send}msg can keep using the existing functions.

SCTP code still uses the existing version even for ipv6: as this series
will not add ICW for SCTP, moving to the new helper would not give
any benefit.

The only other in-kernel user of inet_{recv,send}msg is
pvcalls_conn_back_read(), but pvcalls explicitly creates only IPv4 sockets,
so there is no need to update that code path either.

v1 -> v2: drop inet6_{recv,send}msg declaration from header file,
   prefer ICW macro instead

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Jul, 2019 1 commit

ipv6: icmp: allow flowlabel reflection in echo replies · a346abe0

Authored by Eric Dumazet on Jul 01, 2019

Extend flowlabel_reflect bitmask to allow conditional
reflection of incoming flowlabels in echo replies.

Note this takes precedence over auto flowlabels.

Add flowlabel_reflect enum to replace hard coded
values.
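
Replacing hard-coded values with an enum over a bitmask can be pictured with the short Python sketch below. It is an illustration, not the kernel definition: the bit assignments (1 for established flows, 2 for TCP RST, 4 for ICMPv6 echo replies) and the constant names are assumptions, and `reflect_modes` is a hypothetical helper.

```python
# Assumed bit assignments for the flowlabel_reflect sysctl; check them against
# the kernel documentation before relying on them.
FLOWLABEL_REFLECT_ESTABLISHED = 1
FLOWLABEL_REFLECT_TCP_RESET = 2
FLOWLABEL_REFLECT_ICMPV6_ECHO_REPLIES = 4

_NAMES = {
    FLOWLABEL_REFLECT_ESTABLISHED: "established",
    FLOWLABEL_REFLECT_TCP_RESET: "tcp_reset",
    FLOWLABEL_REFLECT_ICMPV6_ECHO_REPLIES: "icmpv6_echo_replies",
}


def reflect_modes(sysctl_value):
    """List the reflection modes enabled by a bitmask value."""
    return [name for bit, name in _NAMES.items() if sysctl_value & bit]
```

Under these assumptions, a sysctl value of 5 would enable reflection for established flows and for echo replies, but not for RST packets.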

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06 Jun, 2019 1 commit

ipv6: tcp: enable flowlabel reflection in some RST packets · 323a53c4

Authored by Eric Dumazet on Jun 05, 2019

When RST packets are sent because no socket could be found,
it makes sense to use flowlabel_reflect sysctl to decide
if a reflection of the flowlabel is requested.

This extends commit 22b6722b ("ipv6: Add sysctl for per
namespace flow label reflection"), for some TCP RST packets.

In order to provide full control of this new feature,
flowlabel_reflect becomes a bitmask.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31 May, 2019 1 commit

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 · 2874c5fd

Authored by Thomas Gleixner on May 27, 2019

Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

23 May, 2019 3 commits

ipv6: export function to send route updates · 19a3b7ee

Authored by David Ahern on May 22, 2019

Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This
helper will be used by the nexthop code to notify userspace of routes
that are impacted when a nexthop config is updated via replace.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OSes that
are fully updated to avoid the notification storm.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Add hook to bump sernum for a route to stubs · cdaa16a4

Authored by David Ahern on May 22, 2019

Add hook to ipv6 stub to bump the sernum up to the root node for a
route. This is needed by the nexthop code when a nexthop config changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Add delete route hook to stubs · 68a9b13d

Authored by David Ahern on May 22, 2019

Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop
code to remove entries linked to a nexthop that is getting deleted.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Apr, 2019 1 commit

net: rework SIOCGSTAMP ioctl handling · c7cbdbf2

Authored by Arnd Bergmann on Apr 17, 2019

The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
socket protocol handlers, and all of those end up calling the same
sock_get_timestamp()/sock_get_timestampns() helper functions, which
results in a lot of duplicate code.

With the introduction of 64-bit time_t on 32-bit architectures, this
gets worse, as we then need four different ioctl commands in each
socket protocol implementation.

To simplify that, let's add a new .gettstamp() operation in
struct proto_ops, and move ioctl implementation into the common
sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
through.

We can reuse the sock_get_timestamp() implementation, but generalize
it so it can deal with both native and compat mode, as well as
timeval and timespec structures.

Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Apr, 2019 1 commit

ipv6: Add rate limit mask for ICMPv6 messages · 0bc19985

Authored by Stephen Suryaputra on Apr 17, 2019

To make ICMPv6 closer to ICMPv4, add a ratemask parameter. Since the ICMP
message types use larger numeric values, a simple bitmask doesn't fit, so
I use a large bitmap. The input and output are in the form of a list of
ranges. Set the default to rate limit all error messages but Packet Too
Big. For Packet Too Big, use ratemask instead of a hard-coded check.

There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow()
aren't called. This patch only adds them to icmpv6_echo_reply().

Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says
that it is also acceptable to rate limit informational messages. Thus,
I removed the current hard-coded behavior of icmpv6_mask_allow() that
doesn't rate limit informational messages.
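
A ratemask expressed as a list of ranges can be turned into a bitmap roughly as shown below. This Python sketch only models the idea and does not reproduce proc_do_large_bitmap: the helper names are hypothetical, and the example string "0-1,3-127" is an assumption derived from the stated default (all error types, which are 0 to 127, except Packet Too Big, which is type 2).

```python
def parse_ranges(text, nbits=256):
    """Turn a string such as "0-1,3-127" into an integer used as a bitmap."""
    bitmap = 0
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        first = int(lo)
        last = int(hi) if hi else first  # a single number is a one-element range
        if not 0 <= first <= last < nbits:
            raise ValueError(f"range out of bounds: {part!r}")
        for bit in range(first, last + 1):
            bitmap |= 1 << bit
    return bitmap


def is_rate_limited(bitmap, icmp_type):
    """True if the bit for this ICMPv6 message type is set in the mask."""
    return bool((bitmap >> icmp_type) & 1)
```

With the assumed default, type 1 (Destination Unreachable) is rate limited, type 2 (Packet Too Big) is not, and informational types such as 128 (Echo Request) are left alone unless added to the list.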

v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL
isn't defined, expand the description in ip-sysctl.txt and remove
unnecessary conditional before kfree().
v3: Inline the bitmap instead of dynamically allocating it. A pointer
to it is still needed because of the way proc_do_large_bitmap works.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Apr, 2019 1 commit

ipv6: Rename fib6_multipath_select and pass fib6_result · b1d40991

Authored by David Ahern on Apr 16, 2019

Add 'struct fib6_result' to hold the fib entry and fib6_nh from a fib
lookup as separate entries, similar to what IPv4 now has with fib_result.

Rename fib6_multipath_select to fib6_select_path, pass fib6_result to
it, and set f6i and nh in the result once a path selection is done.
Call fib6_select_path unconditionally for path selection which means
moving the sibling and oif check to fib6_select_path. To handle the two
different call paths (2 only call multipath_select if flowi6_oif == 0 and
the other always calls it), add a new have_oif_match that controls the
sibling walk if relevant.

Update callers of fib6_multipath_select accordingly and have them use the
fib6_info and fib6_nh from the result.

This is needed for multipath nexthop objects where a single f6i can
point to multiple fib6_nh (similar to IPv4).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

09 Apr, 2019 1 commit

ipv6: Add fib6_nh_init and release to stubs · 1aefd3de

Authored by David Ahern on Apr 05, 2019

Add fib6_nh_init and fib6_nh_release to ipv6_stubs. If fib6_nh_init fails,
callers should not invoke fib6_nh_release, so there is no reason to have
a dummy stub for the case where IPv6 is not enabled.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Mar, 2019 1 commit

ipv6: Move ipv6 stubs to a separate header file · 3616d08b

Authored by David Ahern on Mar 22, 2019

The number of stubs is growing and has nothing to do with addrconf.
Move the definition of the stubs to a separate header file and update
users. In the move, drop the vxlan specific comment before ipv6_stub.

Code move only; no functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21 Mar, 2019 1 commit

ipv6: Add icmp_echo_ignore_anycast for ICMPv6 · 0b03a5ca

Authored by Stephen Suryaputra on Mar 20, 2019

In addition to icmp_echo_ignore_multicast, there is a need to also
prevent responding to pings to anycast addresses for security.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20 Mar, 2019 1 commit

ipv6: Add icmp_echo_ignore_multicast support for ICMPv6 · 03f1eccc

Authored by Stephen Suryaputra on Mar 19, 2019

IPv4 has icmp_echo_ignore_broadcast to prevent responding to broadcast pings.
IPv6 needs a similar mechanism.

v1->v2:
- Remove NET_IPV6_ICMP_ECHO_IGNORE_MULTICAST.

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14 Feb, 2019 1 commit

ipv6_stub: add ipv6_route_input stub/proxy. · 9b0a6a9d

Authored by Peter Oskolkov on Feb 13, 2019

Proxy ip6_route_input via ipv6_stub, for later use by lwt bpf ip encap
(see the next patch in the patchset).

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

06 Jan, 2019 1 commit

ipv6: Take rcu_read_lock in __inet6_bind for mapped addresses · d4a7e9bb

Authored by David Ahern on Jan 05, 2019

I realized the last patch calls dev_get_by_index_rcu in a branch not
holding the rcu lock. Add the calls to rcu_read_lock and rcu_read_unlock.

Fixes: ec90ad33 ("ipv6: Consider sk_bound_dev_if when binding a socket to a v4 mapped address")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
