提交 · 66388f2c08dfa38071f9eceae7bb29060d9be9aa · openanolis / cloud-kernel

19 9月, 2016 2 次提交

sctp: do not return the transmit err back to sctp_sendmsg · 66388f2c

由 Xin Long 提交于 9月 14, 2016

Once a chunk is enqueued successfully, sctp queues can take care of it.
Even if it is failed to transmit (like because of nomem), it should be
put into retransmit queue.

If sctp report this error to users, it confuses them, they may resend
that msg, but actually in kernel sctp stack is in charge of retransmit
it already.

Besides, this error probably is not from the failure of transmitting
current msg, but transmitting or retransmitting another msg's chunks,
as sctp_outq_flush just tries to send out all transports' chunks.

This patch is to make sctp_cmd_send_msg return avoid, and not return the
transmit err back to sctp_sendmsg

Fixes: 8b570dc9 ("sctp: only drop the reference on the datamsg after sending a msg")
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

66388f2c

sctp: remove the unnecessary state check in sctp_outq_tail · 2c89791e

由 Xin Long 提交于 9月 14, 2016

Data Chunks are only sent by sctp_primitive_SEND, in which sctp checks
the asoc's state through statetable before calling sctp_outq_tail. So
there's no need to check the asoc's state again in sctp_outq_tail.

Besides, sctp_do_sm is protected by lock_sock, even if sending msg is
interrupted by timer events, the event's processes still need to acquire
lock_sock first. It means no others CMDs can be enqueue into side effect
list before CMD_SEND_MSG to change asoc->state, so it's safe to remove it.

This patch is to remove redundant asoc->state check from sctp_outq_tail.
Signed-off-by: NXin Long <lucien.xin@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2c89791e

17 9月, 2016 6 次提交

ip6_tunnel: add collect_md mode to IPv6 tunnels · 8d79266b

由 Alexei Starovoitov 提交于 9月 15, 2016

Similar to gre, vxlan, geneve tunnels allow IPIP6 and IP6IP6 tunnels
to operate in 'collect metadata' mode.
Unlike ipv4 code here it's possible to reuse ip6_tnl_xmit() function
for both collect_md and traditional tunnels.
bpf_skb_[gs]et_tunnel_key() helpers and ovs (in the future) are the users.
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8d79266b

ip_tunnel: add collect_md mode to IPIP tunnel · cfc7381b

由 Alexei Starovoitov 提交于 9月 15, 2016

Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.
Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NThomas Graf <tgraf@suug.ch>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cfc7381b

l2tp: constify net_device_ops structures · eb94737d

由 Julia Lawall 提交于 9月 15, 2016

Check for net_device_ops structures that are only stored in the netdev_ops
field of a net_device structure.  This field is declared const, so
net_device_ops structures that have this property can be declared as const
also.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct net_device_ops i@p = { ... };

@ok@
identifier r.i;
struct net_device e;
position p;
@@
e.netdev_ops = &i@p;

@bad@
position p != {r.p,ok.p};
identifier r.i;
struct net_device_ops e;
@@
e@i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct net_device_ops i = { ... };
// </smpl>

The result of size on this file before the change is:
   text	      data     bss     dec         hex	  filename
   3401        931      44    4376        1118	net/l2tp/l2tp_eth.o

and after the change it is:
   text	     data        bss	    dec	    hex	filename
   3993       347         44       4384    1120	net/l2tp/l2tp_eth.o
Signed-off-by: NJulia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb94737d

llc: switch type to bool as the timeout is only tested versus 0 · 5ff904d5

由 Alan Cox 提交于 9月 15, 2016

(As asked by Dave in Februrary)
Signed-off-by: NAlan Cox <alan@linux.intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5ff904d5

tcp: prepare skbs for better sack shifting · 3613b3db

由 Eric Dumazet 提交于 9月 15, 2016

With large BDP TCP flows and lossy networks, it is very important
to keep a low number of skbs in the write queue.

RACK and SACK processing can perform a linear scan of it.

We should avoid putting any payload in skb->head, so that SACK
shifting can be done if needed.

With this patch, we allow to pack ~0.5 MB per skb instead of
the 64KB initially cooked at tcp_sendmsg() time.

This gives a reduction of number of skbs in write queue by eight.
tcp_rack_detect_loss() likes this.

We still allow payload in skb->head for first skb put in the queue,
to not impact RPC workloads.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3613b3db

rxrpc: Make IPv6 support conditional on CONFIG_IPV6 · d1912747

由 David Howells 提交于 9月 17, 2016

Add CONFIG_AF_RXRPC_IPV6 and make the IPv6 support code conditional on it.
This is then made conditional on CONFIG_IPV6.

Without this, the following can be seen:

   net/built-in.o: In function `rxrpc_init_peer':
>> peer_object.c:(.text+0x18c3c8): undefined reference to `ip6_route_output_flags'
Reported-by: Nkbuild test robot <fengguang.wu@intel.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d1912747

16 9月, 2016 9 次提交

net-next: dsa: add Qualcomm tag RX/TX handler · cafdc45c

由 John Crispin 提交于 9月 15, 2016

Add support for the 2-bytes Qualcomm tag that gigabit switches such as
the QCA8337/N might insert when receiving packets, or that we need
to insert while targeting specific switch ports. The tag is inserted
directly behind the ethernet header.
Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
Signed-off-by: NJohn Crispin <john@phrozen.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

cafdc45c

tcp: fix a stale ooo_last_skb after a replace · 76f0dcbb

由 Eric Dumazet 提交于 9月 13, 2016

When skb replaces another one in ooo queue, I forgot to also
update tp->ooo_last_skb as well, if the replaced skb was the last one
in the queue.

To fix this, we simply can re-use the code that runs after an insertion,
trying to merge skbs at the right of current skb.

This not only fixes the bug, but also remove all small skbs that might
be a subset of the new one.

Example:

We receive segments 2001:3001,  4001:5001

Then we receive 2001:8001 : We should replace 2001:3001 with the big
skb, but also remove 4001:50001 from the queue to save space.

packetdrill test demonstrating the bug

0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

+0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
+0.100 < . 1:1(0) ack 1 win 1024
+0 accept(3, ..., ...) = 4

+0.01 < . 1001:2001(1000) ack 1 win 1024
+0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001>

+0.01 < . 1001:3001(2000) ack 1 win 1024
+0    > . 1:1(0) ack 1 <nop,nop, sack 1001:2001 1001:3001>

Fixes: 9f5afeae ("tcp: use an RB tree for ooo receive queue")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NYuchung Cheng <ycheng@google.com>
Cc: Yaogong Wang <wygivan@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

76f0dcbb

openvswitch: avoid deferred execution of recirc actions · 2679d040

由 Lance Richardson 提交于 9月 13, 2016

The ovs kernel data path currently defers the execution of all
recirc actions until stack utilization is at a minimum.
This is too limiting for some packet forwarding scenarios due to
the small size of the deferred action FIFO (10 entries). For
example, broadcast traffic sent out more than 10 ports with
recirculation results in packet drops when the deferred action
FIFO becomes full, as reported here:

     http://openvswitch.org/pipermail/dev/2016-March/067672.html

Since the current recursion depth is available (it is already tracked
by the exec_actions_level pcpu variable), we can use it to determine
whether to execute recirculation actions immediately (safe when
recursion depth is low) or defer execution until more stack space is
available.

With this change, the deferred action fifo size becomes a non-issue
for currently failing scenarios because it is no longer used when
there are three or fewer recursions through ovs_execute_actions().
Suggested-by: NPravin Shelar <pshelar@ovn.org>
Signed-off-by: NLance Richardson <lrichard@redhat.com>
Acked-by: NPravin B Shelar <pshelar@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2679d040

net/sched: cls_flower: Remove an unused field from the filter key structure · a53d850a

由 Or Gerlitz 提交于 9月 15, 2016

Commit c3f83241 "net: Add full IPv6 addresses to flow_keys" added an
unused instance of struct flow_dissector_key_addrs into struct fl_flow_key,
remove it.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Reported-by: NHadar Hen Zion <hadarh@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a53d850a

net/sched: cls_flower: Support masking for matching on tcp/udp ports · aa72d708

由 Or Gerlitz 提交于 9月 15, 2016

Add the definitions for src/dst udp/tcp port masks and use
them when setting && dumping the relevant keys.
Signed-off-by: NOr Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: NPaul Blakey <paulb@mellanox.com>
Acked-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa72d708

net_sched: Introduce skbmod action · 86da71b5

由 Jamal Hadi Salim 提交于 9月 12, 2016

This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debugability).
Compare this:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
munge offset -13 u8 set 0x15 \
munge offset -12 u8 set 0x15 \
munge offset -11 u8 set 0x15 \
munge offset -10 u16 set 0x1515 \
pipe

to:

sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15

Also try to do a MAC address swap with pedit or worse
try to debug a policy with destination mac, source mac and
etherype. Then make few rules out of those and you'll get my point.

In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.

The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that it doesnt get dropped or confuse an interconnecting (learning) switch
or dropped by a target machine (which looks at the dst mac). And at times
when flipping back the packet a swap of the MAC addresses is needed.
Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

86da71b5

bpf: use skb_at_tc_ingress helper in tcf_bpf · f53d8c7b

由 Daniel Borkmann 提交于 9月 12, 2016

We have a small skb_at_tc_ingress() helper for testing for ingress, so
make use of it. cls_bpf already uses it and so should act_bpf.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f53d8c7b

bpf: drop unnecessary test in cls_bpf_classify and tcf_bpf · 04b3f8de

由 Daniel Borkmann 提交于 9月 12, 2016

The skb_mac_header_was_set() test in cls_bpf's and act_bpf's fast-path is
actually unnecessary and can be removed altogether. This was added by
commit a166151c ("bpf: fix bpf helpers to use skb->mac_header relative
offsets"), which was later on improved by 3431205e ("bpf: make programs
see skb->data == L2 for ingress and egress"). We're always guaranteed to
have valid mac header at the time we invoke cls_bpf_classify() or tcf_bpf().

Reason is that since 6d1ccff6 ("net: reset mac header in dev_start_xmit()")
we do skb_reset_mac_header() in __dev_queue_xmit() before we could call
into sch_handle_egress() or any subsequent enqueue. sch_handle_ingress()
always sees a valid mac header as well (things like skb_reset_mac_len()
would badly fail otherwise). Thus, drop the unnecessary test in classifier
and action case.
Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

04b3f8de

net/sched: act_tunnel_key: Remove rcu_read_lock protection · 07c0f09e

由 Hadar Hen Zion 提交于 9月 12, 2016

Remove rcu_read_lock protection from tunnel_key_dump and use
rtnl_dereference, dump operation is protected by  rtnl lock.

Also, remove rcu_read_lock from tunnel_key_release and use
rcu_dereference_protected.

Both operations are running exclusively and a writer couldn't modify
t->params while those functions are executed.

Fixes: 54d94fd89d90 ('net/sched: Introduce act_tunnel_key')
Signed-off-by: NHadar Hen Zion <hadarh@mellanox.com>
Acked-by: NJohn Fastabend <john.r.fastabend@intel.com>
Acked-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

07c0f09e

14 9月, 2016 14 次提交

rxrpc: Add IPv6 support · 75b54cb5

由 David Howells 提交于 9月 13, 2016

Add IPv6 support to AF_RXRPC.  With this, AF_RXRPC sockets can be created:

	service = socket(AF_RXRPC, SOCK_DGRAM, PF_INET6);

instead of:

	service = socket(AF_RXRPC, SOCK_DGRAM, PF_INET);

The AFS filesystem doesn't support IPv6 at the moment, though, since that
requires upgrades to some of the RPC calls.

Note that a good portion of this patch is replacing "%pI4:%u" in print
statements with "%pISpc" which is able to handle both protocols and print
the port.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

75b54cb5

rxrpc: Use rxrpc_extract_addr_from_skb() rather than doing this manually · 1c2bc7b9

由 David Howells 提交于 9月 13, 2016

There are two places that want to transmit a packet in response to one just
received and manually pick the address to reply to out of the sk_buff.
Make them use rxrpc_extract_addr_from_skb() instead so that IPv6 is handled
automatically.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

1c2bc7b9

rxrpc: Don't specify protocol to when creating transport socket · aaa31cbc

由 David Howells 提交于 9月 13, 2016

Pass 0 as the protocol argument when creating the transport socket rather
than IPPROTO_UDP.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

aaa31cbc

rxrpc: Create an address for sendmsg() to bind unbound socket with · cd5892c7

由 David Howells 提交于 9月 13, 2016

Create an address for sendmsg() to bind unbound socket with rather than
using a completely blank address otherwise the transport socket creation
will fail because it will try to use address family 0.

We use the address family specified in the protocol argument when the
AF_RXRPC socket was created and SOCK_DGRAM as the default.  For anything
else, bind() must be used.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

cd5892c7

rxrpc: Correctly initialise, limit and transmit call->rx_winsize · 75e42126

由 David Howells 提交于 9月 13, 2016

call->rx_winsize should be initialised to the sysctl setting and the sysctl
setting should be limited to the maximum we want to permit. Further, we
need to place this in the ACK info instead of the sysctl setting.

Furthermore, discard the idea of accepting the subpackets of a jumbo packet
that lie beyond the receive window when the first packet of the jumbo is
within the window. Just discard the excess subpackets instead. This
allows the receive window to be opened up right to the buffer size less one
for the dead slot.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

75e42126

rxrpc: Fix prealloc refcounting · 3432a757

由 David Howells 提交于 9月 13, 2016

The preallocated call buffer holds a ref on the calls within that buffer.
The ref was being released in the wrong place - it worked okay for incoming
calls to the AFS cache manager service, but doesn't work right for incoming
calls to a userspace service.

Instead of releasing an extra ref service calls in rxrpc_release_call(),
the ref needs to be released during the acceptance/rejectance process.  To
this end:

 (1) The prealloc ref is now normally released during
     rxrpc_new_incoming_call().

 (2) For preallocated kernel API calls, the kernel API's ref needs to be
     released when the call is discarded on socket close.

 (3) We shouldn't take a second ref in rxrpc_accept_call().

 (4) rxrpc_recvmsg_new_call() needs to get a ref of its own when it adds
     the call to the to_be_accepted socket queue.

In doing (4) above, we would prefer not to put the call's refcount down to
0 as that entails doing cleanup in softirq context, but it's unlikely as
there are several refs held elsewhere, at least one of which must be put by
someone in process context calling rxrpc_release_call().  However, it's not
a problem if we do have to do that.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

3432a757

rxrpc: Adjust the call ref tracepoint to show kernel API refs · cbd00891

由 David Howells 提交于 9月 13, 2016

Adjust the call ref tracepoint to show references held on a call by the
kernel API separately as much as possible and add an additional trace to at
the allocation point from the preallocation buffer for an incoming call.

Note that this doesn't show the allocation of a client call for the kernel
separately at the moment.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

cbd00891

rxrpc: Allow tx_winsize to grow in response to an ACK · 01fd0742

由 David Howells 提交于 9月 13, 2016

Allow tx_winsize to grow when the ACK info packet shows a larger receive
window at the other end rather than only permitting it to shrink.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

01fd0742

rxrpc: Use skb->len not skb->data_len · 89a80ed4

由 David Howells 提交于 9月 13, 2016

skb->len should be used rather than skb->data_len when referring to the
amount of data in a packet.  This will only cause a malfunction in the
following cases:

 (1) We receive a jumbo packet (validation and splitting both are wrong).

 (2) We see if there's extra ACK info in an ACK packet (we think it's not
     there and just ignore it).
Signed-off-by: NDavid Howells <dhowells@redhat.com>

89a80ed4

rxrpc: Add missing unlock in rxrpc_call_accept() · b25de360

由 David Howells 提交于 9月 13, 2016

Add a missing unlock in rxrpc_call_accept() in the path taken if there's no
call to wake up.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

b25de360

rxrpc: Requeue call for recvmsg if more data · 33b603fd

由 David Howells 提交于 9月 13, 2016

rxrpc_recvmsg() needs to make sure that the call it has just been
processing gets requeued for further attention if the buffer has been
filled and there's more data to be consumed. The softirq producer only
queues the call and wakes the socket if it fills the first slot in the
window, so userspace might end up sleeping forever otherwise, despite there
being data available.

This is not a problem provided the userspace buffer is big enough or it
empties the buffer completely before more data comes in.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

33b603fd

rxrpc: The IDLE ACK packet should use rxrpc_idle_ack_delay · 91c2c7b6

由 David Howells 提交于 9月 13, 2016

The IDLE ACK packet should use the rxrpc_idle_ack_delay setting when the
timer is set for it.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

91c2c7b6

rxrpc: Add missing wakeup on Tx window rotation · bc4abfcf

由 David Howells 提交于 9月 13, 2016

We need to wake up the sender when Tx window rotation due to an incoming
ACK makes space in the buffer otherwise the sender is liable to just hang
endlessly.

This problem isn't noticeable if the Tx phase transfers no more than will
fit in a single window or the Tx window rotates fast enough that it doesn't
get full.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

bc4abfcf

rxrpc: Make sure we initialise the peer hash key · 08a39685

由 David Howells 提交于 9月 13, 2016

Peer records created for incoming connections weren't getting their hash
key set.  This meant that incoming calls wouldn't see more than one DATA
packet - which is not a problem for AFS CM calls with small request data
blobs.
Signed-off-by: NDavid Howells <dhowells@redhat.com>

08a39685

13 9月, 2016 2 次提交

tipc: fix possible memory leak in tipc_udp_enable() · c20cb811

由 Wei Yongjun 提交于 9月 10, 2016

'ub' is malloced in tipc_udp_enable() and should be freed before
leaving from the error handling cases, otherwise it will cause
memory leak.

Fixes: ba5aa84a ("tipc: split UDP nl address parsing")
Signed-off-by: NWei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c20cb811

net: bridge: add helper to call /sbin/bridge-stp · 30843315

由 Vivien Didelot 提交于 9月 08, 2016

If /sbin/bridge-stp is available on the system, bridge tries to execute
it instead of the kernel implementation when starting/stopping STP.

If anything goes wrong with /sbin/bridge-stp, bridge silently falls back
to kernel STP, making hard to debug userspace STP.

This patch adds a br_stp_call_user helper to start/stop userspace STP
and debug errors from the program: abnormal exit status is stored in the
lower byte and normal exit status is stored in higher byte.

Below is a simple example on a kernel with dynamic debug enabled:

    # ln -s /bin/false /sbin/bridge-stp
    # brctl stp br0 on
    br0: failed to start userspace STP (256)
    # dmesg
    br0: /sbin/bridge-stp exited with code 1
    br0: failed to start userspace STP (256)
    br0: using kernel STP
Signed-off-by: NVivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

30843315

11 9月, 2016 7 次提交

net: ipv6: Remove l3mdev_get_saddr6 · 8a966fc0

由 David Ahern 提交于 9月 10, 2016

No longer needed
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

8a966fc0

net: ipv4: Remove l3mdev_get_saddr · d66f6c0a

由 David Ahern 提交于 9月 10, 2016

No longer needed
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d66f6c0a

net: l3mdev: remove redundant calls · e0d56fdd

由 David Ahern 提交于 9月 10, 2016

A previous patch added l3mdev flow update making these hooks
redundant. Remove them.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e0d56fdd

net: vrf: Flip IPv6 output path from FIB lookup hook to out hook · 4c1feac5

由 David Ahern 提交于 9月 10, 2016

Flip the IPv6 output path to use the l3mdev tx out hook. The VRF dst
is not returned on the first FIB lookup. Instead, the dst on the
skb is switched at the beginning of the IPv6 output processing to
send the packet to the VRF driver on xmit.

Link scope addresses (linklocal and multicast) need special handling:
specifically the oif the flow struct can not be changed because we
want the lookup tied to the enslaved interface. ie., the source address
and the returned route MUST point to the interface scope passed in.
Convert the existing vrf_get_rt6_dst to handle only link scope addresses.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4c1feac5

net: vrf: Flip IPv4 output path from FIB lookup hook to out hook · ebfc102c

由 David Ahern 提交于 9月 10, 2016

Flip the IPv4 output path to use the l3mdev tx out hook. The VRF dst
is not returned on the first FIB lookup. Instead, the dst on the
skb is switched at the beginning of the IPv4 output processing to
send the packet to the VRF driver on xmit.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ebfc102c

net: l3mdev: Allow the l3mdev to be a loopback · 5f02ce24

由 David Ahern 提交于 9月 10, 2016

Allow an L3 master device to act as the loopback for that L3 domain.
For IPv4 the device can also have the address 127.0.0.1.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f02ce24

net: l3mdev: Add hook to output path · a8e3e1a9

由 David Ahern 提交于 9月 10, 2016

This patch adds the infrastructure to the output path to pass an skb
to an l3mdev device if it has a hook registered. This is the Tx parallel
to l3mdev_ip{6}_rcv in the receive path and is the basis for removing
the existing hook that returns the vrf dst on the fib lookup.
Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a8e3e1a9

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功