提交 · 8e6ce7ebeb34f0992f56de078c3744fb383657fa · openeuler / Kernel

17 7月, 2016 1 次提交

vlan: use a valid default mtu value for vlan over macsec · 18d3df3e

由 Paolo Abeni 提交于 7月 14, 2016

macsec can't cope with mtu frames which need vlan tag insertion, and
vlan device set the default mtu equal to the underlying dev's one.
By default vlan over macsec devices use invalid mtu, dropping
all the large packets.
This patch adds a netif helper to check if an upper vlan device
needs mtu reduction. The helper is used during vlan devices
initialization to set a valid default and during mtu updating to
forbid invalid, too bit, mtu values.
The helper currently only check if the lower dev is a macsec device,
if we get more users, we need to update only the helper (possibly
reserving an additional IFF bit).
Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

18d3df3e

16 7月, 2016 1 次提交

tcp: enable per-socket rate limiting of all 'challenge acks' · 083ae308

由 Jason Baron 提交于 7月 14, 2016

The per-socket rate limit for 'challenge acks' was introduced in the
context of limiting ack loops:

commit f2b2c582 ("tcp: mitigate ACK loops for connections as tcp_sock")

And I think it can be extended to rate limit all 'challenge acks' on a
per-socket basis.

Since we have the global tcp_challenge_ack_limit, this patch allows for
tcp_challenge_ack_limit to be set to a large value and effectively rely on
the per-socket limit, or set tcp_challenge_ack_limit to a lower value and
still prevents a single connections from consuming the entire challenge ack
quota.

It further moves in the direction of eliminating the global limit at some
point, as Eric Dumazet has suggested. This a follow-up to:
Subject: tcp: make challenge acks less predictable

Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Yue Cao <ycao009@ucr.edu>
Signed-off-by: NJason Baron <jbaron@akamai.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

083ae308

14 7月, 2016 2 次提交

dccp: limit sk_filter trim to payload · 4f0c40d9

由 Willem de Bruijn 提交于 7月 12, 2016

Dccp verifies packet integrity, including length, at initial rcv in
dccp_invalid_packet, later pulls headers in dccp_enqueue_skb.

A call to sk_filter in-between can cause __skb_pull to wrap skb->len.
skb_copy_datagram_msg interprets this as a negative value, so
(correctly) fails with EFAULT. The negative length is reported in
ioctl SIOCINQ or possibly in a DCCP_WARN in dccp_close.

Introduce an sk_receive_skb variant that caps how small a filter
program can trim packets, and call this in dccp with the header
length. Excessively trimmed packets are now processed normally and
queued for reception as 0B payloads.

Fixes: 7c657876 ("[DCCP]: Initial implementation")
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4f0c40d9

rose: limit sk_filter trim to payload · f4979fce

由 Willem de Bruijn 提交于 7月 12, 2016

Sockets can have a filter program attached that drops or trims
incoming packets based on the filter program return value.

Rose requires data packets to have at least ROSE_MIN_LEN bytes. It
verifies this on arrival in rose_route_frame and unconditionally pulls
the bytes in rose_recvmsg. The filter can trim packets to below this
value in-between, causing pull to fail, leaving the partial header at
the time of skb_copy_datagram_msg.

Place a lower bound on the size to which sk_filter may trim packets
by introducing sk_filter_trim_cap and call this for rose packets.
Signed-off-by: NWillem de Bruijn <willemb@google.com>
Acked-by: NDaniel Borkmann <daniel@iogearbox.net>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f4979fce

12 7月, 2016 8 次提交

netfilter: conntrack: skip clash resolution if nat is in place · 590b52e1

由 Pablo Neira Ayuso 提交于 7月 11, 2016

The clash resolution is not easy to apply if the NAT table is
registered. Even if no NAT rules are installed, the nul-binding ensures
that a unique tuple is used, thus, the packet that loses race gets a
different source port number, as described by:

http://marc.info/?l=netfilter-devel&m=146818011604484&w=2

Clash resolution with NAT is also problematic if addresses/port range
ports are used since the conntrack that wins race may describe a
different mangling that we may have earlier applied to the packet via
nf_nat_setup_info().

Fixes: 71d8c47f ("netfilter: conntrack: introduce clash resolution on insertion race")
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
Tested-by: NMarc Dionne <marc.c.dionne@gmail.com>

590b52e1

tipc: reset all unicast links when broadcast send link fails · 1fc07f3e

由 Jon Paul Maloy 提交于 7月 11, 2016

In test situations with many nodes and a heavily stressed system we have
observed that the transmission broadcast link may fail due to an
excessive number of retransmissions of the same packet. In such
situations we need to reset all unicast links to all peers, in order to
reset and re-synchronize the broadcast link.

In this commit, we add a new function tipc_bearer_reset_all() to be used
in such situations. The function scans across all bearers and resets all
their pertaining links.
Acked-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1fc07f3e

tipc: ensure correct broadcast send buffer release when peer is lost · a71eb720

由 Jon Paul Maloy 提交于 7月 11, 2016

After a new receiver peer has been added to the broadcast transmission
link, we allow immediate transmission of new broadcast packets, trusting
that the new peer will not accept the packets until it has received the
previously sent unicast broadcast initialiation message. In the same
way, the sender must not accept any acknowledges until it has itself
received the broadcast initialization from the peer, as well as
confirmation of the reception of its own initialization message.

Furthermore, when a receiver peer goes down, the sender has to produce
the missing acknowledges from the lost peer locally, in order ensure
correct release of the buffers that were expected to be acknowledged by
the said peer.

In a highly stressed system we have observed that contact with a peer
may come up and be lost before the above mentioned broadcast initial-
ization and confirmation have been received. This leads to the locally
produced acknowledges being rejected, and the non-acknowledged buffers
to linger in the broadcast link transmission queue until it fills up
and the link goes into permanent congestion.

In this commit, we remedy this by temporarily setting the corresponding
broadcast receive link state to ESTABLISHED and the 'bc_peer_is_up'
state to true before we issue the local acknowledges. This ensures that
those acknowledges will always be accepted. The mentioned state values
are restored immediately afterwards when the link is reset.
Acked-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a71eb720

tipc: extend broadcast link initialization criteria · 2d18ac4b

由 Jon Paul Maloy 提交于 7月 11, 2016

At first contact between two nodes, an endpoint might sometimes have
time to send out a LINK_PROTOCOL/STATE packet before it has received
the broadcast initialization packet from the peer, i.e., before it has
received a valid broadcast packet number to add to the 'bc_ack' field
of the protocol message.

This means that the peer endpoint will receive a protocol packet with an
invalid broadcast acknowledge value of 0. Under unlucky circumstances
this may lead to the original, already received acknowledge value being
overwritten, so that the whole broadcast link goes stale after a while.

We fix this by delaying the setting of the link field 'bc_peer_is_up'
until we know that the peer really has received our own broadcast
initialization message. The latter is always sent out as the first
unicast message on a link, and always with seqeunce number 1. Because
of this, we only need to look for a non-zero unicast acknowledge value
in the arriving STATE messages, and once that is confirmed we know we
are safe and can set the mentioned field. Before this moment, we must
ignore all broadcast acknowledges from the peer.
Acked-by: NYing Xue <ying.xue@windriver.com>
Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

2d18ac4b

sock: ignore SCM_RIGHTS and SCM_CREDENTIALS in __sock_cmsg_send · 779f1ede

由 Soheil Hassas Yeganeh 提交于 7月 11, 2016

Sergei Trofimovich reported that pulse audio sends SCM_CREDENTIALS
as a control message to TCP. Since __sock_cmsg_send does not
support SCM_RIGHTS and SCM_CREDENTIALS, it returns an error and
hence breaks pulse audio over TCP.

SCM_RIGHTS and SCM_CREDENTIALS are sent on the SOL_SOCKET layer
but they semantically belong to SOL_UNIX. Since all
cmsg-processing functions including sock_cmsg_send ignore control
messages of other layers, it is best to ignore SCM_RIGHTS
and SCM_CREDENTIALS for consistency (and also for fixing pulse
audio over TCP).

Fixes: c14ac945 ("sock: enable timestamping using control messages")
Signed-off-by: NSoheil Hassas Yeganeh <soheil@google.com>
Reported-by: NSergei Trofimovich <slyfox@gentoo.org>
Tested-by: NSergei Trofimovich <slyfox@gentoo.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

779f1ede

ipv4: reject RTNH_F_DEAD and RTNH_F_LINKDOWN from user space · 80610229

由 Julian Anastasov 提交于 7月 10, 2016

Vegard Nossum is reporting for a crash in fib_dump_info
when nh_dev = NULL and fib_nhs == 1:

Pid: 50, comm: netlink.exe Not tainted 4.7.0-rc5+
RIP: 0033:[<00000000602b3d18>]
RSP: 0000000062623890  EFLAGS: 00010202
RAX: 0000000000000000 RBX: 000000006261b800 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000024 RDI: 000000006245ba00
RBP: 00000000626238f0 R08: 000000000000029c R09: 0000000000000000
R10: 0000000062468038 R11: 000000006245ba00 R12: 000000006245ba00
R13: 00000000625f96c0 R14: 00000000601e16f0 R15: 0000000000000000
Kernel panic - not syncing: Kernel mode fault at addr 0x2e0, ip 0x602b3d18
CPU: 0 PID: 50 Comm: netlink.exe Not tainted 4.7.0-rc5+ #581
Stack:
 626238f0 960226a02 00000400 000000fe
 62623910 600afca7 62623970 62623a48
 62468038 00000018 00000000 00000000
Call Trace:
 [<602b3e93>] rtmsg_fib+0xd3/0x190
 [<602b6680>] fib_table_insert+0x260/0x500
 [<602b0e5d>] inet_rtm_newroute+0x4d/0x60
 [<60250def>] rtnetlink_rcv_msg+0x8f/0x270
 [<60267079>] netlink_rcv_skb+0xc9/0xe0
 [<60250d4b>] rtnetlink_rcv+0x3b/0x50
 [<60265400>] netlink_unicast+0x1a0/0x2c0
 [<60265e47>] netlink_sendmsg+0x3f7/0x470
 [<6021dc9a>] sock_sendmsg+0x3a/0x90
 [<6021e0d0>] ___sys_sendmsg+0x300/0x360
 [<6021fa64>] __sys_sendmsg+0x54/0xa0
 [<6021fac0>] SyS_sendmsg+0x10/0x20
 [<6001ea68>] handle_syscall+0x88/0x90
 [<600295fd>] userspace+0x3fd/0x500
 [<6001ac55>] fork_handler+0x85/0x90

$ addr2line -e vmlinux -i 0x602b3d18
include/linux/inetdevice.h:222
net/ipv4/fib_semantics.c:1264

Problem happens when RTNH_F_LINKDOWN is provided from user space
when creating routes that do not use the flag, catched with
netlink fuzzer.

Currently, the kernel allows user space to set both flags
to nh_flags and fib_flags but this is not intentional, the
assumption was that they are not set. Fix this by rejecting
both flags with EINVAL.
Reported-by: NVegard Nossum <vegard.nossum@oracle.com>
Fixes: 0eeb075f ("net: ipv4 sysctl option to ignore routes when nexthop link is down")
Signed-off-by: NJulian Anastasov <ja@ssi.bg>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Cc: Dinesh Dutt <ddutt@cumulusnetworks.com>
Cc: Scott Feldman <sfeldma@gmail.com>
Reviewed-by: NAndy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

80610229

tcp: make challenge acks less predictable · 75ff39cc

由 Eric Dumazet 提交于 7月 10, 2016

Yue Cao claims that current host rate limiting of challenge ACKS
(RFC 5961) could leak enough information to allow a patient attacker
to hijack TCP sessions. He will soon provide details in an academic
paper.

This patch increases the default limit from 100 to 1000, and adds
some randomization so that the attacker can no longer hijack
sessions without spending a considerable amount of probes.

Based on initial analysis and patch from Linus.

Note that we also have per socket rate limiting, so it is tempting
to remove the host limit in the future.

v2: randomize the count of challenge acks per second, not the period.

Fixes: 282f23c6 ("tcp: implement RFC 5961 3.2")
Reported-by: NYue Cao <ycao009@ucr.edu>
Signed-off-by: NEric Dumazet <edumazet@google.com>
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Acked-by: NYuchung Cheng <ycheng@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

75ff39cc

udp: prevent bugcheck if filter truncates packet too much · a6127697

由 Michal Kubeček 提交于 7月 08, 2016

If socket filter truncates an udp packet below the length of UDP header
in udpv6_queue_rcv_skb() or udp_queue_rcv_skb(), it will trigger a
BUG_ON in skb_pull_rcsum(). This BUG_ON (and therefore a system crash if
kernel is configured that way) can be easily enforced by an unprivileged
user which was reported as CVE-2016-6162. For a reproducer, see
http://seclists.org/oss-sec/2016/q3/8

Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
Reported-by: NMarco Grassi <marco.gra@gmail.com>
Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
Acked-by: NEric Dumazet <edumazet@google.com>
Acked-by: NWillem de Bruijn <willemb@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a6127697

10 7月, 2016 1 次提交

dccp: avoid deadlock in dccp_v4_ctl_send_reset · 95556a88

由 Eric Dumazet 提交于 7月 08, 2016

In the prep work I did before enabling BH while handling socket backlog,
I missed two points in DCCP :

1) dccp_v4_ctl_send_reset() uses bh_lock_sock(), assuming BH were
blocked. It is not anymore always true.

2) dccp_v4_route_skb() was using __IP_INC_STATS() instead of
  IP_INC_STATS()

A similar fix was done for TCP, in commit 47dcc20a
("ipv4: tcp: ip_send_unicast_reply() is not BH safe")

Fixes: 7309f882 ("dccp: do not assume DCCP code is non preemptible")
Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NDmitry Vyukov <dvyukov@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

95556a88

08 7月, 2016 2 次提交

netfilter: nft_ct: fix expiration getter · c8607e02

由 Florian Westphal 提交于 7月 06, 2016

We need to compute timeout.expires - jiffies, not the other way around.
Add a helper, another patch can then later change more places in
conntrack code where we currently open-code this.

Will allow us to only change one place later when we remove per-ct timer.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

c8607e02

ipvs: fix bind to link-local mcast IPv6 address in backup · 3777ed68

由 Quentin Armitage 提交于 6月 16, 2016

When using HEAD from
https://git.kernel.org/cgit/utils/kernel/ipvsadm/ipvsadm.git/,
the command:
ipvsadm --start-daemon backup --mcast-interface eth0.60 \
    --mcast-group ff02::1:81
fails with the error message:
Argument list too long

whereas both:
ipvsadm --start-daemon master --mcast-interface eth0.60 \
    --mcast-group ff02::1:81
and:
ipvsadm --start-daemon backup --mcast-interface eth0.60 \
    --mcast-group 224.0.0.81
are successful.

The error message "Argument list too long" isn't helpful. The error occurs
because an IPv6 address is given in backup mode.

The error is in make_receive_sock() in net/netfilter/ipvs/ip_vs_sync.c,
since it fails to set the interface on the address or the socket before
calling inet6_bind() (via sock->ops->bind), where the test
'if (!sk->sk_bound_dev_if)' failed.

Setting sock->sk->sk_bound_dev_if on the socket before calling
inet6_bind() resolves the issue.

Fixes: d3328817 ("ipvs: add more mcast parameters for the sync daemon")
Signed-off-by: NQuentin Armitage <quentin@armitage.org.uk>
Acked-by: NJulian Anastasov <ja@ssi.bg>
Signed-off-by: NSimon Horman <horms@verge.net.au>

3777ed68

06 7月, 2016 5 次提交

batman-adv: Fix speedy join in gateway client mode · d1fe176c

由 Sven Eckelmann 提交于 6月 12, 2016

Speedy join only works when the received packet is either broadcast or an
4addr unicast packet. Thus packets converted from broadcast to unicast via
the gateway handling code have to be converted to 4addr packets to allow
the receiving gateway server to add the sender address as temporary entry
to the translation table.

Not doing it will make the batman-adv gateway server drop the DHCP response
in many situations because it doesn't yet have the TT entry for the
destination of the DHCP response.

Fixes: 37135173 ("batman-adv: change interface_rx to get orig node")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Acked-by: NAntonio Quartulli <a@unstable.cc>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

d1fe176c

cfg80211: handle failed skb allocation · 16a910a6

由 Gregory Greenman 提交于 7月 05, 2016

Handle the case when dev_alloc_skb returns NULL.

Cc: stable@vger.kernel.org
Fixes: 2b67f944 ("cfg80211: reuse existing page fragments in A-MSDU rx")
Signed-off-by: NGregory Greenman <gregory.greenman@intel.com>
Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>

16a910a6

nl80211: Move ACL parsing later to avoid a possible memory leak · 6e8ef842

由 Purushottam Kushwaha 提交于 7月 05, 2016

No support for pbss results in a memory leak for the acl_data
(if parse_acl_data succeeds). Fix this by moving the ACL parsing later.

Cc: stable@vger.kernel.org
Fixes: 34d50519 ("cfg80211: basic support for PBSS network type")
Signed-off-by: NPurushottam Kushwaha <pkushwah@qti.qualcomm.com>
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>

6e8ef842

ipv6: Fix mem leak in rt6i_pcpu · 903ce4ab

由 Martin KaFai Lau 提交于 7月 05, 2016

It was first reported and reproduced by Petr (thanks!) in
https://bugzilla.kernel.org/show_bug.cgi?id=119581

free_percpu(rt->rt6i_pcpu) used to always happen in ip6_dst_destroy().

However, after fixing a deadlock bug in
commit 9c7370a1 ("ipv6: Fix a potential deadlock when creating pcpu rt"),
free_percpu() is not called before setting non_pcpu_rt->rt6i_pcpu to NULL.

It is worth to note that rt6i_pcpu is protected by table->tb6_lock.

kmemleak somehow did not report it.  We nailed it down by
observing the pcpu entries in /proc/vmallocinfo (first suggested
by Hannes, thanks!).
Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
Fixes: 9c7370a1 ("ipv6: Fix a potential deadlock when creating pcpu rt")
Reported-by: NPetr Novopashenniy <pety@rusnet.ru>
Tested-by: NPetr Novopashenniy <pety@rusnet.ru>
Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Petr Novopashenniy <pety@rusnet.ru>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

903ce4ab

net: fix decnet rtnexthop parsing · ab58298c

由 Vegard Nossum 提交于 7月 05, 2016

dn_fib_count_nhs() could enter an infinite loop if nhp->rtnh_len == 0
(i.e. if userspace passes a malformed netlink message).

Let's use the helpers from net/nexthop.h which take care of all this
stuff. We can do exactly the same as e.g. fib_count_nexthops() and
fib_get_nhs() from net/ipv4/fib_semantics.c.

This fixes the softlockup for me.

Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

ab58298c

05 7月, 2016 7 次提交

batman-adv: Free last_bonding_candidate on release of orig_node · cbef1e10

由 Sven Eckelmann 提交于 6月 30, 2016

The orig_ifinfo reference counter for last_bonding_candidate in
batadv_orig_node has to be reduced when an originator node is released.
Otherwise the orig_ifinfo is leaked and the reference counter the netdevice
is not reduced correctly.

Fixes: f3b3d901 ("batman-adv: add bonding again")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

cbef1e10

batman-adv: Fix reference leak in batadv_find_router · 15c2ed75

由 Sven Eckelmann 提交于 6月 30, 2016

The replacement of last_bonding_candidate in batadv_orig_node has to be an
atomic operation. Otherwise it is possible that the reference counter of a
batadv_orig_ifinfo is reduced which was no longer the
last_bonding_candidate when the new candidate is added. This can either
lead to an invalid memory access or to reference leaks which make it
impossible to an interface which was added to batman-adv.

Fixes: f3b3d901 ("batman-adv: add bonding again")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

15c2ed75

batman-adv: Fix non-atomic bla_claim::backbone_gw access · 3db0decf

由 Sven Eckelmann 提交于 7月 01, 2016

The pointer batadv_bla_claim::backbone_gw can be changed at any time.
Therefore, access to it must be protected to ensure that two function
accessing the same backbone_gw are actually accessing the same. This is
especially important when the crc_lock is used or when the backbone_gw of a
claim is exchanged.

Not doing so leads to invalid memory access and/or reference leaks.

Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code")
Fixes: 5a1dd8a4 ("batman-adv: lock crc access in bridge loop avoidance")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

3db0decf

batman-adv: Fix orig_node_vlan leak on orig_node_release · 33fbb1f3

由 Sven Eckelmann 提交于 6月 30, 2016

batadv_orig_node_new uses batadv_orig_node_vlan_new to allocate a new
batadv_orig_node_vlan and add it to batadv_orig_node::vlan_list. References
to this list have also to be cleaned when the batadv_orig_node is removed.

Fixes: 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

33fbb1f3

batman-adv: Avoid nullptr dereference in dat after vlan_insert_tag · 60154a1e

由 Sven Eckelmann 提交于 7月 02, 2016

vlan_insert_tag can return NULL on errors. The distributed arp table code
therefore has to check the return value of vlan_insert_tag for NULL before
it can safely operate on this pointer.

Fixes: be1db4f6 ("batman-adv: make the Distributed ARP Table vlan aware")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

60154a1e

batman-adv: Avoid nullptr dereference in bla after vlan_insert_tag · 10c78f58

由 Sven Eckelmann 提交于 7月 02, 2016

vlan_insert_tag can return NULL on errors. The bridge loop avoidance code
therefore has to check the return value of vlan_insert_tag for NULL before
it can safely operate on this pointer.

Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>

10c78f58

RDS: fix rds_tcp_init() error path · 3dad5424

由 Vegard Nossum 提交于 7月 03, 2016

If register_pernet_subsys() fails, we shouldn't try to call
unregister_pernet_subsys().

Fixes: 467fa153 ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
Cc: stable@vger.kernel.org
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: NVegard Nossum <vegard.nossum@oracle.com>
Acked-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3dad5424

02 7月, 2016 3 次提交

tipc: fix nl compat regression for link statistics · 55e77a3e

由 Richard Alpe 提交于 7月 01, 2016

Fix incorrect use of nla_strlcpy() where the first NLA_HDRLEN bytes
of the link name where left out.

Making the output of tipc-config -ls look something like:
Link statistics:
dcast-link
1:data0-1.1.2:data0
1:data0-1.1.3:data0

Also, for the record, the patch that introduce this regression
claims "Sending the whole object out can cause a leak". Which isn't
very likely as this is a compat layer, where the data we are parsing
is generated by us and we know the string to be NULL terminated. But
you can of course never be to secure.

Fixes: 5d2be142 (tipc: fix an infoleak in tipc_nl_compat_link_dump)
Signed-off-by: NRichard Alpe <richard.alpe@ericsson.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55e77a3e

net_sched: fix mirrored packets checksum · 82a31b92

由 WANG Cong 提交于 6月 30, 2016

Similar to commit 9b368814 ("net: fix bridge multicast packet checksum validation")
we need to fixup the checksum for CHECKSUM_COMPLETE when
pushing skb on RX path. Otherwise we get similar splats.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

82a31b92

packet: Use symmetric hash for PACKET_FANOUT_HASH. · eb70db87

由 David S. Miller 提交于 7月 01, 2016

People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
they want packets going in both directions on a flow to hash to the
same bucket.

The core kernel SKB hash became non-symmetric when the ipv6 flow label
and other entities were incorporated into the standard flow hash order
to increase entropy.

But there are no users of PACKET_FANOUT_HASH who want an assymetric
hash, they all want a symmetric one.

Therefore, use the flow dissector to compute a flat symmetric hash
over only the protocol, addresses and ports.  This hash does not get
installed into and override the normal skb hash, so this change has
no effect whatsoever on the rest of the stack.
Reported-by: NEric Leblond <eric@regit.org>
Tested-by: NEric Leblond <eric@regit.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

eb70db87

01 7月, 2016 1 次提交

netfilter: conntrack: avoid integer overflow when resizing · 9cc1c73a

由 Florian Westphal 提交于 4月 24, 2016

Can overflow so we might allocate very small table when bucket count is
high on a 32bit platform.

Note: resize is only possible from init_netns.
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>

9cc1c73a

30 6月, 2016 1 次提交

ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output · fedbb6b4

由 Shmulik Ladkani 提交于 6月 29, 2016

ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
calls ip_sk_use_pmtu which casts sk as an inet_sk).

However, in the case of UDP tunneling, the skb->sk is not necessarily an
inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
tun/tap).

OTOH, the sk passed as an argument throughout IP stack's output path is
the one which is of PMTU interest:
 - In case of local sockets, sk is same as skb->sk;
 - In case of a udp tunnel, sk is the tunneling socket.

Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
This augments 7026b1dd 'netfilter: Pass socket pointer down through okfn().'
Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

fedbb6b4

29 6月, 2016 8 次提交

openvswitch: fix conntrack netlink event delivery · d913d3a7

由 Samuel Gauthier 提交于 6月 28, 2016

Only the first and last netlink message for a particular conntrack are
actually sent. The first message is sent through nf_conntrack_confirm when
the conntrack is committed. The last one is sent when the conntrack is
destroyed on timeout. The other conntrack state change messages are not
advertised.

When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
is called for each packet, from the postrouting hook, which in turn calls
nf_ct_deliver_cached_events to send the state change netlink messages.

This commit fixes the problem by calling nf_ct_deliver_cached_events in the
non-commit case as well.

Fixes: 7f8a436e ("openvswitch: Add conntrack action")
CC: Joe Stringer <joestringer@nicira.com>
CC: Justin Pettit <jpettit@nicira.com>
CC: Andy Zhou <azhou@nicira.com>
CC: Thomas Graf <tgraf@suug.ch>
Signed-off-by: NSamuel Gauthier <samuel.gauthier@6wind.com>
Acked-by: NJoe Stringer <joe@ovn.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

d913d3a7

neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit() · b560f03d

由 David Barroso 提交于 6月 28, 2016

neigh_xmit() expects to be called inside an RCU-bh read side critical
section, and while one of its two current callers gets this right, the
other one doesn't.

More specifically, neigh_xmit() has two callers, mpls_forward() and
mpls_output(), and while both callers call neigh_xmit() under
rcu_read_lock(), this provides sufficient protection for neigh_xmit()
only in the case of mpls_forward(), as that is always called from
softirq context and therefore doesn't need explicit BH protection,
while mpls_output() can be called from process context with softirqs
enabled.

When mpls_output() is called from process context, with softirqs
enabled, we can be preempted by a softirq at any time, and RCU-bh
considers the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq
while we are in the part of neigh_xmit() that expects to be run inside
an RCU-bh read side critical section, we can end up with an unexpected
RCU grace period running right in the middle of that critical section,
making things go boom.

This patch fixes this impedance mismatch in the callee, by making
neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that
expects to be treated as an RCU-bh read side critical section, as this
seems a safer option than fixing it in the callers.

Fixes: 4fd3d7d9 ("neigh: Add helper function neigh_xmit")
Signed-off-by: NDavid Barroso <dbarroso@fastly.com>
Signed-off-by: NLennert Buytenhek <lbuytenhek@fastly.com>
Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
Acked-by: NRobert Shearman <rshearma@brocade.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b560f03d

cfg80211: fix proto in ieee80211_data_to_8023 for frames without LLC header · c041778c

由 Felix Fietkau 提交于 6月 29, 2016

The PDU length of incoming LLC frames is set to the total skb payload size
in __ieee80211_data_to_8023() of net/wireless/util.c which incorrectly
includes the length of the IEEE 802.11 header.

The resulting LLC frame header has a too large PDU length, causing the
llc_fixup_skb() function of net/llc/llc_input.c to reject the incoming
skb, effectively breaking STP.

Solve the problem by properly substracting the IEEE 802.11 frame header size
from the PDU length, allowing the LLC processor to pick up the incoming
control messages.

Special thanks to Gerry Rozema for tracking down the regression and proposing
a suitable patch.

Fixes: 2d1c304c ("cfg80211: add function for 802.3 conversion with separate output buffer")
Cc: stable@vger.kernel.org
Reported-by: NGerry Rozema <gerryr@rozeware.com>
Signed-off-by: NFelix Fietkau <nbd@nbd.name>
Signed-off-by: NJohannes Berg <johannes@sipsolutions.net>

c041778c

net: bridge: fix vlan stats continue counter · 565ce8f3

由 Nikolay Aleksandrov 提交于 6月 27, 2016

I made a dumb off-by-one mistake when I added the vlan stats counter
dumping code. The increment should happen before the check, not after
otherwise we miss one entry when we continue dumping.

Fixes: a60c0903 ("bridge: netlink: export per-vlan stats")
Signed-off-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

565ce8f3

tcp: do not send too big packets at retransmit time · a3d2e9f8

由 Eric Dumazet 提交于 6月 27, 2016

Arjun reported a bug in TCP stack and bisected it to a recent commit.

In case where we process SACK, we can coalesce multiple skbs
into fat ones (tcp_shift_skb_data()), to lower write queue
overhead, because we do not expect to retransmit these packets.

However, SACK reneging can happen, forcing the sender to retransmit
all these packets. If skb->len is above 64KB, we then send buggy
IP packets that could hang TSO engine on cxgb4.

Neal suggested to use tcp_tso_autosize() instead of tp->gso_segs
so that we cook packets of optimal size vs TCP/pacing.

Thanks to Arjun for reporting the bug and running the tests !

Fixes: 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Reported-by: NArjun V <arjun@chelsio.com>
Tested-by: NArjun V <arjun@chelsio.com>
Acked-by: NNeal Cardwell <ncardwell@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a3d2e9f8

batman-adv: Clean up untagged vlan when destroying via rtnl-link · 420cb1b7

由 Sven Eckelmann 提交于 6月 26, 2016

The untagged vlan object is only destroyed when the interface is removed
via the legacy sysfs interface. But it also has to be destroyed when the
standard rtnl-link interface is used.

Fixes: 5d2c05b2 ("batman-adv: add per VLAN interface attribute framework")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Acked-by: NAntonio Quartulli <a@unstable.cc>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

420cb1b7

batman-adv: Fix ICMP RR ethernet access after skb_linearize · 3b55e442

由 Sven Eckelmann 提交于 6月 26, 2016

The skb_linearize may reallocate the skb. This makes the calculated pointer
for ethhdr invalid. But it the pointer is used later to fill in the RR
field of the batadv_icmp_packet_rr packet.

Instead re-evaluate eth_hdr after the skb_linearize+skb_cow to fix the
pointer and avoid the invalid read.

Fixes: da6b8c20 ("batman-adv: generalize batman-adv icmp packet handling")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3b55e442

batman-adv: Fix double-put of vlan object · baceced9

由 Ben Hutchings 提交于 6月 26, 2016

Each batadv_tt_local_entry hold a single reference to a
batadv_softif_vlan.  In case a new entry cannot be added to the hash
table, the error path puts the reference, but the reference will also
now be dropped by batadv_tt_local_entry_release().

Fixes: a33d970d ("batman-adv: Fix reference counting of vlan object for tt_local_entry")
Signed-off-by: NBen Hutchings <ben@decadent.org.uk>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

baceced9

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功