Commits · a473018cfe0ef1e46c0ff9df3fa02afc23c9f1d2 · openeuler / Kernel

12 Jul, 2016 · 9 commits

xprtrdma: Remove rpcrdma_map_one() and friends · a473018c

Authored by Chuck Lever on Jun 29, 2016

Clean up: ALLPHYSICAL is gone and FMR has been converted to use
scatterlists. There are no more users of these functions.

This patch shrinks the size of struct rpcrdma_req by about 3500
bytes on x86_64. There is one of these structs for each RPC credit
(128 credits per transport connection).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove ALLPHYSICAL memory registration mode · 2dc3a69d

Authored by Chuck Lever on Jun 29, 2016

No HCA or RNIC in the kernel tree requires the use of ALLPHYSICAL.

ALLPHYSICAL advertises in the clear on the network fabric an R_key
that is good for all of the client's memory. No known exploit
exists, but theoretically any user on the server can use that R_key
on the client's QP to read or update any part of the client's memory.

ALLPHYSICAL exposes the client to server bugs, including:
 o base/bounds errors causing data outside the i/o buffer to be
   accessed
 o RDMA access after reply causing data corruption and/or integrity
   fail

ALLPHYSICAL can't protect application memory regions from server
update after a local signal or soft timeout has terminated an RPC.

ALLPHYSICAL chunks are no larger than a page. Special cases to
handle small chunks and long chunk lists have been a source of
implementation complexity and bugs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Do not leak an MW during a DMA map failure · 42fe28f6

Authored by Chuck Lever on Jun 29, 2016

Based on code audit.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Refactor MR recovery work queues · 505bbe64

Authored by Chuck Lever on Jun 29, 2016

I found that commit ead3f26e ("xprtrdma: Add ro_unmap_safe
memreg method"), which introduces ro_unmap_safe, never wired up the
FMR recovery worker.

The FMR and FRWR recovery work queues both do the same thing.
Instead of setting up separate individual work queues for this,
schedule a delayed worker to deal with them, since recovering MRs is
not performance-critical.

Fixes: ead3f26e ("xprtrdma: Add ro_unmap_safe memreg method")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Use scatterlist for DMA mapping and unmapping under FMR · fcdfb968

Authored by Chuck Lever on Jun 29, 2016

The use of a scatterlist for handling DMA mapping and unmapping
was recently introduced in frwr_ops.c in commit 4143f34e
("xprtrdma: Port to new memory registration API"). That commit did
not make a similar update to xprtrdma's FMR support because the
core ib_map_phys_fmr() and ib_unmap_fmr() APIs have not been changed
to take a scatterlist argument.

However, FMR still needs to do DMA mapping and unmapping. It appears
that RDS, for example, uses a scatterlist for this, then builds the
DMA addr array for the ib_map_phys_fmr call separately. I see that
SRP also utilizes a scatterlist for DMA mapping. xprtrdma can do
something similar.

This modernization is used immediately to properly defer DMA
unmapping during fmr_unmap_safe (a FIXME). It separates the DMA
unmapping coordinates from the rl_segments array. This array, being
part of an rpcrdma_req, is always re-used immediately when an RPC
exits. A scatterlist is allocated in memory independent of the
rl_segments array, so it can be preserved indefinitely (ie, until
the MR invalidation and DMA unmapping can actually be done by a
worker thread).

The FRWR and FMR DMA mapping code are slightly different from each
other now, and will diverge further when the "Check for holes" logic
can be removed from FRWR (support for SG_GAP MRs). So I chose not to
create helpers for the common-looking code.

Fixes: ead3f26e ("xprtrdma: Add ro_unmap_safe memreg method")
Suggested-by: Sagi Grimberg <sagi@lightbits.io>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Rename fields in rpcrdma_fmr · 88975ebe

Authored by Chuck Lever on Jun 29, 2016

Clean up: Use the same naming convention used in other
RPC/RDMA-related data structures.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Move init and release helpers · d48b1d29

Authored by Chuck Lever on Jun 29, 2016

Clean up: Moving these helpers in a separate patch makes later
patches more readable.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Create common scatterlist fields in rpcrdma_mw · 564471d2

Authored by Chuck Lever on Jun 29, 2016

Clean up: FMR is about to replace the rpcrdma_map_one code with
scatterlists. Move the scatterlist fields out of the FRWR-specific
union and into the generic part of rpcrdma_mw.

One minor change: -EIO is now returned if FRWR registration fails.
The RPC is terminated immediately, since the problem is likely due
to a software bug, thus retrying likely won't help.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

xprtrdma: Remove FMRs from the unmap list after unmapping · 38f1932e

Authored by Chuck Lever on Jun 29, 2016

ib_unmap_fmr() takes a list of FMRs to unmap. However, it does not
remove the FMRs from this list as it processes them. Other
ib_unmap_fmr() call sites are careful to remove FMRs from the list
after ib_unmap_fmr() returns.

Since commit 7c7a5390 ("xprtrdma: Add ro_unmap_sync method for FMR")
fmr_op_unmap_sync passes more than one FMR to ib_unmap_fmr(), but
it didn't bother to remove the FMRs from that list once the call was
complete.

I've noticed some instability that could be related to list
tangling by the new fmr_op_unmap_sync() logic. In an abundance
of caution, add some defensive logic to clean up properly after
ib_unmap_fmr().

Fixes: 7c7a5390 ("xprtrdma: Add ro_unmap_sync method for FMR")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

06 Jul, 2016 · 2 commits

ipv6: Fix mem leak in rt6i_pcpu · 903ce4ab

Authored by Martin KaFai Lau on Jul 05, 2016

It was first reported and reproduced by Petr (thanks!) in
https://bugzilla.kernel.org/show_bug.cgi?id=119581

free_percpu(rt->rt6i_pcpu) used to always happen in ip6_dst_destroy().

However, after fixing a deadlock bug in
commit 9c7370a1 ("ipv6: Fix a potential deadlock when creating pcpu rt"),
free_percpu() is not called before setting non_pcpu_rt->rt6i_pcpu to NULL.

It is worth noting that rt6i_pcpu is protected by table->tb6_lock.

kmemleak somehow did not report it.  We nailed it down by
observing the pcpu entries in /proc/vmallocinfo (first suggested
by Hannes, thanks!).
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 9c7370a1 ("ipv6: Fix a potential deadlock when creating pcpu rt")
Reported-by: Petr Novopashenniy <pety@rusnet.ru>
Tested-by: Petr Novopashenniy <pety@rusnet.ru>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Petr Novopashenniy <pety@rusnet.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix decnet rtnexthop parsing · ab58298c

Authored by Vegard Nossum on Jul 05, 2016

dn_fib_count_nhs() could enter an infinite loop if nhp->rtnh_len == 0
(i.e. if userspace passes a malformed netlink message).

Let's use the helpers from net/nexthop.h which take care of all this
stuff. We can do exactly the same as e.g. fib_count_nexthops() and
fib_get_nhs() from net/ipv4/fib_semantics.c.

This fixes the softlockup for me.
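As an illustration only (this is not the patch), the failure mode can be sketched in userspace C: when walking length-prefixed records, a record length of zero never advances the cursor, so the length has to be validated before stepping. `struct rec_hdr` and `count_records()` are hypothetical names loosely modeled on `struct rtnexthop`; the alignment padding that the real helpers account for is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record header, loosely modeled on struct rtnexthop. */
struct rec_hdr {
	unsigned short len;	/* total record length, header included */
	unsigned char flags;
	unsigned char hops;
};

/*
 * Count the records packed back to back in buf.
 * Returns the count, or -1 if any record is malformed.
 */
int count_records(const unsigned char *buf, size_t remaining)
{
	int n = 0;

	assert(buf != NULL || remaining == 0);

	while (remaining > 0) {
		struct rec_hdr h;

		if (remaining < sizeof(h))
			return -1;	/* truncated header */
		memcpy(&h, buf, sizeof(h));

		/* len == 0 would leave buf and remaining unchanged forever. */
		if (h.len < sizeof(h) || h.len > remaining)
			return -1;

		buf += h.len;
		remaining -= h.len;
		n++;
	}
	return n;
}
```

An all-zero buffer is rejected on the first record instead of spinning, which is the property the commit relies on the shared helpers to provide.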

Cc: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

05 Jul, 2016 · 1 commit

RDS: fix rds_tcp_init() error path · 3dad5424

Authored by Vegard Nossum on Jul 03, 2016

If register_pernet_subsys() fails, we shouldn't try to call
unregister_pernet_subsys().
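The general rule can be sketched as follows; this is a userspace illustration with hypothetical `register_a()`/`register_b()` steps, not the RDS code. Each failure jumps to a label that undoes only the steps that already succeeded.

```c
#include <assert.h>

/* Hypothetical registration state, for illustration only. */
static int a_registered, b_registered;
static int unregister_a_calls;

static int register_a(int fail)
{
	if (fail)
		return -1;
	a_registered = 1;
	return 0;
}

static void unregister_a(void)
{
	a_registered = 0;
	unregister_a_calls++;
}

static int register_b(int fail)
{
	if (fail)
		return -1;
	b_registered = 1;
	return 0;
}

int subsys_init(int fail_a, int fail_b)
{
	int ret;

	assert(!a_registered && !b_registered);	/* double init is a bug */

	ret = register_a(fail_a);
	if (ret)
		goto out;		/* nothing registered: nothing to undo */

	ret = register_b(fail_b);
	if (ret)
		goto out_unregister_a;	/* undo only what succeeded */

	return 0;

out_unregister_a:
	unregister_a();
out:
	return ret;
}
```

When the first step fails, control skips `unregister_a()` entirely, which mirrors the rule stated in the commit message.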

Fixes: 467fa153 ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
Cc: stable@vger.kernel.org
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02 Jul, 2016 · 3 commits

tipc: fix nl compat regression for link statistics · 55e77a3e

Authored by Richard Alpe on Jul 01, 2016

Fix incorrect use of nla_strlcpy() where the first NLA_HDRLEN bytes
of the link name were left out.

Making the output of tipc-config -ls look something like:
Link statistics:
dcast-link
1:data0-1.1.2:data0
1:data0-1.1.3:data0

Also, for the record, the patch that introduced this regression
claims "Sending the whole object out can cause a leak". Which isn't
very likely as this is a compat layer, where the data we are parsing
is generated by us and we know the string to be NULL terminated. But
you can of course never be too secure.

Fixes: 5d2be142 (tipc: fix an infoleak in tipc_nl_compat_link_dump)
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: fix mirrored packets checksum · 82a31b92

Authored by WANG Cong on Jun 30, 2016

Similar to commit 9b368814 ("net: fix bridge multicast packet checksum validation")
we need to fixup the checksum for CHECKSUM_COMPLETE when
pushing skb on RX path. Otherwise we get similar splats.

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

packet: Use symmetric hash for PACKET_FANOUT_HASH. · eb70db87

Authored by David S. Miller on Jul 01, 2016

People who use PACKET_FANOUT_HASH want a symmetric hash, meaning that
they want packets going in both directions on a flow to hash to the
same bucket.

The core kernel SKB hash became non-symmetric when the ipv6 flow label
and other entities were incorporated into the standard flow hash order
to increase entropy.

But there are no users of PACKET_FANOUT_HASH who want an asymmetric
hash, they all want a symmetric one.

Therefore, use the flow dissector to compute a flat symmetric hash
over only the protocol, addresses and ports.  This hash does not get
installed into and override the normal skb hash, so this change has
no effect whatsoever on the rest of the stack.
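To make "symmetric" concrete, here is a small userspace sketch. It is not the kernel's flow dissector; `struct flow_key`, `mix()` and `symmetric_flow_hash()` are hypothetical, and the FNV-1a-style mixing is an arbitrary choice for illustration. The idea is to put the two endpoints in a canonical order before hashing, so both directions of a flow produce the same value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: protocol, addresses and ports only. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* One FNV-1a-style mixing step over a 32-bit word (illustrative choice). */
static uint32_t mix(uint32_t h, uint32_t v)
{
	return (h ^ v) * 16777619u;
}

uint32_t symmetric_flow_hash(const struct flow_key *k)
{
	uint32_t lo_addr, hi_addr;
	uint16_t lo_port, hi_port;
	uint32_t h = 2166136261u;

	assert(k != NULL);

	lo_addr = k->saddr;
	hi_addr = k->daddr;
	lo_port = k->sport;
	hi_port = k->dport;

	/* Canonical endpoint order, so A->B and B->A hash identically. */
	if (lo_addr > hi_addr || (lo_addr == hi_addr && lo_port > hi_port)) {
		lo_addr = k->daddr;
		hi_addr = k->saddr;
		lo_port = k->dport;
		hi_port = k->sport;
	}

	h = mix(h, k->proto);
	h = mix(h, lo_addr);
	h = mix(h, hi_addr);
	h = mix(h, ((uint32_t)lo_port << 16) | hi_port);
	return h;
}
```

Because the result is computed separately and returned to the caller, nothing like the packet's stored hash is modified, in the same spirit as the last paragraph of the commit message.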
Reported-by: Eric Leblond <eric@regit.org>
Tested-by: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

30 Jun, 2016 · 1 commit

ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output · fedbb6b4

Authored by Shmulik Ladkani on Jun 29, 2016

ip_skb_dst_mtu uses skb->sk, assuming it is an AF_INET socket (e.g. it
calls ip_sk_use_pmtu which casts sk as an inet_sk).

However, in the case of UDP tunneling, the skb->sk is not necessarily an
inet socket (could be AF_PACKET socket, or AF_UNSPEC if arriving from
tun/tap).

OTOH, the sk passed as an argument throughout IP stack's output path is
the one which is of PMTU interest:
 - In case of local sockets, sk is same as skb->sk;
 - In case of a udp tunnel, sk is the tunneling socket.

Fix, by passing ip_finish_output's sk to ip_skb_dst_mtu.
This augments 7026b1dd 'netfilter: Pass socket pointer down through okfn().'
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

29 Jun, 2016 · 10 commits

openvswitch: fix conntrack netlink event delivery · d913d3a7

Authored by Samuel Gauthier on Jun 28, 2016

Only the first and last netlink message for a particular conntrack are
actually sent. The first message is sent through nf_conntrack_confirm when
the conntrack is committed. The last one is sent when the conntrack is
destroyed on timeout. The other conntrack state change messages are not
advertised.

When the conntrack subsystem is used from netfilter, nf_conntrack_confirm
is called for each packet, from the postrouting hook, which in turn calls
nf_ct_deliver_cached_events to send the state change netlink messages.

This commit fixes the problem by calling nf_ct_deliver_cached_events in the
non-commit case as well.

Fixes: 7f8a436e ("openvswitch: Add conntrack action")
CC: Joe Stringer <joestringer@nicira.com>
CC: Justin Pettit <jpettit@nicira.com>
CC: Andy Zhou <azhou@nicira.com>
CC: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit() · b560f03d

Authored by David Barroso on Jun 28, 2016

neigh_xmit() expects to be called inside an RCU-bh read side critical
section, and while one of its two current callers gets this right, the
other one doesn't.

More specifically, neigh_xmit() has two callers, mpls_forward() and
mpls_output(), and while both callers call neigh_xmit() under
rcu_read_lock(), this provides sufficient protection for neigh_xmit()
only in the case of mpls_forward(), as that is always called from
softirq context and therefore doesn't need explicit BH protection,
while mpls_output() can be called from process context with softirqs
enabled.

When mpls_output() is called from process context, with softirqs
enabled, we can be preempted by a softirq at any time, and RCU-bh
considers the completion of a softirq as signaling the end of any
pending read-side critical sections, so if we do get a softirq
while we are in the part of neigh_xmit() that expects to be run inside
an RCU-bh read side critical section, we can end up with an unexpected
RCU grace period running right in the middle of that critical section,
making things go boom.

This patch fixes this impedance mismatch in the callee, by making
neigh_xmit() always take rcu_read_{,un}lock_bh() around the code that
expects to be treated as an RCU-bh read side critical section, as this
seems a safer option than fixing it in the callers.

Fixes: 4fd3d7d9 ("neigh: Add helper function neigh_xmit")
Signed-off-by: David Barroso <dbarroso@fastly.com>
Signed-off-by: Lennert Buytenhek <lbuytenhek@fastly.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cfg80211: fix proto in ieee80211_data_to_8023 for frames without LLC header · c041778c

Authored by Felix Fietkau on Jun 29, 2016

The PDU length of incoming LLC frames is set to the total skb payload size
in __ieee80211_data_to_8023() of net/wireless/util.c which incorrectly
includes the length of the IEEE 802.11 header.

The resulting LLC frame header has a too large PDU length, causing the
llc_fixup_skb() function of net/llc/llc_input.c to reject the incoming
skb, effectively breaking STP.

Solve the problem by properly subtracting the IEEE 802.11 frame header size
from the PDU length, allowing the LLC processor to pick up the incoming
control messages.

Special thanks to Gerry Rozema for tracking down the regression and proposing
a suitable patch.

Fixes: 2d1c304c ("cfg80211: add function for 802.3 conversion with separate output buffer")
Cc: stable@vger.kernel.org
Reported-by: Gerry Rozema <gerryr@rozeware.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>

net: bridge: fix vlan stats continue counter · 565ce8f3

Authored by Nikolay Aleksandrov on Jun 27, 2016

I made a dumb off-by-one mistake when I added the vlan stats counter
dumping code. The increment should happen before the check, not after
otherwise we miss one entry when we continue dumping.

Fixes: a60c0903 ("bridge: netlink: export per-vlan stats")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: do not send too big packets at retransmit time · a3d2e9f8

Authored by Eric Dumazet on Jun 27, 2016

Arjun reported a bug in TCP stack and bisected it to a recent commit.

In case where we process SACK, we can coalesce multiple skbs
into fat ones (tcp_shift_skb_data()), to lower write queue
overhead, because we do not expect to retransmit these packets.

However, SACK reneging can happen, forcing the sender to retransmit
all these packets. If skb->len is above 64KB, we then send buggy
IP packets that could hang TSO engine on cxgb4.

Neal suggested to use tcp_tso_autosize() instead of tp->gso_segs
so that we cook packets of optimal size vs TCP/pacing.

Thanks to Arjun for reporting the bug and running the tests !

Fixes: 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Arjun V <arjun@chelsio.com>
Tested-by: Arjun V <arjun@chelsio.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Clean up untagged vlan when destroying via rtnl-link · 420cb1b7

Authored by Sven Eckelmann on Jun 26, 2016

The untagged vlan object is only destroyed when the interface is removed
via the legacy sysfs interface. But it also has to be destroyed when the
standard rtnl-link interface is used.

Fixes: 5d2c05b2 ("batman-adv: add per VLAN interface attribute framework")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Antonio Quartulli <a@unstable.cc>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Fix ICMP RR ethernet access after skb_linearize · 3b55e442

Authored by Sven Eckelmann on Jun 26, 2016

The skb_linearize may reallocate the skb. This makes the calculated pointer
for ethhdr invalid. But the pointer is used later to fill in the RR
field of the batadv_icmp_packet_rr packet.

Instead re-evaluate eth_hdr after the skb_linearize+skb_cow to fix the
pointer and avoid the invalid read.
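The same hazard exists with any buffer that can move when it grows. The sketch below is a userspace analogy using `realloc()`, with hypothetical `struct buf`, `buf_grow()`, `buf_hdr()` and `fill_after_grow()` names; it is not batman-adv code. The header pointer is derived from the buffer only after the call that may relocate the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buf {
	unsigned char *data;
	size_t len;
};

/* Grow the buffer; like a linearize step, this may move the data. */
int buf_grow(struct buf *b, size_t new_len)
{
	unsigned char *p;

	assert(b != NULL);
	if (new_len < b->len)
		return -1;

	p = realloc(b->data, new_len);
	if (!p)
		return -1;
	memset(p + b->len, 0, new_len - b->len);
	b->data = p;
	b->len = new_len;
	return 0;
}

/* Always derive header pointers from the current b->data. */
unsigned char *buf_hdr(const struct buf *b)
{
	return b->data;
}

int fill_after_grow(struct buf *b, size_t new_len, unsigned char tag)
{
	if (buf_grow(b, new_len))
		return -1;

	/* Re-evaluate the header pointer only after the possible move. */
	buf_hdr(b)[0] = tag;
	return 0;
}
```

Caching `buf_hdr(b)` in a local variable before `buf_grow()` and writing through it afterwards would be the analogue of the stale ethhdr pointer described above.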

Fixes: da6b8c20 ("batman-adv: generalize batman-adv icmp packet handling")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Fix double-put of vlan object · baceced9

Authored by Ben Hutchings on Jun 26, 2016

Each batadv_tt_local_entry holds a single reference to a
batadv_softif_vlan.  In case a new entry cannot be added to the hash
table, the error path puts the reference, but the reference will also
now be dropped by batadv_tt_local_entry_release().

Fixes: a33d970d ("batman-adv: Fix reference counting of vlan object for tt_local_entry")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: Fix use-after-free/double-free of tt_req_node · 9c4604a2

Authored by Sven Eckelmann on Jun 26, 2016

The tt_req_node is added and removed from a list inside a spinlock. But the
locking is sometimes removed even when the object is still referenced and
will be used later via this reference. For example batadv_send_tt_request
can create a new tt_req_node (including add to a list) and later
re-acquires the lock to remove it from the list and to free it. But at this
time another context could have already removed this tt_req_node from the
list and freed it.

CPU#0

    batadv_batman_skb_recv from net_device 0
    -> batadv_iv_ogm_receive
      -> batadv_iv_ogm_process
        -> batadv_iv_ogm_process_per_outif
          -> batadv_tvlv_ogm_receive
            -> batadv_tvlv_ogm_receive
              -> batadv_tvlv_containers_process
                -> batadv_tvlv_call_handler
                  -> batadv_tt_tvlv_ogm_handler_v1
                    -> batadv_tt_update_orig
                      -> batadv_send_tt_request
                        -> batadv_tt_req_node_new
                           spin_lock(...)
                           allocates new tt_req_node and adds it to list
                           spin_unlock(...)
                           return tt_req_node

CPU#1

    batadv_batman_skb_recv from net_device 1
    -> batadv_recv_unicast_tvlv
      -> batadv_tvlv_containers_process
        -> batadv_tvlv_call_handler
          -> batadv_tt_tvlv_unicast_handler_v1
            -> batadv_handle_tt_response
               spin_lock(...)
               tt_req_node gets removed from list and is freed
               spin_unlock(...)

CPU#0

                      <- returned to batadv_send_tt_request
                         spin_lock(...)
                         tt_req_node gets removed from list and is freed
                         MEMORY CORRUPTION/SEGFAULT/...
                         spin_unlock(...)

This can only be solved via reference counting to allow multiple contexts
to handle the list manipulation while making sure that only the last
context holding a reference will free the object.
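A simplified userspace sketch of that idea follows. It assumes callers hold the list lock, so a plain `int` stands in for a real atomic reference count, and all names (`struct req_node`, `req_node_new()`, `req_node_put()`, `req_node_unlink()`, `nodes_freed`) are hypothetical rather than the batman-adv API. The list owns one reference and the creator owns another; whichever context unlinks the node drops the list's reference, and the memory is freed only on the last put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct req_node {
	int refcount;	/* plain int: callers are assumed to hold the list lock */
	bool on_list;
};

static int nodes_freed;	/* instrumentation for the example only */

struct req_node *req_node_new(void)
{
	struct req_node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 2;	/* one for the list, one for the caller */
	n->on_list = true;
	return n;
}

void req_node_put(struct req_node *n)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0) {
		free(n);
		nodes_freed++;
	}
}

/* Unlink at most once; only the context that unlinks drops the list ref. */
void req_node_unlink(struct req_node *n)
{
	if (n->on_list) {
		n->on_list = false;
		req_node_put(n);
	}
}
```

If the response handler unlinks the node first, the sender's later `req_node_unlink()` is a no-op and its `req_node_put()` performs the single free, so the double free in the trace above cannot occur in this model.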

Fixes: a73105b8 ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net>
Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

batman-adv: replace WARN with rate limited output on non-existing VLAN · 0b3dd7df

Authored by Simon Wunderlich on Jun 26, 2016

If a VLAN tagged frame is received and the corresponding VLAN is not
configured on the soft interface, it will splat a WARN on every packet
received. This is a quite annoying behaviour for some scenarios, e.g. if
bat0 is bridged with eth0, and there are arbitrary VLAN tagged frames
from Ethernet coming in without having any VLAN configuration on bat0.

The code should probably create vlan objects on the fly and
transparently transport these VLAN-tagged Ethernet frames, but until
this is done, at least the WARN splat should be replaced by a rate
limited output.

Fixes: 354136bc ("batman-adv: fix kernel crash due to missing NULL checks")
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

28 Jun, 2016 · 3 commits

Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address · 0888d5f3

Authored by daniel on Jun 24, 2016

The bridge is falsely dropping ipv6 multicast packets if there is:
 1. No ipv6 address assigned on the bridge.
 2. No external mld querier present.
 3. The internal querier enabled.

When the bridge fails to build mld queries, because it has no
ipv6 address, it silently returns, but keeps the local querier enabled.
This specific case causes confusing packet loss.

Ipv6 multicast snooping can only work if:
 a) An external querier is present
 OR
 b) The bridge has an ipv6 address and is capable of sending own queries

Otherwise it has to forward/flood the ipv6 multicast traffic,
because snooping cannot work.

This patch fixes the issue by adding a flag to the bridge struct that
indicates that there is currently no ipv6 address assigned to the bridge
and returns a false state for the local querier in
__br_multicast_querier_exists().

Special thanks to Linus Lüssing.

Fixes: d1d81d4c ("bridge: check return value of ipv6_dev_get_saddr()")
Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>

mac80211: Fix mesh estab_plinks counting in STA removal case · 126e7557

Authored by Jouni Malinen on Jun 19, 2016

If a user space program (e.g., wpa_supplicant) deletes a STA entry that
is currently in NL80211_PLINK_ESTAB state, the number of established
plinks counter was not decremented and this could result in rejecting
new plink establishment before really hitting the real maximum plink
limit. For !user_mpm case, this decrementation is handled by
mesh_plink_deactive().

Fix this by decrementing estab_plinks on STA deletion
(mesh_sta_cleanup() gets called from there) so that the counter has a
correct value and the Beacon frame advertisement in Mesh Configuration
element shows the proper value for capability to accept additional
peers.

Cc: stable@vger.kernel.org
Signed-off-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>

ipmr/ip6mr: Initialize the last assert time of mfc entries. · 70a0dec4

Authored by Tom Goff on Jun 23, 2016

This fixes wrong-interface signaling on 32-bit platforms for entries
created when jiffies > 2^31 + MFC_ASSERT_THRESH.
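The arithmetic behind this can be reproduced in userspace with 32-bit tick counters. The sketch below is an assumption-laden illustration rather than the patch: `ASSERT_THRESH` is a made-up value, `time_after32()` is a hypothetical helper that applies the usual wrap-safe signed-difference comparison, and `initial_last_assert()` shows one possible way to seed a new entry so that it may signal immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ASSERT_THRESH 3000u	/* hypothetical threshold, in ticks */

/* Wrap-safe "a is after b" on 32-bit tick counters. */
static bool time_after32(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* May a wrong-interface event be signaled at time now? */
bool may_signal(uint32_t now, uint32_t last_assert)
{
	return time_after32(now, last_assert + ASSERT_THRESH);
}

/* Seed for a new entry so that the first event is not suppressed. */
uint32_t initial_last_assert(uint32_t now)
{
	uint32_t seed = now - ASSERT_THRESH - 1;

	assert(may_signal(now, seed));
	return seed;
}
```

With `last_assert` left at zero and `now` beyond 2^31 + `ASSERT_THRESH`, the signed difference turns positive and `may_signal()` returns false, which corresponds to the suppressed signaling described above.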
Signed-off-by: Tom Goff <thomas.goff@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>

27 Jun, 2016 · 2 commits

vsock: make listener child lock ordering explicit · 4192f672

Authored by Stefan Hajnoczi on Jun 23, 2016

There are several places where the listener and pending or accept queue
child sockets are accessed at the same time.  Lockdep is unhappy that
two locks from the same class are held.

Tell lockdep that it is safe and document the lock ordering.

Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar
patch asking whether this is safe.  I have audited the code and also
covered the vsock_pending_work() function.
Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: enforce egress device match in per table nexthop lookups · 48f1dcb5

Authored by Paolo Abeni on Jun 23, 2016

With the commit 8c14586f ("net: ipv6: Use passed in table for
nexthop lookups"), next hop lookup is first performed on route creation
in the passed-in table.
However device match is not enforced in table lookup, so the found
route can be later discarded due to egress device mismatch and no
global lookup will be performed.
This causes the following to fail:

ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dummy1 up
ip link set dummy2 up
ip route add 2001:db8:8086::/48 dev dummy1 metric 20
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy1 metric 20
ip route add 2001:db8:8086::/48 dev dummy2 metric 21
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy2 metric 21
RTNETLINK answers: No route to host

This change fixes the issue by enforcing device lookup in
ip6_nh_lookup_table()

v1->v2: updated commit message title

Fixes: 8c14586f ("net: ipv6: Use passed in table for nexthop lookups")
Reported-and-tested-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Jun, 2016 · 3 commits

netem: fix a use after free · 21de12ee

Authored by Eric Dumazet on Jun 20, 2016

If the packet was dropped by lower qdisc, then we must not
access it later.

Save qdisc_pkt_len(skb) in a temp variable.
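The pattern can be sketched in userspace as follows; `struct pkt`, `struct queue`, `child_enqueue()` and `parent_forward()` are hypothetical stand-ins, not the netem code. The length is copied into a local before the packet is handed to a callee that may free it, and only the copy is used afterwards.

```c
#include <assert.h>
#include <stdlib.h>

struct pkt {
	unsigned int len;
};

struct queue {
	struct pkt *slot;	/* capacity of one packet, to keep the sketch short */
	unsigned int backlog;	/* bytes accounted to this queue */
};

/* Takes ownership of p; frees it when the queue is full. */
static int child_enqueue(struct queue *child, struct pkt *p)
{
	if (child->slot) {
		free(p);
		return -1;
	}
	child->slot = p;
	return 0;
}

/* Hand p to the child; on a drop, remove its bytes from the parent backlog. */
int parent_forward(struct queue *parent, struct queue *child, struct pkt *p)
{
	unsigned int len;
	int err;

	assert(p != NULL);
	len = p->len;	/* save before ownership moves */

	err = child_enqueue(child, p);
	if (err) {
		/* p is gone; reading p->len here would be a use after free. */
		parent->backlog -= len;
	}
	return err;
}
```

The second packet offered to a full child is freed by the child, and the parent's accounting is still corrected from the saved length.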

Fixes: 2ccccf5f ("net_sched: update hierarchical backlog too")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

act_ife: acquire ife_mod_lock before reading ifeoplist · 817e9f2c

Authored by WANG Cong on Jun 20, 2016

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

act_ife: only acquire tcf_lock for existing actions · 067a7cd0

Authored by WANG Cong on Jun 20, 2016

Alexey reported that we have GFP_KERNEL allocation when
holding the spinlock tcf_lock. Actually we don't have
to take that spinlock for all the cases, especially
for the new one we just create. To modify the existing
actions, we still need this spinlock to make sure
the whole update is atomic.

For net-next, we can get rid of this spinlock because
we already hold the RTNL lock on slow path, and on fast
path we can use RCU to protect the metalist.

Joint work with Jamal.
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23 Jun, 2016 · 3 commits

esp: Fix ESN generation under UDP encapsulation · 962fcef3

Authored by Herbert Xu on Jun 18, 2016

Blair Steven noticed that ESN in conjunction with UDP encapsulation
is broken because we set the temporary ESP header to the wrong spot.

This patch fixes this by first of all using the right spot, i.e.,
4 bytes off the real ESP header, and then saving this information
so that after encryption we can restore it properly.

Fixes: 7021b2e1 ("esp4: Switch to new AEAD interface")
Reported-by: Blair Steven <Blair.Steven@alliedtelesis.co.nz>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: unclone unbundled buffers before forwarding · 27777daa

Authored by Jon Paul Maloy on Jun 20, 2016

When extracting an individual message from a received "bundle" buffer,
we just create a clone of the base buffer, and adjust it to point into
the right position of the linearized data area of the latter. This works
well for regular message reception, but during periods of extremely high
load it may happen that an extracted buffer, e.g, a connection probe, is
reversed and forwarded through an external interface while the preceding
extracted message is still unhandled. When this happens, the header or
data area of the preceding message will be partially overwritten by a
MAC header, leading to unpredictable consequences, such as a link
reset.

We now fix this by ensuring that the msg_reverse() function never
returns a cloned buffer, and that the returned buffer always contains
sufficient valid head and tail room to be forwarded.
Reported-by: Erik Hugne <erik.hugne@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

kcm: fix /proc memory leak · d19af0a7

Authored by Jiri Slaby on Jun 20, 2016

Every open of /proc/net/kcm leaks 16 bytes of memory as is reported by
kmemleak:
unreferenced object 0xffff88059c0e3458 (size 192):
  comm "cat", pid 1401, jiffies 4294935742 (age 310.720s)
  hex dump (first 32 bytes):
    28 45 71 96 05 88 ff ff 00 10 00 00 00 00 00 00  (Eq.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8156a2de>] kmem_cache_alloc_trace+0x16e/0x230
    [<ffffffff8162a479>] seq_open+0x79/0x1d0
    [<ffffffffa0578510>] kcm_seq_open+0x0/0x30 [kcm]
    [<ffffffff8162a479>] seq_open+0x79/0x1d0
    [<ffffffff8162a8cf>] __seq_open_private+0x2f/0xa0
    [<ffffffff81712548>] seq_open_net+0x38/0xa0
...

It is caused by a missing free in the ->release path. So fix it by
providing seq_release_net as the ->release method.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: cd6e111b (kcm: Add statistics and proc interfaces)
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Tom Herbert <tom@herbertland.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

19 Jun, 2016 · 2 commits

net: rds: fix coding style issues · 5c3da57d

Authored by Joshua Houghton on Jun 18, 2016

Fix coding style issues in the following files:

ib_cm.c:      add space
loop.c:       convert spaces to tabs
sysctl.c:     add space
tcp.h:        convert spaces to tabs
tcp_connect.c:remove extra indentation in switch statement
tcp_recv.c:   convert spaces to tabs
tcp_send.c:   convert spaces to tabs
transport.c:  move brace up one line on for statement
Signed-off-by: Joshua Houghton <josh@awful.name>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

AX.25: Close socket connection on session completion · 4a7d99ea

Authored by Basil Gunn on Jun 16, 2016

A socket connection made in ax.25 is not closed when session is
completed.  The heartbeat timer is stopped prematurely and this is
where the socket gets closed. Allow heartbeat timer to run to close
socket. Symptom occurs in kernels >= 4.2.0

Originally sent 6/15/2016. Resend with distribution list matching
scripts/maintainer.pl output.
Signed-off-by: Basil Gunn <basil@pacabunga.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18 Jun, 2016 · 1 commit

RDS: TCP: rds_tcp_accept_one() should transition socket from RESETTING to UP · 3bb549ae

Authored by Sowmini Varadhan on Jun 17, 2016

The state of the rds_connection after rds_tcp_reset_callbacks() would
be RDS_CONN_RESETTING and this is the value that should be passed
by rds_tcp_accept_one()  to rds_connect_path_complete() to transition
the socket to RDS_CONN_UP.

Fixes: b5c21c0947c1 ("RDS: TCP: fix race windows in send-path quiescence
by rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
