提交 · a9efad8b24bd22616f6c749a6c029957dc76542b · openanolis / cloud-kernel

25 5月, 2016 3 次提交

net_sched: avoid too many hrtimer_start() calls · a9efad8b

由 Eric Dumazet 提交于 5月 23, 2016

I found a serious performance bug in packet schedulers using hrtimers.

sch_htb and sch_fq are definitely impacted by this problem.

We constantly rearm high resolution timers if some packets are throttled
in one (or more) class, and other packets are flying through qdisc on
another (non throttled) class.

hrtimer_start() does not have the mod_timer() trick of doing nothing if
expires value does not change :

	if (timer_pending(timer) &&
            timer->expires == expires)
                return 1;

This issue is particularly visible when multiple cpus can queue/dequeue
packets on the same qdisc, as hrtimer code has to lock a remote base.

I used following fix :

1) Change htb to use qdisc_watchdog_schedule_ns() instead of open-coding
it.

2) Cache watchdog prior expiration. hrtimer might provide this, but I
prefer to not rely on some hrtimer internal.
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

a9efad8b

ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit path. · 252f3f5a

由 Haishuang Yan 提交于 5月 21, 2016

In gre6 xmit path, we are sending a GRE packet, so set fl6 proto
to IPPROTO_GRE properly.
Signed-off-by: NHaishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

252f3f5a

ip6_gre: Fix MTU setting for ip6gretap · 1b227e53

由 Haishuang Yan 提交于 5月 21, 2016

When creat an ip6gretap interface with an unreachable route,
the MTU is about 14 bytes larger than what was needed.

If the remote address is reachable:
ping6 2001:0:130::1 -c 2
PING 2001:0:130::1(2001:0:130::1) 56 data bytes
64 bytes from 2001:0:130::1: icmp_seq=1 ttl=64 time=1.46 ms
64 bytes from 2001:0:130::1: icmp_seq=2 ttl=64 time=81.1 ms
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1b227e53

24 5月, 2016 2 次提交

ipv4: Fix non-initialized TTL when CONFIG_SYSCTL=n · 049bbf58

由 Ezequiel Garcia 提交于 5月 20, 2016

Commit fa50d974 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
moves the default TTL assignment, and as side-effect IPv4 TTL now
has a default value only if sysctl support is enabled (CONFIG_SYSCTL=y).

The sysctl_ip_default_ttl is fundamental for IP to work properly,
as it provides the TTL to be used as default. The defautl TTL may be
used in ip_selected_ttl, through the following flow:

  ip_select_ttl
    ip4_dst_hoplimit
      net->ipv4.sysctl_ip_default_ttl

This commit fixes the issue by assigning net->ipv4.sysctl_ip_default_ttl
in net_init_net, called during ipv4's initialization.

Without this commit, a kernel built without sysctl support will send
all IP packets with zero TTL (unless a TTL is explicitly set, e.g.
with setsockopt).

Given a similar issue might appear on the other knobs that were
namespaceify, this commit also moves them.

Fixes: fa50d974 ("ipv4: Namespaceify ip_default_ttl sysctl knob")
Signed-off-by: NEzequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

049bbf58

net/atm: sk_err_soft must be positive · c685293a

由 Stefan Hajnoczi 提交于 5月 18, 2016

The sk_err and sk_err_soft fields are positive errno values and
userspace applications rely on this when using getsockopt(SO_ERROR).

ATM code places an -errno into sk_err_soft in sigd_send() and returns it
from svc_addparty()/svc_dropparty().

Although I am not familiar with ATM code I came to this conclusion
because:

1. sigd_send() msg->type cases as_okay and as_error both have:

   sk->sk_err = -msg->reply;

   while the as_addparty and as_dropparty cases have:

   sk->sk_err_soft = msg->reply;

   This is the source of the inconsistency.

2. svc_addparty() returns an -errno and assumes sk_err_soft is also an
   -errno:

       if (flags & O_NONBLOCK) {
           error = -EINPROGRESS;
           goto out;
       }
       ...
       error = xchg(&sk->sk_err_soft, 0);
   out:
       release_sock(sk);
       return error;

   This shows that sk_err_soft is indeed being treated as an -errno.

This patch ensures that sk_err_soft is always a positive errno.
Signed-off-by: NStefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c685293a

21 5月, 2016 20 次提交

udp: prevent skbs lingering in tunnel socket queues · e5aed006

由 Hannes Frederic Sowa 提交于 5月 19, 2016

In case we find a socket with encapsulation enabled we should call
the encap_recv function even if just a udp header without payload is
available. The callbacks are responsible for correctly verifying and
dropping the packets.

Also, in case the header validation fails for geneve and vxlan we
shouldn't put the skb back into the socket queue, no one will pick
them up there.  Instead we can simply discard them in the respective
encap_recv functions.
Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

e5aed006

ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUE · 6a553681

由 Alexander Duyck 提交于 5月 18, 2016

This patch addresses the same issue we had for IPv4 where enabling GRE with
an inner checksum cannot be supported with FOU/GUE due to the fact that
they will jump past the GRE header at it is treated like a tunnel header.
Signed-off-by: NAlexander Duyck <aduyck@mirantis.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6a553681

RDS: TCP: Avoid rds connection churn from rogue SYNs · c948bb5c

由 Sowmini Varadhan 提交于 5月 18, 2016

When a rogue SYN is received after the connection arbitration
algorithm has converged, the incoming SYN should not needlessly
quiesce the transmit path, and it should not result in needless
TCP connection resets due to re-execution of the connection
arbitration logic.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c948bb5c

RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp · 37e14f4f

由 Sowmini Varadhan 提交于 5月 18, 2016

There are two instances where we want to terminate RDS-TCP: when
exiting the netns or during module unload. In either case, the
termination sequence is to stop the listen socket, mark the
rtn->rds_tcp_listen_sock as null, and flush any accept workqs.
Thus any workqs that get flushed at this point will encounter a
null rds_tcp_listen_sock, and must exit gracefully to allow
the RDS-TCP termination to complete successfully.
Signed-off-by: NSowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

37e14f4f

ipv6: Don't reset inner headers in ip6_tnl_xmit · 3ee93eaf

由 Tom Herbert 提交于 5月 18, 2016

Since iptunnel_handle_offloads() is called in all paths we can
probably drop the block in ip6_tnl_xmit that was checking for
skb->encapsulation and resetting the inner headers.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3ee93eaf

ip4ip6: Support for GSO/GRO · b8921ca8

由 Tom Herbert 提交于 5月 18, 2016

Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b8921ca8

ip6ip6: Support for GSO/GRO · 815d22e5

由 Tom Herbert 提交于 5月 18, 2016

Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

815d22e5

ipv6: Set features for IPv6 tunnels · 51c052d4

由 Tom Herbert 提交于 5月 18, 2016

Need to set dev features, use same values that are used in GREv6.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

51c052d4

ip6_tunnel: Add support for fou/gue encapsulation · b3a27b51

由 Tom Herbert 提交于 5月 18, 2016

Add netlink and setup for encapsulation
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b3a27b51

ip6_gre: Add support for fou/gue encapsulation · 1faf3d9f

由 Tom Herbert 提交于 5月 18, 2016

Add netlink and setup for encapsulation
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1faf3d9f

fou: Add encap ops for IPv6 tunnels · aa3463d6

由 Tom Herbert 提交于 5月 18, 2016

This patch add a new fou6 module that provides encapsulation
operations for IPv6.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

aa3463d6

ip6_tun: Add infrastructure for doing encapsulation · 058214a4

由 Tom Herbert 提交于 5月 18, 2016

Add encap_hlen and ip_tunnel_encap structure to ip6_tnl. Add functions
for getting encap hlen, setting up encap on a tunnel, performing
encapsulation operation.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

058214a4

fou: Support IPv6 in fou · 5f914b68

由 Tom Herbert 提交于 5月 18, 2016

This patch adds receive path support for IPv6 with fou.

- Add address family to fou structure for open sockets. This supports
  AF_INET and AF_INET6. Lookups for fou ports are performed on both the
  port number and family.
- In fou and gue receive adjust tot_len in IPv4 header or payload_len
  based on address family.
- Allow AF_INET6 in FOU_ATTR_AF netlink attribute.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5f914b68

fou: Split out {fou,gue}_build_header · dc969b81

由 Tom Herbert 提交于 5月 18, 2016

Create __fou_build_header and __gue_build_header. These implement the
protocol generic parts of building the fou and gue header.
fou_build_header and gue_build_header implement the IPv4 specific
functions and call the __*_build_header functions.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dc969b81

fou: Call setup_udp_tunnel_sock · 440924bb

由 Tom Herbert 提交于 5月 18, 2016

Use helper function to set up UDP tunnel related information for a fou
socket.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

440924bb

net: Cleanup encap items in ip_tunnels.h · 55c2bc14

由 Tom Herbert 提交于 5月 18, 2016

Consolidate all the ip_tunnel_encap definitions in one spot in the
header file. Also, move ip_encap_hlen and ip_tunnel_encap from
ip_tunnel.c to ip_tunnels.h so they call be called without a dependency
on ip_tunnel module. Similarly, move iptun_encaps to ip_tunnel_core.c.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

55c2bc14

ipv6: Change "final" protocol processing for encapsulation · 1da44f9c

由 Tom Herbert 提交于 5月 18, 2016

When performing foo-over-UDP, UDP packets are processed by the
encapsulation handler which returns another protocol to process.
This may result in processing two (or more) protocols in the
loop that are marked as INET6_PROTO_FINAL. The actions taken
for hitting a final protocol, in particular the skb_postpull_rcsum
can only be performed once.

This patch set adds a check of a final protocol has been seen. The
rules are:
  - If the final protocol has not been seen any protocol is processed
    (final and non-final). In the case of a final protocol, the final
    actions are taken (like the skb_postpull_rcsum)
  - If a final protocol has been seen (e.g. an encapsulating UDP
    header) then no further non-final protocols are allowed
    (e.g. extension headers). For more final protocols the
    final actions are not taken (e.g. skb_postpull_rcsum).
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

1da44f9c

ipv6: Fix nexthdr for reinjection · 4c64242a

由 Tom Herbert 提交于 5月 18, 2016

In ip6_input_finish the nexthdr protocol is retrieved from the
next header offset that is returned in the cb of the skb.
This method does not work for UDP encapsulation that may not
even have a concept of a nexthdr field (e.g. FOU).

This patch checks for a final protocol (INET6_PROTO_FINAL) when a
protocol handler returns > 0. If the protocol is not final then
resubmission is performed on nhoff value. If the protocol is final
then the nexthdr is taken to be the return value.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

4c64242a

net: define gso types for IPx over IPv4 and IPv6 · 7e13318d

由 Tom Herbert 提交于 5月 18, 2016

This patch defines two new GSO definitions SKB_GSO_IPXIP4 and
SKB_GSO_IPXIP6 along with corresponding NETIF_F_GSO_IPXIP4 and
NETIF_F_GSO_IPXIP6. These are used to described IP in IP
tunnel and what the outer protocol is. The inner protocol
can be deduced from other GSO types (e.g. SKB_GSO_TCPV4 and
SKB_GSO_TCPV6). The GSO types of SKB_GSO_IPIP and SKB_GSO_SIT
are removed (these are both instances of SKB_GSO_IPXIP4).
SKB_GSO_IPXIP6 will be used when support for GSO with IP
encapsulation over IPv6 is added.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Acked-by: NJeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

7e13318d

gso: Remove arbitrary checks for unsupported GSO · 5c7cdf33

由 Tom Herbert 提交于 5月 18, 2016

In several gso_segment functions there are checks of gso_type against
a seemingly arbitrary list of SKB_GSO_* flags. This seems like an
attempt to identify unsupported GSO types, but since the stack is
the one that set these GSO types in the first place this seems
unnecessary to do. If a combination isn't valid in the first
place that stack should not allow setting it.

This is a code simplication especially for add new GSO types.
Signed-off-by: NTom Herbert <tom@herbertland.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

5c7cdf33

20 5月, 2016 5 次提交

mm/page_ref: use page_ref helper instead of direct modification of _count · 6d061f9f

由 Joonsoo Kim 提交于 5月 19, 2016

page_reference manipulation functions are introduced to track down
reference count change of the page.  Use it instead of direct
modification of _count.
Signed-off-by: NJoonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

6d061f9f

fs: poll/select/recvmmsg: use timespec64 for timeout events · 766b9f92

由 Deepa Dinamani 提交于 5月 19, 2016

struct timespec is not y2038 safe. Even though timespec might be
sufficient to represent timeouts, use struct timespec64 here as the plan
is to get rid of all timespec reference in the kernel.

The patch transitions the common functions: poll_select_set_timeout()
and select_estimate_accuracy() to use timespec64. And, all the syscalls
that use these functions are transitioned in the same patch.

The restart block parameters for poll uses monotonic time. Use
timespec64 here as well to assign timeout value. This parameter in the
restart block need not change because this only holds the monotonic
timestamp at which timeout should occur. And, unsigned long data type
should be big enough for this timestamp.

The system call interfaces will be handled in a separate series.

Compat interfaces need not change as timespec64 is an alias to struct
timespec on a 64 bit system.

Link: http://lkml.kernel.org/r/1461947989-21926-3-git-send-email-deepa.kernel@gmail.comSigned-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
Acked-by: NJohn Stultz <john.stultz@linaro.org>
Acked-by: NDavid S. Miller <davem@davemloft.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

766b9f92

tipc: block BH in TCP callbacks · b91083a4

由 Eric Dumazet 提交于 5月 17, 2016

TCP stack can now run from process context.

Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous
assumption.

Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b91083a4

rds: tcp: block BH in TCP callbacks · 38036629

由 Eric Dumazet 提交于 5月 17, 2016

TCP stack can now run from process context.

Use read_lock_bh(&sk->sk_callback_lock) variant to restore previous
assumption.

Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
Fixes: d41a69f1 ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: NEric Dumazet <edumazet@google.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

38036629

kcm: fix a signedness in kcm_splice_read() · f1971a2e

由 WANG Cong 提交于 5月 17, 2016

skb_splice_bits() returns int, kcm_splice_read() returns ssize_t,
both are signed.

We may need another patch to make them all ssize_t, but that
deserves a separated patch.

Fixes: 91687355 ("kcm: Splice support")
Reported-by: NDavid Binderman <linuxdev.baldrick@gmail.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f1971a2e

18 5月, 2016 10 次提交

batman-adv: initialize ELP orig address on secondary interfaces · ebe24cea

由 Marek Lindner 提交于 5月 07, 2016

This fix prevents nodes to wrongly create a 00:00:00:00:00:00 originator
which can potentially interfere with the rest of the neighbor statistics.

Fixes: d6f94d91 ("batman-adv: ELP - adding basic infrastructure")
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

ebe24cea

batman-adv: Avoid duplicate neigh_node additions · e123705e

由 Linus Lüssing 提交于 1月 07, 2016

Two parallel calls to batadv_neigh_node_new() might race for creating
and adding the same neig_node. Fix this by including the check for any
already existing, identical neigh_node within the spin-lock.

This fixes splats like the following:

[  739.535069] ------------[ cut here ]------------
[  739.535079] WARNING: CPU: 0 PID: 0 at /usr/src/batman-adv/git/batman-adv/net/batman-adv/bat_iv_ogm.c:1004 batadv_iv_ogm_process_per_outif+0xe3f/0xe60 [batman_adv]()
[  739.535092] too many matching neigh_nodes
[  739.535094] Modules linked in: dm_mod tun ip6table_filter ip6table_mangle ip6table_nat nf_nat_ipv6 ip6_tables xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_TCPMSS xt_mark iptable_mangle xt_tcpudp xt_conntrack iptable_filter ip_tables x_tables ip_gre ip_tunnel gre bridge stp llc thermal_sys kvm_intel kvm crct10dif_pclmul crc32_pclmul sha256_ssse3 sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd evdev pcspkr ip6_gre ip6_tunnel tunnel6 batman_adv(O) libcrc32c nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack autofs4 ext4 crc16 mbcache jbd2 xen_netfront xen_blkfront crc32c_intel
[  739.535177] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W  O    4.2.0-0.bpo.1-amd64 #1 Debian 4.2.6-3~bpo8+2
[  739.535186]  0000000000000000 ffffffffa013b050 ffffffff81554521 ffff88007d003c18
[  739.535201]  ffffffff8106fa01 0000000000000000 ffff8800047a087a ffff880079c3a000
[  739.735602]  ffff88007b82bf40 ffff88007bc2d1c0 ffffffff8106fa7a ffffffffa013aa8e
[  739.735624] Call Trace:
[  739.735639]  <IRQ>  [<ffffffff81554521>] ? dump_stack+0x40/0x50
[  739.735677]  [<ffffffff8106fa01>] ? warn_slowpath_common+0x81/0xb0
[  739.735692]  [<ffffffff8106fa7a>] ? warn_slowpath_fmt+0x4a/0x50
[  739.735715]  [<ffffffffa012448f>] ? batadv_iv_ogm_process_per_outif+0xe3f/0xe60 [batman_adv]
[  739.735740]  [<ffffffffa0124813>] ? batadv_iv_ogm_receive+0x363/0x380 [batman_adv]
[  739.735762]  [<ffffffffa0124813>] ? batadv_iv_ogm_receive+0x363/0x380 [batman_adv]
[  739.735783]  [<ffffffff810b0841>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[  739.735804]  [<ffffffffa012cb39>] ? batadv_batman_skb_recv+0xc9/0x110 [batman_adv]
[  739.735825]  [<ffffffff81464891>] ? __netif_receive_skb_core+0x841/0x9a0
[  739.735838]  [<ffffffff810b0841>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[  739.735853]  [<ffffffff81465681>] ? process_backlog+0xa1/0x140
[  739.735864]  [<ffffffff81464f1a>] ? net_rx_action+0x20a/0x320
[  739.735878]  [<ffffffff81073aa7>] ? __do_softirq+0x107/0x270
[  739.735891]  [<ffffffff81073d82>] ? irq_exit+0x92/0xa0
[  739.735905]  [<ffffffff8137e0d1>] ? xen_evtchn_do_upcall+0x31/0x40
[  739.735924]  [<ffffffff8155b8fe>] ? xen_do_hypervisor_callback+0x1e/0x40
[  739.735939]  <EOI>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[  739.735965]  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[  739.735979]  [<ffffffff8100a39c>] ? xen_safe_halt+0xc/0x20
[  739.735991]  [<ffffffff8101da6c>] ? default_idle+0x1c/0xa0
[  739.736004]  [<ffffffff810abf6b>] ? cpu_startup_entry+0x2eb/0x350
[  739.736019]  [<ffffffff81b2af5e>] ? start_kernel+0x480/0x48b
[  739.736032]  [<ffffffff81b2d116>] ? xen_start_kernel+0x507/0x511
[  739.736048] ---[ end trace c106bb901244bc8c ]---

Fixes: f987ed6e ("batman-adv: protect neighbor list with rcu locks")
Reported-by: NMartin Weinelt <martin@darmstadt.freifunk.net>
Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

e123705e

batman-adv: Fix integer overflow in batadv_iv_ogm_calc_tq · d285f52c

由 Sven Eckelmann 提交于 2月 16, 2016

The undefined behavior sanatizer detected an signed integer overflow in a
setup with near perfect link quality

    UBSAN: Undefined behaviour in net/batman-adv/bat_iv_ogm.c:1246:25
    signed integer overflow:
    8713350 * 255 cannot be represented in type 'int'

The problems happens because the calculation of mixed unsigned and signed
integers resulted in an integer multiplication.

      batadv_ogm_packet::tq (u8 255)
    * tq_own (u8 255)
    * tq_asym_penalty (int 134; max 255)
    * tq_iface_penalty (int 255; max 255)

The tq_iface_penalty, tq_asym_penalty and inv_asym_penalty can just be
changed to unsigned int because they are not expected to become negative.

Fixes: c0398768 ("batman-adv: add WiFi penalty")
Signed-off-by: NSven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

d285f52c

batman-adv: make sure ELP/OGM orig MAC is updated on address change · 1653f61d

由 Antonio Quartulli 提交于 5月 02, 2016

When the MAC address of the primary interface is changed,
update the originator address in the ELP and OGM skb buffers as
well in order to reflect the change.

Fixes: d6f94d91 ("batman-adv: ELP - adding basic infrastructure")
Reported-by: NMarek Lindner <marek@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

1653f61d

batman-adv: Fix unexpected free of bcast_own on add_if error · f7dcdf5f

由 Sven Eckelmann 提交于 2月 22, 2016

The function batadv_iv_ogm_orig_add_if allocates new buffers for bcast_own
and bcast_own_sum. It is expected that these buffers are unchanged in case
either bcast_own or bcast_own_sum couldn't be resized.

But the error handling of this function frees the already resized buffer
for bcast_own when the allocation of the new bcast_own_sum buffer failed.
This will lead to an invalid memory access when some code will try to
access bcast_own.

Instead the resized new bcast_own buffer has to be kept. This will not lead
to problems because the size of the buffer was only increased and therefore
no user of the buffer will try to access bytes outside of the new buffer.

Fixes: d0015fdd ("batman-adv: provide orig_node routing API")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

f7dcdf5f

batman-adv: Fix refcnt leak in batadv_v_neigh_* · 71f9d27d

由 Sven Eckelmann 提交于 5月 06, 2016

The functions batadv_neigh_ifinfo_get increase the reference counter of the
batadv_neigh_ifinfo. These have to be reduced again when the reference is
not used anymore to correctly free the objects.

Fixes: 97869060 ("batman-adv: B.A.T.M.A.N. V - implement neighbor comparison API calls")
Signed-off-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

71f9d27d

batman-adv: Avoid nullptr derefence in batadv_v_neigh_is_sob · a45e932a

由 Sven Eckelmann 提交于 5月 06, 2016

batadv_neigh_ifinfo_get can return NULL when it cannot find (even when only
temporarily) anymore the neigh_ifinfo in the list neigh->ifinfo_list. This
has to be checked to avoid kernel Oopses when the ifinfo is dereferenced.

This a situation which isn't expected but is already handled by functions
like batadv_v_neigh_cmp. The same kind of warning is therefore used before
the function returns without dereferencing the pointers.

a45e932a

batman-adv: fix skb deref after free · 63d443ef

由 Florian Westphal 提交于 5月 10, 2016

batadv_send_skb_to_orig() calls dev_queue_xmit() so we can't use skb->len.

Fixes: 95332477 ("batman-adv: network coding - buffer unicast packets before forward")
Signed-off-by: NFlorian Westphal <fw@strlen.de>
Reviewed-by: NSven Eckelmann <sven@narfation.org>
Signed-off-by: NMarek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: NAntonio Quartulli <a@unstable.cc>

63d443ef

switchdev: pass pointer to fib_info instead of copy · da4ed551

由 Jiri Pirko 提交于 5月 17, 2016

The problem is that fib_info->nh is [0] so the struct fib_info
allocation size depends on number of nexthops. If we just copy fib_info,
we do not copy the nexthops info and driver accesses memory which is not
ours.

Given the fact that fib4 does not defer operations and therefore it does
not need copy, just pass the pointer down to drivers as it was done
before.

Fixes: 850d0cbc ("switchdev: remove pointers from switchdev objects")
Signed-off-by: NJiri Pirko <jiri@mellanox.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

da4ed551

net_sched: close another race condition in tcf_mirred_release() · dc327f89

由 WANG Cong 提交于 5月 16, 2016

We saw the following extra refcount release on veth device:

  kernel: [7957821.463992] unregister_netdevice: waiting for mesos50284 to become free. Usage count = -1

Since we heavily use mirred action to redirect packets to veth, I think
this is caused by the following race condition:

CPU0:
tcf_mirred_release(): (in RCU callback)
	struct net_device *dev = rcu_dereference_protected(m->tcfm_dev, 1);

CPU1:
mirred_device_event():
        spin_lock_bh(&mirred_list_lock);
        list_for_each_entry(m, &mirred_list, tcfm_list) {
                if (rcu_access_pointer(m->tcfm_dev) == dev) {
                        dev_put(dev);
                        /* Note : no rcu grace period necessary, as
                         * net_device are already rcu protected.
                         */
                        RCU_INIT_POINTER(m->tcfm_dev, NULL);
                }
        }
        spin_unlock_bh(&mirred_list_lock);

CPU0:
tcf_mirred_release():
        spin_lock_bh(&mirred_list_lock);
        list_del(&m->tcfm_list);
        spin_unlock_bh(&mirred_list_lock);
        if (dev)               // <======== Stil refers to the old m->tcfm_dev
                dev_put(dev);  // <======== dev_put() is called on it again

The action init code path is good because it is impossible to modify
an action that is being removed.

So, fix this by moving everything under the spinlock.

Fixes: 2ee22a90 ("net_sched: act_mirred: remove spinlock in fast path")
Fixes: 6bd00b85 ("act_mirred: fix a race condition on mirred_list")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
Acked-by: NJamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

dc327f89

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功