1. 15 2月, 2017 2 次提交
  2. 10 2月, 2017 1 次提交
  3. 09 2月, 2017 2 次提交
    • W
      sit: fix a double free on error path · d7426c69
      WANG Cong 提交于
      Dmitry reported a double free in sit_init_net():
      
        kernel BUG at mm/percpu.c:689!
        invalid opcode: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
        Hardware name: Google Google Compute Engine/Google Compute Engine,
        BIOS Google 01/01/2011
        task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
        RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
        RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
        RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
        RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
        RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
        R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
        R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
        FS:  00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
        DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
        Call Trace:
         free_percpu+0x212/0x520 mm/percpu.c:1264
         ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
         sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
         ops_init+0x10a/0x530 net/core/net_namespace.c:115
         setup_net+0x2ed/0x690 net/core/net_namespace.c:291
         copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
         create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
         unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
         SYSC_unshare kernel/fork.c:2281 [inline]
         SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
         entry_SYSCALL_64_fastpath+0x1f/0xc2
      
      This is because when tunnel->dst_cache init fails, we free dev->tstats
      once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
      redundant but its ndo_uinit() does not seem enough to clean up everything
      here. So avoid this by setting dev->tstats to NULL after the first free,
      at least for -net.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7426c69
    • M
      ipv6: addrconf: fix generation of new temporary addresses · a11a7f71
      Marcus Huewe 提交于
      Under some circumstances it is possible that no new temporary addresses
      will be generated.
      
      For instance, addrconf_prefix_rcv_add_addr() indirectly calls
      ipv6_create_tempaddr(), which creates a tentative temporary address and
      starts dad. Next, addrconf_prefix_rcv_add_addr() indirectly calls
      addrconf_verify_rtnl(). Now, assume that the previously created temporary
      address has the least preferred lifetime among all existing addresses and
      is still tentative (that is, dad is still running). Hence, the next run of
      addrconf_verify_rtnl() is performed when the preferred lifetime of the
      temporary address ends. If dad succeeds before the next run, the temporary
      address becomes deprecated during the next run, but no new temporary
      address is generated.
      
      In order to fix this, schedule the next addrconf_verify_rtnl() run slightly
      before the temporary address becomes deprecated, if dad succeeded.
      Signed-off-by: NMarcus Huewe <suse-tux@gmx.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a11a7f71
  4. 08 2月, 2017 1 次提交
  5. 07 2月, 2017 1 次提交
    • L
      ipv6: Fix IPv6 packet loss in scenarios involving roaming + snooping switches · a088d1d7
      Linus Lüssing 提交于
      When for instance a mobile Linux device roams from one access point to
      another with both APs sharing the same broadcast domain and a
      multicast snooping switch in between:
      
      1)    (c) <~~~> (AP1) <--[SSW]--> (AP2)
      
      2)              (AP1) <--[SSW]--> (AP2) <~~~> (c)
      
      Then currently IPv6 multicast packets will get lost for (c) until an
      MLD Querier sends its next query message. The packet loss occurs
      because upon roaming the Linux host so far stayed silent regarding
      MLD and the snooping switch will therefore be unaware of the
      multicast topology change for a while.
      
      This patch fixes this by always resending MLD reports when an interface
      change happens, for instance from NO-CARRIER to CARRIER state.
      Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a088d1d7
  6. 06 2月, 2017 2 次提交
  7. 04 2月, 2017 1 次提交
  8. 02 2月, 2017 1 次提交
  9. 31 1月, 2017 1 次提交
  10. 27 1月, 2017 1 次提交
  11. 25 1月, 2017 3 次提交
  12. 21 1月, 2017 1 次提交
  13. 20 1月, 2017 1 次提交
  14. 19 1月, 2017 1 次提交
    • D
      lwtunnel: fix autoload of lwt modules · 9ed59592
      David Ahern 提交于
      Trying to add an mpls encap route when the MPLS modules are not loaded
      hangs. For example:
      
          CONFIG_MPLS=y
          CONFIG_NET_MPLS_GSO=m
          CONFIG_MPLS_ROUTING=m
          CONFIG_MPLS_IPTUNNEL=m
      
          $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
      The ip command hangs:
      root       880   826  0 21:25 pts/0    00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
          $ cat /proc/880/stack
          [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
          [<ffffffff81065efc>] __request_module+0x27b/0x30a
          [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
          [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
          [<ffffffff814ae451>] fib_table_insert+0x90/0x41f
          [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
          ...
      
      modprobe is trying to load rtnl-lwt-MPLS:
      
      root       881     5  0 21:25 ?        00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
      
      and it hangs after loading mpls_router:
      
          $ cat /proc/881/stack
          [<ffffffff81441537>] rtnl_lock+0x12/0x14
          [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
          [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
          [<ffffffff81000471>] do_one_initcall+0x8e/0x13f
          [<ffffffff81119961>] do_init_module+0x5a/0x1e5
          [<ffffffff810bd070>] load_module+0x13bd/0x17d6
          ...
      
      The problem is that lwtunnel_build_state is called with rtnl lock
      held preventing mpls_init from registering.
      
      Given the potential references held by the time lwtunnel_build_state it
      can not drop the rtnl lock to the load module. So, extract the module
      loading code from lwtunnel_build_state into a new function to validate
      the encap type. The new function is called while converting the user
      request into a fib_config which is well before any table, device or
      fib entries are examined.
      
      Fixes: 745041e2 ("lwtunnel: autoload of lwt modules")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9ed59592
  15. 17 1月, 2017 2 次提交
    • J
      ip6_tunnel: Account for tunnel header in tunnel MTU · 02ca0423
      Jakub Sitnicki 提交于
      With ip6gre we have a tunnel header which also makes the tunnel MTU
      smaller. We need to reserve room for it. Previously we were using up
      space reserved for the Tunnel Encapsulation Limit option
      header (RFC 2473).
      
      Also, after commit b05229f4 ("gre6: Cleanup GREv6 transmit path,
      call common GRE functions") our contract with the caller has
      changed. Now we check if the packet length exceeds the tunnel MTU after
      the tunnel header has been pushed, unlike before.
      
      This is reflected in the check where we look at the packet length minus
      the size of the tunnel header, which is already accounted for in tunnel
      MTU.
      
      Fixes: b05229f4 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
      Signed-off-by: NJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      02ca0423
    • H
      mld: do not remove mld souce list info when set link down · 1666d49e
      Hangbin Liu 提交于
      This is an IPv6 version of commit 24803f38 ("igmp: do not remove igmp
      souce list..."). In mld_del_delrec(), we will restore back all source filter
      info instead of flush them.
      
      Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
      we should not remove source list info when set link down. Remove
      igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in
      ipv6_mc_down().
      
      Also clear all source info after igmp6_group_dropped() instead of in it
      because ipv6_mc_down() will call igmp6_group_dropped().
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1666d49e
  16. 16 1月, 2017 1 次提交
    • L
      netfilter: rpfilter: fix incorrect loopback packet judgment · 6443ebc3
      Liping Zhang 提交于
      Currently, we check the existing rtable in PREROUTING hook, if RTCF_LOCAL
      is set, we assume that the packet is loopback.
      
      But this assumption is incorrect, for example, a packet encapsulated
      in ipsec transport mode was received and routed to local, after
      decapsulation, it would be delivered to local again, and the rtable
      was not dropped, so RTCF_LOCAL check would trigger. But actually, the
      packet was not loopback.
      
      So for these normal loopback packets, we can check whether the in device
      is IFF_LOOPBACK or not. For these locally generated broadcast/multicast,
      we can check whether the skb->pkt_type is PACKET_LOOPBACK or not.
      
      Finally, there's a subtle difference between nft fib expr and xtables
      rpfilter extension, user can add the following nft rule to do strict
      rpfilter check:
        # nft add rule x y meta iif eth0 fib saddr . iif oif != eth0 drop
      
      So when the packet is loopback, it's better to store the in device
      instead of the LOOPBACK_IFINDEX, otherwise, after adding the above
      nft rule, locally generated broad/multicast packets will be dropped
      incorrectly.
      
      Fixes: f83a7ea2 ("netfilter: xt_rpfilter: skip locally generated broadcast/multicast, too")
      Fixes: f6d0cbcf ("netfilter: nf_tables: add fib expression")
      Signed-off-by: NLiping Zhang <zlpnobody@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6443ebc3
  17. 14 1月, 2017 1 次提交
    • D
      ipv6: sr: fix several BUGs when preemption is enabled · fa79581e
      David Lebrun 提交于
      When CONFIG_PREEMPT=y, CONFIG_IPV6=m and CONFIG_SEG6_HMAC=y,
      seg6_hmac_init() is called during the initialization of the ipv6 module.
      This causes a subsequent call to smp_processor_id() with preemption
      enabled, resulting in the following trace.
      
      [   20.451460] BUG: using smp_processor_id() in preemptible [00000000] code: systemd/1
      [   20.452556] caller is debug_smp_processor_id+0x17/0x19
      [   20.453304] CPU: 0 PID: 1 Comm: systemd Not tainted 4.9.0-rc5-00973-g46738b13 #1
      [   20.454406]  ffffc9000062fc18 ffffffff813607b2 0000000000000000 ffffffff81a7f782
      [   20.455528]  ffffc9000062fc48 ffffffff813778dc 0000000000000000 00000000001dcf98
      [   20.456539]  ffffffffa003bd08 ffffffff81af93e0 ffffc9000062fc58 ffffffff81377905
      [   20.456539] Call Trace:
      [   20.456539]  [<ffffffff813607b2>] dump_stack+0x63/0x7f
      [   20.456539]  [<ffffffff813778dc>] check_preemption_disabled+0xd1/0xe3
      [   20.456539]  [<ffffffff81377905>] debug_smp_processor_id+0x17/0x19
      [   20.460260]  [<ffffffffa0061f3b>] seg6_hmac_init+0xfa/0x192 [ipv6]
      [   20.460260]  [<ffffffffa0061ccc>] seg6_init+0x39/0x6f [ipv6]
      [   20.460260]  [<ffffffffa006121a>] inet6_init+0x21a/0x321 [ipv6]
      [   20.460260]  [<ffffffffa0061000>] ? 0xffffffffa0061000
      [   20.460260]  [<ffffffff81000457>] do_one_initcall+0x8b/0x115
      [   20.460260]  [<ffffffff811328a3>] do_init_module+0x53/0x1c4
      [   20.460260]  [<ffffffff8110650a>] load_module+0x1153/0x14ec
      [   20.460260]  [<ffffffff81106a7b>] SYSC_finit_module+0x8c/0xb9
      [   20.460260]  [<ffffffff81106a7b>] ? SYSC_finit_module+0x8c/0xb9
      [   20.460260]  [<ffffffff81106abc>] SyS_finit_module+0x9/0xb
      [   20.460260]  [<ffffffff810014d1>] do_syscall_64+0x62/0x75
      [   20.460260]  [<ffffffff816834f0>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Moreover, dst_cache_* functions also call smp_processor_id(), generating
      a similar trace.
      
      This patch uses raw_cpu_ptr() in seg6_hmac_init() rather than this_cpu_ptr()
      and disable preemption when using dst_cache_* functions.
      Signed-off-by: NDavid Lebrun <david.lebrun@uclouvain.be>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fa79581e
  18. 13 1月, 2017 1 次提交
  19. 11 1月, 2017 1 次提交
  20. 10 1月, 2017 3 次提交
  21. 07 1月, 2017 1 次提交
  22. 30 12月, 2016 1 次提交
    • Z
      ipv6: Should use consistent conditional judgement for ip6 fragment between... · e4c5e13a
      Zheng Li 提交于
      ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output
      
      There is an inconsistent conditional judgement between __ip6_append_data
      and ip6_finish_output functions, the variable length in __ip6_append_data
      just include the length of application's payload and udp6 header, don't
      include the length of ipv6 header, but in ip6_finish_output use
      (skb->len > ip6_skb_dst_mtu(skb)) as judgement, and skb->len include the
      length of ipv6 header.
      
      That causes some particular application's udp6 payloads whose length are
      between (MTU - IPv6 Header) and MTU were fragmented by ip6_fragment even
      though the rst->dev support UFO feature.
      
      Add the length of ipv6 header to length in __ip6_append_data to keep
      consistent conditional judgement as ip6_finish_output for ip6 fragment.
      Signed-off-by: NZheng Li <james.z.li@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e4c5e13a
  23. 26 12月, 2016 2 次提交
    • T
      ktime: Get rid of ktime_equal() · 1f3a8e49
      Thomas Gleixner 提交于
      No point in going through loops and hoops instead of just comparing the
      values.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      1f3a8e49
    • T
      ktime: Get rid of the union · 2456e855
      Thomas Gleixner 提交于
      ktime is a union because the initial implementation stored the time in
      scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
      variant for 32bit machines. The Y2038 cleanup removed the timespec variant
      and switched everything to scalar nanoseconds. The union remained, but
      become completely pointless.
      
      Get rid of the union and just keep ktime_t as simple typedef of type s64.
      
      The conversion was done with coccinelle and some manual mopping up.
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      2456e855
  24. 25 12月, 2016 1 次提交
  25. 24 12月, 2016 2 次提交
    • D
      ipv6: handle -EFAULT from skb_copy_bits · a98f9175
      Dave Jones 提交于
      By setting certain socket options on ipv6 raw sockets, we can confuse the
      length calculation in rawv6_push_pending_frames triggering a BUG_ON.
      
      RIP: 0010:[<ffffffff817c6390>] [<ffffffff817c6390>] rawv6_sendmsg+0xc30/0xc40
      RSP: 0018:ffff881f6c4a7c18  EFLAGS: 00010282
      RAX: 00000000fffffff2 RBX: ffff881f6c681680 RCX: 0000000000000002
      RDX: ffff881f6c4a7cf8 RSI: 0000000000000030 RDI: ffff881fed0f6a00
      RBP: ffff881f6c4a7da8 R08: 0000000000000000 R09: 0000000000000009
      R10: ffff881fed0f6a00 R11: 0000000000000009 R12: 0000000000000030
      R13: ffff881fed0f6a00 R14: ffff881fee39ba00 R15: ffff881fefa93a80
      
      Call Trace:
       [<ffffffff8118ba23>] ? unmap_page_range+0x693/0x830
       [<ffffffff81772697>] inet_sendmsg+0x67/0xa0
       [<ffffffff816d93f8>] sock_sendmsg+0x38/0x50
       [<ffffffff816d982f>] SYSC_sendto+0xef/0x170
       [<ffffffff816da27e>] SyS_sendto+0xe/0x10
       [<ffffffff81002910>] do_syscall_64+0x50/0xa0
       [<ffffffff817f7cbc>] entry_SYSCALL64_slow_path+0x25/0x25
      
      Handle by jumping to the failure path if skb_copy_bits gets an EFAULT.
      
      Reproducer:
      
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/types.h>
      #include <sys/socket.h>
      #include <netinet/in.h>
      
      #define LEN 504
      
      int main(int argc, char* argv[])
      {
      	int fd;
      	int zero = 0;
      	char buf[LEN];
      
      	memset(buf, 0, LEN);
      
      	fd = socket(AF_INET6, SOCK_RAW, 7);
      
      	setsockopt(fd, SOL_IPV6, IPV6_CHECKSUM, &zero, 4);
      	setsockopt(fd, SOL_IPV6, IPV6_DSTOPTS, &buf, LEN);
      
      	sendto(fd, buf, 1, 0, (struct sockaddr *) buf, 110);
      }
      Signed-off-by: NDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a98f9175
    • W
      inet: fix IP(V6)_RECVORIGDSTADDR for udp sockets · 39b2dd76
      Willem de Bruijn 提交于
      Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
      the packet. For sockets that have transport headers pulled, transport
      offset can be negative. Use signed comparison to avoid overflow.
      
      Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
      Reported-by: NNisar Jagabar <njagabar@cloudmark.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39b2dd76
  26. 18 12月, 2016 2 次提交
    • M
      net: ipv6: check route protocol when deleting routes · c2ed1880
      Mantas M 提交于
      The protocol field is checked when deleting IPv4 routes, but ignored for
      IPv6, which causes problems with routing daemons accidentally deleting
      externally set routes (observed by multiple bird6 users).
      
      This can be verified using `ip -6 route del <prefix> proto something`.
      Signed-off-by: NMantas Mikulėnas <grawity@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c2ed1880
    • T
      inet: Fix get port to handle zero port number with soreuseport set · 0643ee4f
      Tom Herbert 提交于
      A user may call listen with binding an explicit port with the intent
      that the kernel will assign an available port to the socket. In this
      case inet_csk_get_port does a port scan. For such sockets, the user may
      also set soreuseport with the intent a creating more sockets for the
      port that is selected. The problem is that the initial socket being
      opened could inadvertently choose an existing and unreleated port
      number that was already created with soreuseport.
      
      This patch adds a boolean parameter to inet_bind_conflict that indicates
      rather soreuseport is allowed for the check (in addition to
      sk->sk_reuseport). In calls to inet_bind_conflict from inet_csk_get_port
      the argument is set to true if an explicit port is being looked up (snum
      argument is nonzero), and is false if port scan is done.
      Signed-off-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0643ee4f
  27. 07 12月, 2016 3 次提交