1. 04 11月, 2016 2 次提交
    • E
      tcp: fix potential memory corruption · ac9e70b1
      Eric Dumazet 提交于
      Imagine initial value of max_skb_frags is 17, and last
      skb in write queue has 15 frags.
      
      Then max_skb_frags is lowered to 14 or smaller value.
      
      tcp_sendmsg() will then be allowed to add additional page frags
      and eventually go past MAX_SKB_FRAGS, overflowing struct
      skb_shared_info.
      
      Fixes: 5f74f82e ("net:Add sysctl_max_skb_frags")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
      Cc: Håkon Bugge <haakon.bugge@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ac9e70b1
    • W
      inet: fix sleeping inside inet_wait_for_connect() · 14135f30
      WANG Cong 提交于
      Andrey reported this kernel warning:
      
        WARNING: CPU: 0 PID: 4608 at kernel/sched/core.c:7724
        __might_sleep+0x14c/0x1a0 kernel/sched/core.c:7719
        do not call blocking ops when !TASK_RUNNING; state=1 set at
        [<ffffffff811f5a5c>] prepare_to_wait+0xbc/0x210
        kernel/sched/wait.c:178
        Modules linked in:
        CPU: 0 PID: 4608 Comm: syz-executor Not tainted 4.9.0-rc2+ #320
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
         ffff88006625f7a0 ffffffff81b46914 ffff88006625f818 0000000000000000
         ffffffff84052960 0000000000000000 ffff88006625f7e8 ffffffff81111237
         ffff88006aceac00 ffffffff00001e2c ffffed000cc4beff ffffffff84052960
        Call Trace:
         [<     inline     >] __dump_stack lib/dump_stack.c:15
         [<ffffffff81b46914>] dump_stack+0xb3/0x10f lib/dump_stack.c:51
         [<ffffffff81111237>] __warn+0x1a7/0x1f0 kernel/panic.c:550
         [<ffffffff8111132c>] warn_slowpath_fmt+0xac/0xd0 kernel/panic.c:565
         [<ffffffff811922fc>] __might_sleep+0x14c/0x1a0 kernel/sched/core.c:7719
         [<     inline     >] slab_pre_alloc_hook mm/slab.h:393
         [<     inline     >] slab_alloc_node mm/slub.c:2634
         [<     inline     >] slab_alloc mm/slub.c:2716
         [<ffffffff81508da0>] __kmalloc_track_caller+0x150/0x2a0 mm/slub.c:4240
         [<ffffffff8146be14>] kmemdup+0x24/0x50 mm/util.c:113
         [<ffffffff8388b2cf>] dccp_feat_clone_sp_val.part.5+0x4f/0xe0 net/dccp/feat.c:374
         [<     inline     >] dccp_feat_clone_sp_val net/dccp/feat.c:1141
         [<     inline     >] dccp_feat_change_recv net/dccp/feat.c:1141
         [<ffffffff8388d491>] dccp_feat_parse_options+0xaa1/0x13d0 net/dccp/feat.c:1411
         [<ffffffff83894f01>] dccp_parse_options+0x721/0x1010 net/dccp/options.c:128
         [<ffffffff83891280>] dccp_rcv_state_process+0x200/0x15b0 net/dccp/input.c:644
         [<ffffffff838b8a94>] dccp_v4_do_rcv+0xf4/0x1a0 net/dccp/ipv4.c:681
         [<     inline     >] sk_backlog_rcv ./include/net/sock.h:872
         [<ffffffff82b7ceb6>] __release_sock+0x126/0x3a0 net/core/sock.c:2044
         [<ffffffff82b7d189>] release_sock+0x59/0x1c0 net/core/sock.c:2502
         [<     inline     >] inet_wait_for_connect net/ipv4/af_inet.c:547
         [<ffffffff8316b2a2>] __inet_stream_connect+0x5d2/0xbb0 net/ipv4/af_inet.c:617
         [<ffffffff8316b8d5>] inet_stream_connect+0x55/0xa0 net/ipv4/af_inet.c:656
         [<ffffffff82b705e4>] SYSC_connect+0x244/0x2f0 net/socket.c:1533
         [<ffffffff82b72dd4>] SyS_connect+0x24/0x30 net/socket.c:1514
         [<ffffffff83fbf701>] entry_SYSCALL_64_fastpath+0x1f/0xc2
        arch/x86/entry/entry_64.S:209
      
      Unlike commit 26cabd31
      ("sched, net: Clean up sk_wait_event() vs. might_sleep()"), the
      sleeping function is called before schedule_timeout(), this is indeed
      a bug. Fix this by moving the wait logic to the new API, it is similar
      to commit ff960a73
      ("netdev, sched/wait: Fix sleeping inside wait event").
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      14135f30
  2. 03 11月, 2016 1 次提交
    • E
      ip6_udp_tunnel: remove unused IPCB related codes · 4fd19c15
      Eli Cooper 提交于
      Some IPCB fields are currently set in udp_tunnel6_xmit_skb(), which are
      never used before it reaches ip6tunnel_xmit(), and past that point the
      control buffer is no longer interpreted as IPCB.
      
      This clears these unused IPCB related codes. Currently there is no skb
      scrubbing in ip6_udp_tunnel, otherwise IPCB(skb)->opt might need to be
      cleared for IPv4 packets, as shown in 5146d1f1
      ("tunnel: Clear IPCB(skb)->opt before dst_link_failure called").
      Signed-off-by: NEli Cooper <elicooper@gmx.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4fd19c15
  3. 02 11月, 2016 1 次提交
  4. 01 11月, 2016 9 次提交
  5. 30 10月, 2016 7 次提交
  6. 28 10月, 2016 4 次提交
    • J
      net sched filters: fix notification of filter delete with proper handle · 9ee78374
      Jamal Hadi Salim 提交于
      Daniel says:
      
      While trying out [1][2], I noticed that tc monitor doesn't show the
      correct handle on delete:
      
      $ tc monitor
      qdisc clsact ffff: dev eno1 parent ffff:fff1
      filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...]
      deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80
      
      some context to explain the above:
      The user identity of any tc filter is represented by a 32-bit
      identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter
      above. A user wishing to delete, get or even modify a specific filter
      uses this handle to reference it.
      Every classifier is free to provide its own semantics for the 32 bit handle.
      Example: classifiers like u32 use schemes like 800:1:801 to describe
      the semantics of their filters represented as hash table, bucket and
      node ids etc.
      Classifiers also have internal per-filter representation which is different
      from this externally visible identity. Most classifiers set this
      internal representation to be a pointer address (which allows fast retrieval
      of said filters in their implementations). This internal representation
      is referenced with the "fh" variable in the kernel control code.
      
      When a user successfuly deletes a specific filter, by specifying the correct
      tcm->tcm_handle, an event is generated to user space which indicates
      which specific filter was deleted.
      
      Before this patch, the "fh" value was sent to user space as the identity.
      As an example what is shown in the sample bpf filter delete event above
      is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80
      which happens to be a 64-bit memory address of the internal filter
      representation (address of the corresponding filter's struct cls_bpf_prog);
      
      After this patch the appropriate user identifiable handle as encoded
      in the originating request tcm->tcm_handle is generated in the event.
      One of the cardinal rules of netlink rules is to be able to take an
      event (such as a delete in this case) and reflect it back to the
      kernel and successfully delete the filter. This patch achieves that.
      
      Note, this issue has existed since the original TC action
      infrastructure code patch back in 2004 as found in:
      https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/
      
      [1] http://patchwork.ozlabs.org/patch/682828/
      [2] http://patchwork.ozlabs.org/patch/682829/
      
      Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.")
      Reported-by: NDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9ee78374
    • A
      flow_dissector: fix vlan tag handling · bc72f3dd
      Arnd Bergmann 提交于
      gcc warns about an uninitialized pointer dereference in the vlan
      priority handling:
      
      net/core/flow_dissector.c: In function '__skb_flow_dissect':
      net/core/flow_dissector.c:281:61: error: 'vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      As pointed out by Jiri Pirko, the variable is never actually used
      without being initialized first as the only way it end up uninitialized
      is with skb_vlan_tag_present(skb)==true, and that means it does not
      get accessed.
      
      However, the warning hints at some related issues that I'm addressing
      here:
      
      - the second check for the vlan tag is different from the first one
        that tests the skb for being NULL first, causing both the warning
        and a possible NULL pointer dereference that was not entirely fixed.
      - The same patch that introduced the NULL pointer check dropped an
        earlier optimization that skipped the repeated check of the
        protocol type
      - The local '_vlan' variable is referenced through the 'vlan' pointer
        but the variable has gone out of scope by the time that it is
        accessed, causing undefined behavior
      
      Caching the result of the 'skb && skb_vlan_tag_present(skb)' check
      in a local variable allows the compiler to further optimize the
      later check. With those changes, the warning also disappears.
      
      Fixes: 3805a938 ("flow_dissector: Check skb for VLAN only if skb specified.")
      Fixes: d5709f7a ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci")
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NEric Garver <e@erig.me>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      bc72f3dd
    • D
      net: ipv6: Do not consider link state for nexthop validation · d5d32e4b
      David Ahern 提交于
      Similar to IPv4, do not consider link state when validating next hops.
      
      Currently, if the link is down default routes can fail to insert:
       $ ip -6 ro add vrf blue default via 2100:2::64 dev eth2
       RTNETLINK answers: No route to host
      
      With this patch the command succeeds.
      
      Fixes: 8c14586f ("net: ipv6: Use passed in table for nexthop lookups")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d5d32e4b
    • D
      net: ipv6: Fix processing of RAs in presence of VRF · 830218c1
      David Ahern 提交于
      rt6_add_route_info and rt6_add_dflt_router were updated to pull the FIB
      table from the device index, but the corresponding rt6_get_route_info
      and rt6_get_dflt_router functions were not leading to the failure to
      process RA's:
      
          ICMPv6: RA: ndisc_router_discovery failed to add default route
      
      Fix the 'get' functions by using the table id associated with the
      device when applicable.
      
      Also, now that default routes can be added to tables other than the
      default table, rt6_purge_dflt_routers needs to be updated as well to
      look at all tables. To handle that efficiently, add a flag to the table
      denoting if it is has a default route via RA.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      830218c1
  7. 27 10月, 2016 2 次提交
    • E
      udp: fix IP_CHECKSUM handling · 10df8e61
      Eric Dumazet 提交于
      First bug was added in commit ad6f939a ("ip: Add offset parameter to
      ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on
      AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by
      ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr));
      
      Then commit e6afc8ac ("udp: remove headers from UDP packets before
      queueing") forgot to adjust the offsets now UDP headers are pulled
      before skb are put in receive queue.
      
      Fixes: ad6f939a ("ip: Add offset parameter to ip_cmsg_recv")
      Fixes: e6afc8ac ("udp: remove headers from UDP packets before queueing")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Sam Kumar <samanthakumar@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Tested-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      10df8e61
    • X
      sctp: fix the panic caused by route update · ecc515d7
      Xin Long 提交于
      Commit 7303a147 ("sctp: identify chunks that need to be fragmented
      at IP level") made the chunk be fragmented at IP level in the next round
      if it's size exceed PMTU.
      
      But there still is another case, PMTU can be updated if transport's dst
      expires and transport's pmtu_pending is set in sctp_packet_transmit. If
      the new PMTU is less than the chunk, the same issue with that commit can
      be triggered.
      
      So we should drop this packet and let it retransmit in another round
      where it would be fragmented at IP level.
      
      This patch is to fix it by checking the chunk size after PMTU may be
      updated and dropping this packet if it's size exceed PMTU.
      
      Fixes: 90017acc ("sctp: Add GSO support")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@txudriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ecc515d7
  8. 26 10月, 2016 1 次提交
  9. 24 10月, 2016 2 次提交
    • J
      net: sctp, forbid negative length · a4b8e71b
      Jiri Slaby 提交于
      Most of getsockopt handlers in net/sctp/socket.c check len against
      sizeof some structure like:
              if (len < sizeof(int))
                      return -EINVAL;
      
      On the first look, the check seems to be correct. But since len is int
      and sizeof returns size_t, int gets promoted to unsigned size_t too. So
      the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
      false.
      
      Fix this in sctp by explicitly checking len < 0 before any getsockopt
      handler is called.
      
      Note that sctp_getsockopt_events already handled the negative case.
      Since we added the < 0 check elsewhere, this one can be removed.
      
      If not checked, this is the result:
      UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
      shift exponent 52 is too large for 32-bit type 'int'
      CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
       0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
       ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
       0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
      Call Trace:
       [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
      ...
       [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
       [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
       [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
       [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
       [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
       [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
       [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: linux-sctp@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a4b8e71b
    • J
      ipv6: do not increment mac header when it's unset · b678aa57
      Jason A. Donenfeld 提交于
      Otherwise we'll overflow the integer. This occurs when layer 3 tunneled
      packets are handed off to the IPv6 layer.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b678aa57
  10. 23 10月, 2016 2 次提交
  11. 21 10月, 2016 8 次提交
    • W
      ipv6: fix a potential deadlock in do_ipv6_setsockopt() · 8651be8f
      WANG Cong 提交于
      Baozeng reported this deadlock case:
      
             CPU0                    CPU1
             ----                    ----
        lock([  165.136033] sk_lock-AF_INET6);
                                     lock([  165.136033] rtnl_mutex);
                                     lock([  165.136033] sk_lock-AF_INET6);
        lock([  165.136033] rtnl_mutex);
      
      Similar to commit 87e9f031
      ("ipv4: fix a potential deadlock in mcast getsockopt() path")
      this is due to we still have a case, ipv6_sock_mc_close(),
      where we acquire sk_lock before rtnl_lock. Close this deadlock
      with the similar solution, that is always acquire rtnl lock first.
      
      Fixes: baf606d9 ("ipv4,ipv6: grab rtnl before locking the socket")
      Reported-by: NBaozeng Ding <sploving1@gmail.com>
      Tested-by: NBaozeng Ding <sploving1@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8651be8f
    • L
      batman-adv: fix splat on disabling an interface · 9799c503
      Linus Lüssing 提交于
      As long as there is still a reference for a hard interface held, there might
      still be a forwarding packet relying on its attributes.
      
      Therefore avoid setting hard_iface->soft_iface to NULL when disabling a hard
      interface.
      
      This fixes the following, potential splat:
      
          batman_adv: bat0: Interface deactivated: eth1
          batman_adv: bat0: Removing interface: eth1
          cgroup: new mount options do not match the existing superblock, will be ignored
          batman_adv: bat0: Interface deactivated: eth3
          batman_adv: bat0: Removing interface: eth3
          ------------[ cut here ]------------
          WARNING: CPU: 3 PID: 1986 at ./net/batman-adv/bat_iv_ogm.c:549 batadv_iv_send_outstanding_bat_ogm_packet+0x145/0x643 [batman_adv]
          Modules linked in: batman_adv(O-) <...>
          CPU: 3 PID: 1986 Comm: kworker/u8:2 Tainted: G        W  O    4.6.0-rc6+ #1
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
          Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet [batman_adv]
           0000000000000000 ffff88001d93bca0 ffffffff8126c26b 0000000000000000
           0000000000000000 ffff88001d93bcf0 ffffffff81051615 ffff88001f19f818
           000002251d93bd68 0000000000000046 ffff88001dc04a00 ffff88001becbe48
          Call Trace:
           [<ffffffff8126c26b>] dump_stack+0x67/0x90
           [<ffffffff81051615>] __warn+0xc7/0xe5
           [<ffffffff8105164b>] warn_slowpath_null+0x18/0x1a
           [<ffffffffa0356f24>] batadv_iv_send_outstanding_bat_ogm_packet+0x145/0x643 [batman_adv]
           [<ffffffff8108b01f>] ? __lock_is_held+0x32/0x54
           [<ffffffff810689a2>] process_one_work+0x2a8/0x4f5
           [<ffffffff81068856>] ? process_one_work+0x15c/0x4f5
           [<ffffffff81068df2>] worker_thread+0x1d5/0x2c0
           [<ffffffff81068c1d>] ? process_scheduled_works+0x2e/0x2e
           [<ffffffff81068c1d>] ? process_scheduled_works+0x2e/0x2e
           [<ffffffff8106dd90>] kthread+0xc0/0xc8
           [<ffffffff8144de82>] ret_from_fork+0x22/0x40
           [<ffffffff8106dcd0>] ? __init_kthread_worker+0x55/0x55
          ---[ end trace 647f9f325123dc05 ]---
      
      What happened here is, that there was still a forw_packet (here: a BATMAN IV
      OGM) in the queue of eth3 with the forw_packet->if_incoming set to eth1 and the
      forw_packet->if_outgoing set to eth3.
      
      When eth3 is to be deactivated and removed, then this thread waits for the
      forw_packet queued on eth3 to finish. Because eth1 was deactivated and removed
      earlier and by that had forw_packet->if_incoming->soft_iface, set to NULL, the
      splat when trying to send/flush the OGM on eth3 occures.
      
      Fixes: c6c8fea2 ("net: Add batman-adv meshing protocol")
      Signed-off-by: NLinus Lüssing <linus.luessing@c0d3.blue>
      [sven@narfation.org: Reduced size of Oops message]
      Signed-off-by: NSven Eckelmann <sven@narfation.org>
      Signed-off-by: NSimon Wunderlich <sw@simonwunderlich.de>
      9799c503
    • E
      ipv4: disable BH in set_ping_group_range() · a681574c
      Eric Dumazet 提交于
      In commit 4ee3bd4a ("ipv4: disable BH when changing ip local port
      range") Cong added BH protection in set_local_port_range() but missed
      that same fix was needed in set_ping_group_range()
      
      Fixes: b8f1a556 ("udp: Add function to make source port for UDP tunnels")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NEric Salo <salo@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a681574c
    • E
      udp: must lock the socket in udp_disconnect() · 286c72de
      Eric Dumazet 提交于
      Baozeng Ding reported KASAN traces showing uses after free in
      udp_lib_get_port() and other related UDP functions.
      
      A CONFIG_DEBUG_PAGEALLOC=y kernel would eventually crash.
      
      I could write a reproducer with two threads doing :
      
      static int sock_fd;
      static void *thr1(void *arg)
      {
      	for (;;) {
      		connect(sock_fd, (const struct sockaddr *)arg,
      			sizeof(struct sockaddr_in));
      	}
      }
      
      static void *thr2(void *arg)
      {
      	struct sockaddr_in unspec;
      
      	for (;;) {
      		memset(&unspec, 0, sizeof(unspec));
      	        connect(sock_fd, (const struct sockaddr *)&unspec,
      			sizeof(unspec));
              }
      }
      
      Problem is that udp_disconnect() could run without holding socket lock,
      and this was causing list corruptions.
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NBaozeng Ding <sploving1@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      286c72de
    • S
      net: add recursion limit to GRO · fcd91dd4
      Sabrina Dubroca 提交于
      Currently, GRO can do unlimited recursion through the gro_receive
      handlers.  This was fixed for tunneling protocols by limiting tunnel GRO
      to one level with encap_mark, but both VLAN and TEB still have this
      problem.  Thus, the kernel is vulnerable to a stack overflow, if we
      receive a packet composed entirely of VLAN headers.
      
      This patch adds a recursion counter to the GRO layer to prevent stack
      overflow.  When a gro_receive function hits the recursion limit, GRO is
      aborted for this skb and it is processed normally.  This recursion
      counter is put in the GRO CB, but could be turned into a percpu counter
      if we run out of space in the CB.
      
      Thanks to Vladimír Beneš <vbenes@redhat.com> for the initial bug report.
      
      Fixes: CVE-2016-7039
      Fixes: 9b174d88 ("net: Add Transparent Ethernet Bridging GRO support.")
      Fixes: 66e5133f ("vlan: Add GRO support for non hardware accelerated vlan")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: NJiri Benc <jbenc@redhat.com>
      Acked-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: NTom Herbert <tom@herbertland.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fcd91dd4
    • J
      ipv6: properly prevent temp_prefered_lft sysctl race · 7aa8e63f
      Jiri Bohac 提交于
      The check for an underflow of tmp_prefered_lft is always false
      because tmp_prefered_lft is unsigned. The intention of the check
      was to guard against racing with an update of the
      temp_prefered_lft sysctl, potentially resulting in an underflow.
      
      As suggested by David Miller, the best way to prevent the race is
      by reading the sysctl variable using READ_ONCE.
      Signed-off-by: NJiri Bohac <jbohac@suse.cz>
      Reported-by: NJulia Lawall <julia.lawall@lip6.fr>
      Fixes: 76506a98 ("IPv6: fix DESYNC_FACTOR")
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7aa8e63f
    • P
      netfilter: fix nf_queue handling · 7034b566
      Pablo Neira Ayuso 提交于
      nf_queue handling is broken since e3b37f11 ("netfilter: replace
      list_head with single linked list") for two reasons:
      
      1) If the bypass flag is set on, there are no userspace listeners and
         we still have more hook entries to iterate over, then jump to the
         next hook. Otherwise accept the packet. On nf_reinject() path, the
         okfn() needs to be invoked.
      
      2) We should not re-enter the same hook on packet reinjection. If the
         packet is accepted, we have to skip the current hook from where the
         packet was enqueued, otherwise the packets gets enqueued over and
         over again.
      
      This restores the previous list_for_each_entry_continue() behaviour
      happening from nf_iterate() that was dealing with these two cases.
      This patch introduces a new nf_queue() wrapper function so this fix
      becomes simpler.
      
      Fixes: e3b37f11 ("netfilter: replace list_head with single linked list")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      7034b566
    • N
      netfilter: conntrack: restart gc immediately if GC_MAX_EVICTS is reached · 7bb6615d
      Nicolas Dichtel 提交于
      When the maximum evictions number is reached, do not wait 5 seconds before
      the next run.
      
      CC: Florian Westphal <fw@strlen.de>
      Signed-off-by: NNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      7bb6615d
  12. 20 10月, 2016 1 次提交