1. 17 8月, 2017 4 次提交
  2. 16 8月, 2017 10 次提交
    • F
      394f51ab
    • F
      e3a22b7f
    • F
      ipv6: route: make rtm_getroute not assume rtnl is locked · 121622db
      Florian Westphal 提交于
      __dev_get_by_index assumes RTNL is held, use _rcu version instead.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      121622db
    • C
      dsa: fix flow disector null pointer · 7324157b
      Craig Gallek 提交于
      A recent change to fix up DSA device behavior made the assumption that
      all skbs passing through the flow disector will be associated with a
      device. This does not appear to be a safe assumption.  Syzkaller found
      the crash below by attaching a BPF socket filter that tries to find the
      payload offset of a packet passing between two unix sockets.
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 2940 Comm: syzkaller872007 Not tainted 4.13.0-rc4-next-20170811 #1
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff8801d1b425c0 task.stack: ffff8801d0bc0000
      RIP: 0010:__skb_flow_dissect+0xdcd/0x3ae0 net/core/flow_dissector.c:445
      RSP: 0018:ffff8801d0bc7340 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000060 RSI: ffffffff856dc080 RDI: 0000000000000300
      RBP: ffff8801d0bc7870 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000008 R11: ffffed003a178f1e R12: 0000000000000000
      R13: 0000000000000000 R14: ffffffff856dc080 R15: ffff8801ce223140
      FS:  00000000016ed880(0000) GS:ffff8801dc000000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020008000 CR3: 00000001ce22d000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       skb_flow_dissect_flow_keys include/linux/skbuff.h:1176 [inline]
       skb_get_poff+0x9a/0x1a0 net/core/flow_dissector.c:1079
       ______skb_get_pay_offset net/core/filter.c:114 [inline]
       __skb_get_pay_offset+0x15/0x20 net/core/filter.c:112
      Code: 80 3c 02 00 44 89 6d 10 0f 85 44 2b 00 00 4d 8b 67 20 48 b8 00 00 00 00 00 fc ff df 49 8d bc 24 00 03 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 13 2b 00 00 4d 8b a4 24 00 03 00 00 4d 85 e4
      RIP: __skb_flow_dissect+0xdcd/0x3ae0 net/core/flow_dissector.c:445 RSP: ffff8801d0bc7340
      
      Fixes: 43e66528 ("net-next: dsa: fix flow dissection")
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NCraig Gallek <kraig@google.com>
      Reviewed-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7324157b
    • K
      net_sched: remove warning from qdisc_hash_add · c90e9514
      Konstantin Khlebnikov 提交于
      It was added in commit e57a784d ("pkt_sched: set root qdisc
      before change() in attach_default_qdiscs()") to hide duplicates
      from "tc qdisc show" for incative deivices.
      
      After 59cc1f61 ("net: sched: convert qdisc linked list to hashtable")
      it triggered when classful qdisc is added to inactive device because
      default qdiscs are added before switching root qdisc.
      
      Anyway after commit ea327469 ("net: sched: avoid duplicates in
      qdisc dump") duplicates are filtered right in dumper.
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c90e9514
    • K
      net_sched/sfq: update hierarchical backlog when drop packet · 325d5dc3
      Konstantin Khlebnikov 提交于
      When sfq_enqueue() drops head packet or packet from another queue it
      have to update backlog at upper qdiscs too.
      
      Fixes: 2ccccf5f ("net_sched: update hierarchical backlog too")
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      325d5dc3
    • K
      net_sched: reset pointers to tcf blocks in classful qdiscs' destructors · 89890422
      Konstantin Khlebnikov 提交于
      Traffic filters could keep direct pointers to classes in classful qdisc,
      thus qdisc destruction first removes all filters before freeing classes.
      Class destruction methods also tries to free attached filters but now
      this isn't safe because tcf_block_put() unlike to tcf_destroy_chain()
      cannot be called second time.
      
      This patch set class->block to NULL after first tcf_block_put() and
      turn second call into no-op.
      
      Fixes: 6529eaba ("net: sched: introduce tcf block infractructure")
      Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      89890422
    • E
      ipv4: fix NULL dereference in free_fib_info_rcu() · 187e5b3a
      Eric Dumazet 提交于
      If fi->fib_metrics could not be allocated in fib_create_info()
      we attempt to dereference a NULL pointer in free_fib_info_rcu() :
      
          m = fi->fib_metrics;
          if (m != &dst_default_metrics && atomic_dec_and_test(&m->refcnt))
                  kfree(m);
      
      Before my recent patch, we used to call kfree(NULL) and nothing wrong
      happened.
      
      Instead of using RCU to defer freeing while we are under memory stress,
      it seems better to take immediate action.
      
      This was reported by syzkaller team.
      
      Fixes: 3fb07daf ("ipv4: add reference counting to metrics")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      187e5b3a
    • E
      ipv6: fix NULL dereference in ip6_route_dev_notify() · 12d94a80
      Eric Dumazet 提交于
      Based on a syzkaller report [1], I found that a per cpu allocation
      failure in snmp6_alloc_dev() would then lead to NULL dereference in
      ip6_route_dev_notify().
      
      It seems this is a very old bug, thus no Fixes tag in this submission.
      
      Let's add in6_dev_put_clear() helper, as we will probably use
      it elsewhere (once available/present in net-next)
      
      [1]
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 17294 Comm: syz-executor6 Not tainted 4.13.0-rc2+ #10
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      task: ffff88019f456680 task.stack: ffff8801c6e58000
      RIP: 0010:__read_once_size include/linux/compiler.h:250 [inline]
      RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline]
      RIP: 0010:refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178
      RSP: 0018:ffff8801c6e5f1b0 EFLAGS: 00010202
      RAX: 0000000000000037 RBX: dffffc0000000000 RCX: ffffc90005d25000
      RDX: ffff8801c6e5f218 RSI: ffffffff82342bbf RDI: 0000000000000001
      RBP: ffff8801c6e5f240 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff10038dcbe37
      R13: 0000000000000006 R14: 0000000000000001 R15: 00000000000001b8
      FS:  00007f21e0429700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001ddbc22000 CR3: 00000001d632b000 CR4: 00000000001426e0
      DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
      Call Trace:
       refcount_dec_and_test+0x1a/0x20 lib/refcount.c:211
       in6_dev_put include/net/addrconf.h:335 [inline]
       ip6_route_dev_notify+0x1c9/0x4a0 net/ipv6/route.c:3732
       notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x51/0x90 net/core/dev.c:1678
       call_netdevice_notifiers net/core/dev.c:1694 [inline]
       rollback_registered_many+0x91c/0xe80 net/core/dev.c:7107
       rollback_registered+0x1be/0x3c0 net/core/dev.c:7149
       register_netdevice+0xbcd/0xee0 net/core/dev.c:7587
       register_netdev+0x1a/0x30 net/core/dev.c:7669
       loopback_net_init+0x76/0x160 drivers/net/loopback.c:214
       ops_init+0x10a/0x570 net/core/net_namespace.c:118
       setup_net+0x313/0x710 net/core/net_namespace.c:294
       copy_net_ns+0x27c/0x580 net/core/net_namespace.c:418
       create_new_namespaces+0x425/0x880 kernel/nsproxy.c:107
       unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206
       SYSC_unshare kernel/fork.c:2347 [inline]
       SyS_unshare+0x653/0xfa0 kernel/fork.c:2297
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x4512c9
      RSP: 002b:00007f21e0428c08 EFLAGS: 00000216 ORIG_RAX: 0000000000000110
      RAX: ffffffffffffffda RBX: 0000000000718150 RCX: 00000000004512c9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000062020200
      RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000216 R12: 00000000004b973d
      R13: 00000000ffffffff R14: 000000002001d000 R15: 00000000000002dd
      Code: 50 2b 34 82 c7 00 f1 f1 f1 f1 c7 40 04 04 f2 f2 f2 c7 40 08 f3 f3
      f3 f3 e8 a1 43 39 ff 4c 89 f8 48 8b 95 70 ff ff ff 48 c1 e8 03 <0f> b6
      0c 18 4c 89 f8 83 e0 07 83 c0 03 38 c8 7c 08 84 c9 0f 85
      RIP: __read_once_size include/linux/compiler.h:250 [inline] RSP:
      ffff8801c6e5f1b0
      RIP: atomic_read arch/x86/include/asm/atomic.h:26 [inline] RSP:
      ffff8801c6e5f1b0
      RIP: refcount_sub_and_test+0x7d/0x1b0 lib/refcount.c:178 RSP:
      ffff8801c6e5f1b0
      ---[ end trace e441d046c6410d31 ]---
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      12d94a80
    • I
      ipv6: fib: Provide offload indication using nexthop flags · fe400799
      Ido Schimmel 提交于
      IPv6 routes currently lack nexthop flags as in IPv4. This has several
      implications.
      
      In the forwarding path, it requires us to check the carrier state of the
      nexthop device and potentially ignore a linkdown route, instead of
      checking for RTNH_F_LINKDOWN.
      
      It also requires capable drivers to use the user facing IPv6-specific
      route flags to provide offload indication, instead of using the nexthop
      flags as in IPv4.
      
      Add nexthop flags to IPv6 routes in the 40 bytes hole and use it to
      provide offload indication instead of the RTF_OFFLOAD flag, which is
      removed while it's still not part of any official kernel release.
      
      In the near future we would like to use the field for the
      RTNH_F_{LINKDOWN,DEAD} flags, but this change is more involved and might
      not be ready in time for the current cycle.
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fe400799
  3. 15 8月, 2017 8 次提交
    • E
      tcp: fix possible deadlock in TCP stack vs BPF filter · d624d276
      Eric Dumazet 提交于
      Filtering the ACK packet was not put at the right place.
      
      At this place, we already allocated a child and put it
      into accept queue.
      
      We absolutely need to call tcp_child_process() to release
      its spinlock, or we will deadlock at accept() or close() time.
      
      Found by syzkaller team (Thanks a lot !)
      
      Fixes: 8fac365f ("tcp: Add a tcp_filter hook before handle ack packet")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: Chenbo Feng <fengc@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d624d276
    • E
      dccp: purge write queue in dccp_destroy_sock() · 7749d4ff
      Eric Dumazet 提交于
      syzkaller reported that DCCP could have a non empty
      write queue at dismantle time.
      
      WARNING: CPU: 1 PID: 2953 at net/core/stream.c:199 sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 2953 Comm: syz-executor0 Not tainted 4.13.0-rc4+ #2
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:52
       panic+0x1e4/0x417 kernel/panic.c:180
       __warn+0x1c4/0x1d9 kernel/panic.c:541
       report_bug+0x211/0x2d0 lib/bug.c:183
       fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
       do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
       do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
       do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
       invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:846
      RIP: 0010:sk_stream_kill_queues+0x3ce/0x520 net/core/stream.c:199
      RSP: 0018:ffff8801d182f108 EFLAGS: 00010297
      RAX: ffff8801d1144140 RBX: ffff8801d13cb280 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff85137b00 RDI: ffff8801d13cb280
      RBP: ffff8801d182f148 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801d13cb4d0
      R13: ffff8801d13cb3b8 R14: ffff8801d13cb300 R15: ffff8801d13cb3b8
       inet_csk_destroy_sock+0x175/0x3f0 net/ipv4/inet_connection_sock.c:835
       dccp_close+0x84d/0xc10 net/dccp/proto.c:1067
       inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
       sock_release+0x8d/0x1e0 net/socket.c:597
       sock_close+0x16/0x20 net/socket.c:1126
       __fput+0x327/0x7e0 fs/file_table.c:210
       ____fput+0x15/0x20 fs/file_table.c:246
       task_work_run+0x18a/0x260 kernel/task_work.c:116
       exit_task_work include/linux/task_work.h:21 [inline]
       do_exit+0xa32/0x1b10 kernel/exit.c:865
       do_group_exit+0x149/0x400 kernel/exit.c:969
       get_signal+0x7e8/0x17e0 kernel/signal.c:2330
       do_signal+0x94/0x1ee0 arch/x86/kernel/signal.c:808
       exit_to_usermode_loop+0x21c/0x2d0 arch/x86/entry/common.c:157
       prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
       syscall_return_slowpath+0x3a7/0x450 arch/x86/entry/common.c:263
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7749d4ff
    • W
      ipv6: release rt6->rt6i_idev properly during ifdown · e5645f51
      Wei Wang 提交于
      When a dst is created by addrconf_dst_alloc() for a host route or an
      anycast route, dst->dev points to loopback dev while rt6->rt6i_idev
      points to a real device.
      When the real device goes down, the current cleanup code only checks for
      dst->dev and assumes rt6->rt6i_idev->dev is the same. This causes the
      refcount leak on the real device in the above situation.
      This patch makes sure to always release the refcount taken on
      rt6->rt6i_idev during dst_dev_put().
      
      Fixes: 587fea74 ("ipv6: mark DST_NOGC and remove the operation of
      dst_free()")
      Reported-by: NJohn Stultz <john.stultz@linaro.org>
      Tested-by: NJohn Stultz <john.stultz@linaro.org>
      Tested-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e5645f51
    • E
      af_key: do not use GFP_KERNEL in atomic contexts · 36f41f8f
      Eric Dumazet 提交于
      pfkey_broadcast() might be called from non process contexts,
      we can not use GFP_KERNEL in these cases [1].
      
      This patch partially reverts commit ba51b6be ("net: Fix RCU splat in
      af_key"), only keeping the GFP_ATOMIC forcing under rcu_read_lock()
      section.
      
      [1] : syzkaller reported :
      
      in_atomic(): 1, irqs_disabled(): 0, pid: 2932, name: syzkaller183439
      3 locks held by syzkaller183439/2932:
       #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff83b43888>] pfkey_sendmsg+0x4c8/0x9f0 net/key/af_key.c:3649
       #1:  (&pfk->dump_lock){+.+.+.}, at: [<ffffffff83b467f6>] pfkey_do_dump+0x76/0x3f0 net/key/af_key.c:293
       #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] spin_lock_bh include/linux/spinlock.h:304 [inline]
       #2:  (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [<ffffffff83957632>] xfrm_policy_walk+0x192/0xa30 net/xfrm/xfrm_policy.c:1028
      CPU: 0 PID: 2932 Comm: syzkaller183439 Not tainted 4.13.0-rc4+ #24
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:16 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:52
       ___might_sleep+0x2b2/0x470 kernel/sched/core.c:5994
       __might_sleep+0x95/0x190 kernel/sched/core.c:5947
       slab_pre_alloc_hook mm/slab.h:416 [inline]
       slab_alloc mm/slab.c:3383 [inline]
       kmem_cache_alloc+0x24b/0x6e0 mm/slab.c:3559
       skb_clone+0x1a0/0x400 net/core/skbuff.c:1037
       pfkey_broadcast_one+0x4b2/0x6f0 net/key/af_key.c:207
       pfkey_broadcast+0x4ba/0x770 net/key/af_key.c:281
       dump_sp+0x3d6/0x500 net/key/af_key.c:2685
       xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1042
       pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2695
       pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
       pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2722
       pfkey_process+0x606/0x710 net/key/af_key.c:2814
       pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3650
      sock_sendmsg_nosec net/socket.c:633 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:643
       ___sys_sendmsg+0x755/0x890 net/socket.c:2035
       __sys_sendmsg+0xe5/0x210 net/socket.c:2069
       SYSC_sendmsg net/socket.c:2080 [inline]
       SyS_sendmsg+0x2d/0x50 net/socket.c:2076
       entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x445d79
      RSP: 002b:00007f32447c1dc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445d79
      RDX: 0000000000000000 RSI: 000000002023dfc8 RDI: 0000000000000008
      RBP: 0000000000000086 R08: 00007f32447c2700 R09: 00007f32447c2700
      R10: 00007f32447c2700 R11: 0000000000000202 R12: 0000000000000000
      R13: 00007ffe33edec4f R14: 00007f32447c29c0 R15: 0000000000000000
      
      Fixes: ba51b6be ("net: Fix RCU splat in af_key")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36f41f8f
    • S
      tcp: ulp: avoid module refcnt leak in tcp_set_ulp · 539a06ba
      Sabrina Dubroca 提交于
      __tcp_ulp_find_autoload returns tcp_ulp_ops after taking a reference on
      the module. Then, if ->init fails, tcp_set_ulp propagates the error but
      nothing releases that reference.
      
      Fixes: 734942cc ("tcp: ULP infrastructure")
      Signed-off-by: NSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      539a06ba
    • J
      tipc: avoid inheriting msg_non_seq flag when message is returned · 59a361bc
      Jon Paul Maloy 提交于
      In the function msg_reverse(), we reverse the header while trying to
      reuse the original buffer whenever possible. Those rejected/returned
      messages are always transmitted as unicast, but the msg_non_seq field
      is not explicitly set to zero as it should be.
      
      We have seen cases where multicast senders set the message type to
      "NOT dest_droppable", meaning that a multicast message shorter than
      one MTU will be returned, e.g., during receive buffer overflow, by
      reusing the original buffer. This has the effect that even the
      'msg_non_seq' field is inadvertently inherited by the rejected message,
      although it is now sent as a unicast message. This again leads the
      receiving unicast link endpoint to steer the packet toward the broadcast
      link receive function, where it is dropped. The affected unicast link is
      thereafter (after 100 failed retransmissions) declared 'stale' and
      reset.
      
      We fix this by unconditionally setting the 'msg_non_seq' flag to zero
      for all rejected/returned messages.
      Reported-by: NCanh Duc Luu <canh.d.luu@dektech.com.au>
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      59a361bc
    • J
      tipc: accept PACKET_MULTICAST packets · fed5f571
      Jon Paul Maloy 提交于
      On L2 bearers, the TIPC broadcast function is sending out packets using
      the corresponding L2 broadcast address. At reception, we filter such
      packets under the assumption that they will also be delivered as
      broadcast packets.
      
      This assumption doesn't always hold true. Under high load, we have seen
      that a switch may convert the destination address and deliver the packet
      as a PACKET_MULTICAST, something leading to inadvertently dropped
      packets and a stale and reset broadcast link.
      
      We fix this by extending the reception filtering to accept packets of
      type PACKET_MULTICAST.
      Signed-off-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      fed5f571
    • F
      ipv4: route: fix inet_rtm_getroute induced crash · 2c87d63a
      Florian Westphal 提交于
      "ip route get $daddr iif eth0 from $saddr" causes:
       BUG: KASAN: use-after-free in ip_route_input_rcu+0x1535/0x1b50
       Call Trace:
        ip_route_input_rcu+0x1535/0x1b50
        ip_route_input_noref+0xf9/0x190
        tcp_v4_early_demux+0x1a4/0x2b0
        ip_rcv+0xbcb/0xc05
        __netif_receive_skb+0x9c/0xd0
        netif_receive_skb_internal+0x5a8/0x890
      
      Problem is that inet_rtm_getroute calls either ip_route_input_rcu (if an
      iif was provided) or ip_route_output_key_hash_rcu.
      
      But ip_route_input_rcu, unlike ip_route_output_key_hash_rcu, already
      associates the dst_entry with the skb.  This clears the SKB_DST_NOREF
      bit (i.e. skb_dst_drop will release/free the entry while it should not).
      
      Thus only set the dst if we called ip_route_output_key_hash_rcu().
      
      I tested this patch by running:
       while true;do ip r get 10.0.1.2;done > /dev/null &
       while true;do ip r get 10.0.1.2 iif eth0  from 10.0.1.1;done > /dev/null &
      ... and saw no crash or memory leak.
      
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: David Ahern <dsahern@gmail.com>
      Fixes: ba52d61e ("ipv4: route: restore skb_dst_set in inet_rtm_getroute")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Acked-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2c87d63a
  4. 14 8月, 2017 4 次提交
  5. 12 8月, 2017 14 次提交