1. 02 3月, 2017 12 次提交
    • E
      ipv6: orphan skbs in reassembly unit · 48cac18e
      Eric Dumazet 提交于
      Andrey reported a use-after-free in IPv6 stack.
      
      Issue here is that we free the socket while it still has skb
      in TX path and in some queues.
      
      It happens here because IPv6 reassembly unit messes skb->truesize,
      breaking skb_set_owner_w() badly.
      
      We fixed a similar issue for IPV4 in commit 8282f274 ("inet: frag:
      Always orphan skbs inside ip_defrag()")
      Acked-by: NJoe Stringer <joe@ovn.org>
      
      ==================================================================
      BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
      Read of size 8 at addr ffff880062da0060 by task a.out/4140
      
      page:ffffea00018b6800 count:1 mapcount:0 mapping:          (null)
      index:0x0 compound_mapcount: 0
      flags: 0x100000000008100(slab|head)
      raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013
      raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000
      page dumped because: kasan: bad access detected
      
      CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15
       dump_stack+0x292/0x398 lib/dump_stack.c:51
       describe_address mm/kasan/report.c:262
       kasan_report_error+0x121/0x560 mm/kasan/report.c:370
       kasan_report mm/kasan/report.c:392
       __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
       sock_flag ./arch/x86/include/asm/bitops.h:324
       sock_wfree+0x118/0x120 net/core/sock.c:1631
       skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
       skb_release_all+0x15/0x60 net/core/skbuff.c:668
       __kfree_skb+0x15/0x20 net/core/skbuff.c:684
       kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
       inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
       inet_frag_put ./include/net/inet_frag.h:133
       nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
       ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
       nf_hook_entry_hookfn ./include/linux/netfilter.h:102
       nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
       nf_hook ./include/linux/netfilter.h:212
       __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
       ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
       ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
       ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
       rawv6_push_pending_frames net/ipv6/raw.c:613
       rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635
       sock_sendmsg+0xca/0x110 net/socket.c:645
       sock_write_iter+0x326/0x620 net/socket.c:848
       new_sync_write fs/read_write.c:499
       __vfs_write+0x483/0x760 fs/read_write.c:512
       vfs_write+0x187/0x530 fs/read_write.c:560
       SYSC_write fs/read_write.c:607
       SyS_write+0xfb/0x230 fs/read_write.c:599
       entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
      RIP: 0033:0x7ff26e6f5b79
      RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79
      RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003
      RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000
      R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003
      
      The buggy address belongs to the object at ffff880062da0000
       which belongs to the cache RAWv6 of size 1504
      The buggy address ffff880062da0060 is located 96 bytes inside
       of 1504-byte region [ffff880062da0000, ffff880062da05e0)
      
      Freed by task 4113:
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
       save_stack+0x43/0xd0 mm/kasan/kasan.c:502
       set_track mm/kasan/kasan.c:514
       kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
       slab_free_hook mm/slub.c:1352
       slab_free_freelist_hook mm/slub.c:1374
       slab_free mm/slub.c:2951
       kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
       sk_prot_free net/core/sock.c:1377
       __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
       sk_destruct+0x47/0x80 net/core/sock.c:1460
       __sk_free+0x57/0x230 net/core/sock.c:1468
       sk_free+0x23/0x30 net/core/sock.c:1479
       sock_put ./include/net/sock.h:1638
       sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
       rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
       inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
       inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
       sock_release+0x8d/0x1e0 net/socket.c:599
       sock_close+0x16/0x20 net/socket.c:1063
       __fput+0x332/0x7f0 fs/file_table.c:208
       ____fput+0x15/0x20 fs/file_table.c:244
       task_work_run+0x19b/0x270 kernel/task_work.c:116
       exit_task_work ./include/linux/task_work.h:21
       do_exit+0x186b/0x2800 kernel/exit.c:839
       do_group_exit+0x149/0x420 kernel/exit.c:943
       SYSC_exit_group kernel/exit.c:954
       SyS_exit_group+0x1d/0x20 kernel/exit.c:952
       entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
      
      Allocated by task 4115:
       save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
       save_stack+0x43/0xd0 mm/kasan/kasan.c:502
       set_track mm/kasan/kasan.c:514
       kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605
       kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
       slab_post_alloc_hook mm/slab.h:432
       slab_alloc_node mm/slub.c:2708
       slab_alloc mm/slub.c:2716
       kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721
       sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334
       sk_alloc+0x105/0x1010 net/core/sock.c:1396
       inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183
       __sock_create+0x4f6/0x880 net/socket.c:1199
       sock_create net/socket.c:1239
       SYSC_socket net/socket.c:1269
       SyS_socket+0xf9/0x230 net/socket.c:1249
       entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
      
      Memory state around the buggy address:
       ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                             ^
       ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      48cac18e
    • E
      net: net_enable_timestamp() can be called from irq contexts · 13baa00a
      Eric Dumazet 提交于
      It is now very clear that silly TCP listeners might play with
      enabling/disabling timestamping while new children are added
      to their accept queue.
      
      Meaning net_enable_timestamp() can be called from BH context
      while current state of the static key is not enabled.
      
      Lets play safe and allow all contexts.
      
      The work queue is scheduled only under the problematic cases,
      which are the static key enable/disable transition, to not slow down
      critical paths.
      
      This extends and improves what we did in commit 5fa8bbda ("net: use
      a work queue to defer net_disable_timestamp() work")
      
      Fixes: b90e5794 ("net: dont call jump_label_dec from irq context")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      13baa00a
    • A
      net: don't call strlen() on the user buffer in packet_bind_spkt() · 540e2894
      Alexander Potapenko 提交于
      KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
      uninitialized memory in packet_bind_spkt():
      Acked-by: NEric Dumazet <edumazet@google.com>
      
      ==================================================================
      BUG: KMSAN: use of unitialized memory
      CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
      01/01/2011
       0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48
       ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550
       0000000000000000 0000000000000092 00000000ec400911 0000000000000002
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82559ae8>] dump_stack+0x238/0x290 lib/dump_stack.c:51
       [<ffffffff818a6626>] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
       [<ffffffff818a783b>] __msan_warning+0x5b/0xb0
      mm/kmsan/kmsan_instr.c:424
       [<     inline     >] strlen lib/string.c:484
       [<ffffffff8259b58d>] strlcpy+0x9d/0x200 lib/string.c:144
       [<ffffffff84b2eca4>] packet_bind_spkt+0x144/0x230
      net/packet/af_packet.c:3132
       [<ffffffff84242e4d>] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
       [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
      arch/x86/entry/entry_64.o:?
      chained origin: 00000000eba00911
       [<ffffffff810bb787>] save_stack_trace+0x27/0x50
      arch/x86/kernel/stacktrace.c:67
       [<     inline     >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
       [<     inline     >] kmsan_save_stack mm/kmsan/kmsan.c:334
       [<ffffffff818a59f8>] kmsan_internal_chain_origin+0x118/0x1e0
      mm/kmsan/kmsan.c:527
       [<ffffffff818a7773>] __msan_set_alloca_origin4+0xc3/0x130
      mm/kmsan/kmsan_instr.c:380
       [<ffffffff84242b69>] SYSC_bind+0x129/0x5f0 net/socket.c:1356
       [<ffffffff84242a22>] SyS_bind+0x82/0xa0 net/socket.c:1356
       [<ffffffff8515991b>] entry_SYSCALL_64_fastpath+0x13/0x8f
      arch/x86/entry/entry_64.o:?
      origin description: ----address@SYSC_bind (origin=00000000eb400911)
      ==================================================================
      (the line numbers are relative to 4.8-rc6, but the bug persists
      upstream)
      
      , when I run the following program as root:
      
      =====================================
       #include <string.h>
       #include <sys/socket.h>
       #include <netpacket/packet.h>
       #include <net/ethernet.h>
      
       int main() {
         struct sockaddr addr;
         memset(&addr, 0xff, sizeof(addr));
         addr.sa_family = AF_PACKET;
         int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
         bind(fd, &addr, sizeof(addr));
         return 0;
       }
      =====================================
      
      This happens because addr.sa_data copied from the userspace is not
      zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
      results in calling strlen() on the kernel copy of that non-terminated
      buffer.
      Signed-off-by: NAlexander Potapenko <glider@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540e2894
    • M
      net: bridge: allow IPv6 when multicast flood is disabled · 8953de2f
      Mike Manning 提交于
      Even with multicast flooding turned off, IPv6 ND should still work so
      that IPv6 connectivity is provided. Allow this by continuing to flood
      multicast traffic originated by us.
      
      Fixes: b6cb5ac8 ("net: bridge: add per-port multicast flood flag")
      Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NMike Manning <mmanning@brocade.com>
      Acked-by: NNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8953de2f
    • E
      tcp/dccp: block BH for SYN processing · 449809a6
      Eric Dumazet 提交于
      SYN processing really was meant to be handled from BH.
      
      When I got rid of BH blocking while processing socket backlog
      in commit 5413d1ba ("net: do not block BH while processing socket
      backlog"), I forgot that a malicious user could transition to TCP_LISTEN
      from a state that allowed (SYN) packets to be parked in the socket
      backlog while socket is owned by the thread doing the listen() call.
      
      Sure enough syzkaller found this and reported the bug ;)
      
      =================================
      [ INFO: inconsistent lock state ]
      4.10.0+ #60 Not tainted
      ---------------------------------
      inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
      syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
       (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
      [<ffffffff83a6a370>] spin_lock include/linux/spinlock.h:299 [inline]
       (&(&hashinfo->ehash_locks[i])->rlock){+.?...}, at:
      [<ffffffff83a6a370>] inet_ehash_insert+0x240/0xad0
      net/ipv4/inet_hashtables.c:407
      {IN-SOFTIRQ-W} state was registered at:
        mark_irqflags kernel/locking/lockdep.c:2923 [inline]
        __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
        lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
        spin_lock include/linux/spinlock.h:299 [inline]
        inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
        reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
        inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
        tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
        tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
        tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
        tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
        tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
        ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
        NF_HOOK include/linux/netfilter.h:257 [inline]
        ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
        dst_input include/net/dst.h:492 [inline]
        ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
        NF_HOOK include/linux/netfilter.h:257 [inline]
        ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
        __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
        __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
        netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
        napi_skb_finish net/core/dev.c:4602 [inline]
        napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
        e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
        e1000_clean_rx_irq+0x5e0/0x1490
      drivers/net/ethernet/intel/e1000/e1000_main.c:4489
        e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
        napi_poll net/core/dev.c:5171 [inline]
        net_rx_action+0xe70/0x1900 net/core/dev.c:5236
        __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
        invoke_softirq kernel/softirq.c:364 [inline]
        irq_exit+0x19e/0x1d0 kernel/softirq.c:405
        exiting_irq arch/x86/include/asm/apic.h:658 [inline]
        do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
        ret_from_intr+0x0/0x20
        native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
        arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
        default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
        arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
        default_idle_call+0x36/0x60 kernel/sched/idle.c:96
        cpuidle_idle_call kernel/sched/idle.c:154 [inline]
        do_idle+0x348/0x440 kernel/sched/idle.c:243
        cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
        start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
        verify_cpu+0x0/0xfc
      irq event stamp: 1741
      hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
      __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
      [inline]
      hardirqs last  enabled at (1741): [<ffffffff84d49d77>]
      _raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
      hardirqs last disabled at (1740): [<ffffffff84d4a732>]
      __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
      hardirqs last disabled at (1740): [<ffffffff84d4a732>]
      _raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
      softirqs last  enabled at (1738): [<ffffffff84d4deff>]
      __do_softirq+0x7cf/0xb7d kernel/softirq.c:310
      softirqs last disabled at (1571): [<ffffffff84d4b92c>]
      do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&hashinfo->ehash_locks[i])->rlock);
        <Interrupt>
          lock(&(&hashinfo->ehash_locks[i])->rlock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor0/5090:
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>] lock_sock
      include/net/sock.h:1460 [inline]
       #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83406b43>]
      sock_setsockopt+0x233/0x1e40 net/core/sock.c:683
      
      stack backtrace:
      CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:15 [inline]
       dump_stack+0x292/0x398 lib/dump_stack.c:51
       print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
       valid_state kernel/locking/lockdep.c:2400 [inline]
       mark_lock_irq kernel/locking/lockdep.c:2602 [inline]
       mark_lock+0xf30/0x1410 kernel/locking/lockdep.c:3065
       mark_irqflags kernel/locking/lockdep.c:2941 [inline]
       __lock_acquire+0x6dc/0x3270 kernel/locking/lockdep.c:3295
       lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
       spin_lock include/linux/spinlock.h:299 [inline]
       inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
       reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
       inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
       dccp_v6_conn_request+0xada/0x11b0 net/dccp/ipv6.c:380
       dccp_rcv_state_process+0x51e/0x1660 net/dccp/input.c:606
       dccp_v6_do_rcv+0x213/0x350 net/dccp/ipv6.c:632
       sk_backlog_rcv include/net/sock.h:896 [inline]
       __release_sock+0x127/0x3a0 net/core/sock.c:2052
       release_sock+0xa5/0x2b0 net/core/sock.c:2539
       sock_setsockopt+0x60f/0x1e40 net/core/sock.c:1016
       SYSC_setsockopt net/socket.c:1782 [inline]
       SyS_setsockopt+0x2fb/0x3a0 net/socket.c:1765
       entry_SYSCALL_64_fastpath+0x1f/0xc2
      RIP: 0033:0x4458b9
      RSP: 002b:00007fe8b26c2b58 EFLAGS: 00000292 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00000000004458b9
      RDX: 000000000000001a RSI: 0000000000000001 RDI: 0000000000000006
      RBP: 00000000006e2110 R08: 0000000000000010 R09: 0000000000000000
      R10: 00000000208c3000 R11: 0000000000000292 R12: 0000000000708000
      R13: 0000000020000000 R14: 0000000000001000 R15: 0000000000000000
      
      Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Acked-by: NSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      449809a6
    • Y
      bridge: Fix error path in nbp_vlan_init · df2c4334
      Yotam Gigi 提交于
      Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
      call failes, the vlan_hash wouldn't be destroyed before inited.
      
      Fixes: efa5356b ("bridge: per vlan dst_metadata netlink support")
      CC: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: NYotam Gigi <yotamg@mellanox.com>
      Acked-by: NRoopa Prabhu <roopa@cumulusnetworks.com>
      Reviewed-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      df2c4334
    • L
      net: route: add missing nla_policy entry for RTA_MARK attribute · 3b45a410
      Liping Zhang 提交于
      This will add stricter validating for RTA_MARK attribute.
      Signed-off-by: NLiping Zhang <zlpnobody@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3b45a410
    • F
      net/ipv6: avoid possible dead locking on addr_gen_mode sysctl · 8c171d6c
      Felix Jia 提交于
      The addr_gen_mode variable can be accessed by both sysctl and netlink.
      Repleacd rtnl_lock() with rtnl_trylock() protect the sysctl operation to
      avoid the possbile dead lock.`
      Signed-off-by: NFelix Jia <felix.jia@alliedtelesis.co.nz>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8c171d6c
    • E
      net: solve a NAPI race · 39e6c820
      Eric Dumazet 提交于
      While playing with mlx4 hardware timestamping of RX packets, I found
      that some packets were received by TCP stack with a ~200 ms delay...
      
      Since the timestamp was provided by the NIC, and my probe was added
      in tcp_v4_rcv() while in BH handler, I was confident it was not
      a sender issue, or a drop in the network.
      
      This would happen with a very low probability, but hurting RPC
      workloads.
      
      A NAPI driver normally arms the IRQ after the napi_complete_done(),
      after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
      it.
      
      Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
      while IRQ are not disabled, we might have later an IRQ firing and
      finding this bit set, right before napi_complete_done() clears it.
      
      This can happen with busy polling users, or if gro_flush_timeout is
      used. But some other uses of napi_schedule() in drivers can cause this
      as well.
      
      thread 1                                 thread 2 (could be on same cpu, or not)
      
      // busy polling or napi_watchdog()
      napi_schedule();
      ...
      napi->poll()
      
      device polling:
      read 2 packets from ring buffer
                                                Additional 3rd packet is
      available.
                                                device hard irq
      
                                                // does nothing because
      NAPI_STATE_SCHED bit is owned by thread 1
                                                napi_schedule();
      
      napi_complete_done(napi, 2);
      rearm_irq();
      
      Note that rearm_irq() will not force the device to send an additional
      IRQ for the packet it already signaled (3rd packet in my example)
      
      This patch adds a new NAPI_STATE_MISSED bit, that napi_schedule_prep()
      can set if it could not grab NAPI_STATE_SCHED
      
      Then napi_complete_done() properly reschedules the napi to make sure
      we do not miss something.
      
      Since we manipulate multiple bits at once, use cmpxchg() like in
      sk_busy_loop() to provide proper transactions.
      
      In v2, I changed napi_watchdog() to use a relaxed variant of
      napi_schedule_prep() : No need to set NAPI_STATE_MISSED from this point.
      
      In v3, I added more details in the changelog and clears
      NAPI_STATE_MISSED in busy_poll_stop()
      
      In v4, I added the ideas given by Alexander Duyck in v3 review
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      39e6c820
    • Z
      rds: ib: add the static type to the variables · 4f7bfb39
      Zhu Yanjun 提交于
      The variables rds_ib_mr_1m_pool_size and rds_ib_mr_8k_pool_size
      are used only in the ib.c file. As such, the static type is
      added to limit them in this file.
      
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: NZhu Yanjun <yanjun.zhu@oracle.com>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4f7bfb39
    • X
      sctp: call rcu_read_lock before checking for duplicate transport nodes · 5179b266
      Xin Long 提交于
      Commit cd2b7087 ("sctp: check duplicate node before inserting a
      new transport") called rhltable_lookup() to check for the duplicate
      transport node in transport rhashtable.
      
      But rhltable_lookup() doesn't call rcu_read_lock inside, it could cause
      a use-after-free issue if it tries to dereference the node that another
      cpu has freed it. Note that sock lock can not avoid this as it is per
      sock.
      
      This patch is to fix it by calling rcu_read_lock before checking for
      duplicate transport nodes.
      
      Fixes: cd2b7087 ("sctp: check duplicate node before inserting a new transport")
      Reported-by: NAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5179b266
    • D
      rxrpc: Fix deadlock between call creation and sendmsg/recvmsg · 540b1c48
      David Howells 提交于
      All the routines by which rxrpc is accessed from the outside are serialised
      by means of the socket lock (sendmsg, recvmsg, bind,
      rxrpc_kernel_begin_call(), ...) and this presents a problem:
      
       (1) If a number of calls on the same socket are in the process of
           connection to the same peer, a maximum of four concurrent live calls
           are permitted before further calls need to wait for a slot.
      
       (2) If a call is waiting for a slot, it is deep inside sendmsg() or
           rxrpc_kernel_begin_call() and the entry function is holding the socket
           lock.
      
       (3) sendmsg() and recvmsg() or the in-kernel equivalents are prevented
           from servicing the other calls as they need to take the socket lock to
           do so.
      
       (4) The socket is stuck until a call is aborted and makes its slot
           available to the waiter.
      
      Fix this by:
      
       (1) Provide each call with a mutex ('user_mutex') that arbitrates access
           by the users of rxrpc separately for each specific call.
      
       (2) Make rxrpc_sendmsg() and rxrpc_recvmsg() unlock the socket as soon as
           they've got a call and taken its mutex.
      
           Note that I'm returning EWOULDBLOCK from recvmsg() if MSG_DONTWAIT is
           set but someone else has the lock.  Should I instead only return
           EWOULDBLOCK if there's nothing currently to be done on a socket, and
           sleep in this particular instance because there is something to be
           done, but we appear to be blocked by the interrupt handler doing its
           ping?
      
       (3) Make rxrpc_new_client_call() unlock the socket after allocating a new
           call, locking its user mutex and adding it to the socket's call tree.
           The call is returned locked so that sendmsg() can add data to it
           immediately.
      
           From the moment the call is in the socket tree, it is subject to
           access by sendmsg() and recvmsg() - even if it isn't connected yet.
      
       (4) Lock new service calls in the UDP data_ready handler (in
           rxrpc_new_incoming_call()) because they may already be in the socket's
           tree and the data_ready handler makes them live immediately if a user
           ID has already been preassigned.
      
           Note that the new call is locked before any notifications are sent
           that it is live, so doing mutex_trylock() *ought* to always succeed.
           Userspace is prevented from doing sendmsg() on calls that are in a
           too-early state in rxrpc_do_sendmsg().
      
       (5) Make rxrpc_new_incoming_call() return the call with the user mutex
           held so that a ping can be scheduled immediately under it.
      
           Note that it might be worth moving the ping call into
           rxrpc_new_incoming_call() and then we can drop the mutex there.
      
       (6) Make rxrpc_accept_call() take the lock on the call it is accepting and
           release the socket after adding the call to the socket's tree.  This
           is slightly tricky as we've dequeued the call by that point and have
           to requeue it.
      
           Note that requeuing emits a trace event.
      
       (7) Make rxrpc_kernel_send_data() and rxrpc_kernel_recv_data() take the
           new mutex immediately and don't bother with the socket mutex at all.
      
      This patch has the nice bonus that calls on the same socket are now to some
      extent parallelisable.
      
      Note that we might want to move rxrpc_service_prealloc() calls out from the
      socket lock and give it its own lock, so that we don't hang progress in
      other calls because we're waiting for the allocator.
      
      We probably also want to avoid calling rxrpc_notify_socket() from within
      the socket lock (rxrpc_accept_call()).
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.c.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      540b1c48
  2. 28 2月, 2017 6 次提交
  3. 27 2月, 2017 18 次提交
    • S
      mac80211: fix typo in debug print · 09e0a2fe
      Sara Sharon 提交于
      Signed-off-by: NSara Sharon <sara.sharon@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      09e0a2fe
    • S
      mac80211: shorten debug message · 2595d259
      Sara Sharon 提交于
      Tracing is limited to 100 characters and this message passes
      the limit when there are a few buffered frames. Shorten it.
      Signed-off-by: NSara Sharon <sara.sharon@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      2595d259
    • E
      mac80211: fix power saving clients handling in iwlwifi · d98937f4
      Emmanuel Grumbach 提交于
      iwlwifi now supports RSS and can't let mac80211 track the
      PS state based on the Rx frames since they can come out of
      order. iwlwifi is now advertising AP_LINK_PS, and uses
      explicit notifications to teach mac80211 about the PS state
      of the stations and the PS poll / uAPSD trigger frames
      coming our way from the peers.
      
      Because of that, the TIM stopped being maintained in
      mac80211. I tried to fix this in commit c68df2e7
      ("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
      but that was later reverted by Felix in commit 6c18a6b4
      ("Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
      since it broke drivers that do not implement set_tim.
      
      Since none of the drivers that set AP_LINK_PS have the
      set_tim() handler set besides iwlwifi, I can bail out in
      __sta_info_recalc_tim if AP_LINK_PS AND .set_tim is not
      implemented.
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      d98937f4
    • F
      mac80211: don't handle filtered frames within a BA session · 890030d3
      Felix Fietkau 提交于
      When running a BA session, the driver (or the hardware) already takes
      care of retransmitting failed frames, since it has to keep the receiver
      reorder window in sync.
      
      Adding another layer of retransmit around that does not improve
      anything. In fact, it can only lead to some strong reordering with huge
      latency.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NFelix Fietkau <nbd@nbd.name>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      890030d3
    • S
      mac80211: don't reorder frames with SN smaller than SSN · b7540d8f
      Sara Sharon 提交于
      When RX aggregation starts, transmitter may continue send frames
      with SN smaller than SSN until the AddBA response is received.
      However, the reorder buffer is already initialized at this point,
      which will cause the drop of such frames as duplicates since the
      head SN of the reorder buffer is set to the SSN, which is bigger.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NSara Sharon <sara.sharon@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      b7540d8f
    • M
      mac80211: flush delayed work when entering suspend · a9e9200d
      Matt Chen 提交于
      The issue was found when entering suspend and resume.
      It triggers a warning in:
      mac80211/key.c: ieee80211_enable_keys()
      ...
      WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt ||
                   sdata->crypto_tx_tailroom_pending_dec);
      ...
      
      It points out sdata->crypto_tx_tailroom_pending_dec isn't cleaned up successfully
      in a delayed_work during suspend. Add a flush_delayed_work to fix it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: NMatt Chen <matt.chen@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      a9e9200d
    • J
      mac80211: fix packet statistics for fast-RX · 0328edc7
      Johannes Berg 提交于
      When adding per-CPU statistics, which added statistics back
      to mac80211 for the fast-RX path, I evidently forgot to add
      the "stats->packets++" line. The reason for that is likely
      that I didn't see it since it's done in defragmentation for
      the regular RX path.
      
      Add the missing line to properly count received packets in
      the fast-RX case.
      
      Fixes: c9c5962b ("mac80211: enable collecting station statistics per-CPU")
      Reported-by: NOren Givon <oren.givon@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      0328edc7
    • P
      l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv · 51fb60eb
      Paul Hüber 提交于
      l2tp_ip_backlog_recv may not return -1 if the packet gets dropped.
      The return value is passed up to ip_local_deliver_finish, which treats
      negative values as an IP protocol number for resubmission.
      Signed-off-by: NPaul Hüber <phueber@kernsp.in>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      51fb60eb
    • J
      xfrm: provide correct dst in xfrm_neigh_lookup · 1ecc9ad0
      Julian Anastasov 提交于
      Fix xfrm_neigh_lookup to provide dst->path to the
      neigh_lookup dst_ops method.
      
      When skb is provided, the IP address in packet should already
      match the dst->path address family. But for the non-skb case,
      we should consider the last tunnel address as nexthop address.
      
      Fixes: f894cbf8 ("net: Add optional SKB arg to dst_ops->neigh_lookup().")
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      1ecc9ad0
    • R
      net sched actions: do not overwrite status of action creation. · 37f1c63e
      Roman Mashak 提交于
      nla_memdup_cookie was overwriting err value, declared at function
      scope and earlier initialized with result of ->init(). At success
      nla_memdup_cookie() returns 0, and thus module refcnt decremented,
      although the action was installed.
      
      $ sudo tc actions add action pass index 1 cookie 1234
      $ sudo tc actions ls action gact
      
              action order 0: gact action pass
               random type none pass val 0
               index 1 ref 1 bind 0
      $
      $ lsmod
      Module                  Size  Used by
      act_gact               16384  0
      ...
      $
      $ sudo rmmod act_gact
      [   52.310283] ------------[ cut here ]------------
      [   52.312551] WARNING: CPU: 1 PID: 455 at kernel/module.c:1113
      module_put+0x99/0xa0
      [   52.316278] Modules linked in: act_gact(-) crct10dif_pclmul crc32_pclmul
      ghash_clmulni_intel psmouse pcbc evbug aesni_intel aes_x86_64 crypto_simd
      serio_raw glue_helper pcspkr cryptd
      [   52.322285] CPU: 1 PID: 455 Comm: rmmod Not tainted 4.10.0+ #11
      [   52.324261] Call Trace:
      [   52.325132]  dump_stack+0x63/0x87
      [   52.326236]  __warn+0xd1/0xf0
      [   52.326260]  warn_slowpath_null+0x1d/0x20
      [   52.326260]  module_put+0x99/0xa0
      [   52.326260]  tcf_hashinfo_destroy+0x7f/0x90
      [   52.326260]  gact_exit_net+0x27/0x40 [act_gact]
      [   52.326260]  ops_exit_list.isra.6+0x38/0x60
      [   52.326260]  unregister_pernet_operations+0x90/0xe0
      [   52.326260]  unregister_pernet_subsys+0x21/0x30
      [   52.326260]  tcf_unregister_action+0x68/0xa0
      [   52.326260]  gact_cleanup_module+0x17/0xa0f [act_gact]
      [   52.326260]  SyS_delete_module+0x1ba/0x220
      [   52.326260]  entry_SYSCALL_64_fastpath+0x1e/0xad
      [   52.326260] RIP: 0033:0x7f527ffae367
      [   52.326260] RSP: 002b:00007ffeb402a598 EFLAGS: 00000202 ORIG_RAX:
      00000000000000b0
      [   52.326260] RAX: ffffffffffffffda RBX: 0000559b069912a0 RCX: 00007f527ffae367
      [   52.326260] RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559b06991308
      [   52.326260] RBP: 0000000000000003 R08: 00007f5280264420 R09: 00007ffeb4029511
      [   52.326260] R10: 000000000000087b R11: 0000000000000202 R12: 00007ffeb4029580
      [   52.326260] R13: 0000000000000000 R14: 0000000000000000 R15: 0000559b069912a0
      [   52.354856] ---[ end trace 90d89401542b0db6 ]---
      $
      
      With the fix:
      
      $ sudo modprobe act_gact
      $ lsmod
      Module                  Size  Used by
      act_gact               16384  0
      ...
      $ sudo tc actions add action pass index 1 cookie 1234
      $ sudo tc actions ls action gact
      
              action order 0: gact action pass
               random type none pass val 0
               index 1 ref 1 bind 0
      $
      $ lsmod
      Module                  Size  Used by
      act_gact               16384  1
      ...
      $ sudo rmmod act_gact
      rmmod: ERROR: Module act_gact is in use
      $
      $ sudo /home/mrv/bin/tc actions del action gact index 1
      $ sudo rmmod act_gact
      $ lsmod
      Module                  Size  Used by
      $
      
      Fixes: 1045ba77 ("net sched actions: Add support for user cookies")
      Signed-off-by: NRoman Mashak <mrv@mojatatu.com>
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      37f1c63e
    • D
      rxrpc: Kernel calls get stuck in recvmsg · d7e15835
      David Howells 提交于
      Calls made through the in-kernel interface can end up getting stuck because
      of a missed variable update in a loop in rxrpc_recvmsg_data().  The problem
      is like this:
      
       (1) A new packet comes in and doesn't cause a notification to be given to
           the client as there's still another packet in the ring - the
           assumption being that if the client will keep drawing off data until
           the ring is empty.
      
       (2) The client is in rxrpc_recvmsg_data(), inside the big while loop that
           iterates through the packets.  This copies the window pointers into
           variables rather than using the information in the call struct
           because:
      
           (a) MSG_PEEK might be in effect;
      
           (b) we need a barrier after reading call->rx_top to pair with the
           	 barrier in the softirq routine that loads the buffer.
      
       (3) The reading of call->rx_top is done outside of the loop, and top is
           never updated whilst we're in the loop.  This means that even through
           there's a new packet available, we don't see it and may return -EFAULT
           to the caller - who will happily return to the scheduler and await the
           next notification.
      
       (4) No further notifications are forthcoming until there's an abort as the
           ring isn't empty.
      
      The fix is to move the read of call->rx_top inside the loop - but it needs
      to be done before the condition is checked.
      Reported-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Tested-by: NMarc Dionne <marc.dionne@auristor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d7e15835
    • R
      net sched actions: decrement module reference count after table flush. · edb9d1bf
      Roman Mashak 提交于
      When tc actions are loaded as a module and no actions have been installed,
      flushing them would result in actions removed from the memory, but modules
      reference count not being decremented, so that the modules would not be
      unloaded.
      
      Following is example with GACT action:
      
      % sudo modprobe act_gact
      % lsmod
      Module                  Size  Used by
      act_gact               16384  0
      %
      % sudo tc actions ls action gact
      %
      % sudo tc actions flush action gact
      % lsmod
      Module                  Size  Used by
      act_gact               16384  1
      % sudo tc actions flush action gact
      % lsmod
      Module                  Size  Used by
      act_gact               16384  2
      % sudo rmmod act_gact
      rmmod: ERROR: Module act_gact is in use
      ....
      
      After the fix:
      % lsmod
      Module                  Size  Used by
      act_gact               16384  0
      %
      % sudo tc actions add action pass index 1
      % sudo tc actions add action pass index 2
      % sudo tc actions add action pass index 3
      % lsmod
      Module                  Size  Used by
      act_gact               16384  3
      %
      % sudo tc actions flush action gact
      % lsmod
      Module                  Size  Used by
      act_gact               16384  0
      %
      % sudo tc actions flush action gact
      % lsmod
      Module                  Size  Used by
      act_gact               16384  0
      % sudo rmmod act_gact
      % lsmod
      Module                  Size  Used by
      %
      
      Fixes: f97017cd ("net-sched: Fix actions flushing")
      Signed-off-by: NRoman Mashak <mrv@mojatatu.com>
      Signed-off-by: NJamal Hadi Salim <jhs@mojatatu.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      edb9d1bf
    • X
      ipv6: check sk sk_type and protocol early in ip_mroute_set/getsockopt · 99253eb7
      Xin Long 提交于
      Commit 5e1859fb ("ipv4: ipmr: various fixes and cleanups") fixed
      the issue for ipv4 ipmr:
      
        ip_mroute_setsockopt() & ip_mroute_getsockopt() should not
        access/set raw_sk(sk)->ipmr_table before making sure the socket
        is a raw socket, and protocol is IGMP
      
      The same fix should be done for ipv6 ipmr as well.
      
      This patch can fix the panic caused by overwriting the same offset
      as ipmr_table as in raw_sk(sk) when accessing other type's socket
      by ip_mroute_setsockopt().
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      99253eb7
    • X
      sctp: set sin_port for addr param when checking duplicate address · 2e3ce5bc
      Xin Long 提交于
      Commit b8607805 ("sctp: not copying duplicate addrs to the assoc's
      bind address list") tried to check for duplicate address before copying
      to asoc's bind_addr list from global addr list.
      
      But all the addrs' sin_ports in global addr list are 0 while the addrs'
      sin_ports are bp->port in asoc's bind_addr list. It means even if it's
      a duplicate address, af->cmp_addr will still return 0 as the their
      sin_ports are different.
      
      This patch is to fix it by setting the sin_port for addr param with
      bp->port before comparing the addrs.
      
      Fixes: b8607805 ("sctp: not copying duplicate addrs to the assoc's bind address list")
      Reported-by: NWei Chen <weichen@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      2e3ce5bc
    • P
      netfilter: nft_set_bitmap: incorrect bitmap size · 13aa5a8f
      Pablo Neira Ayuso 提交于
      priv->bitmap_size stores the real bitmap size, instead of the full
      struct nft_bitmap object.
      
      Fixes: 665153ff ("netfilter: nf_tables: add bitmap set type")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      13aa5a8f
    • J
      netfilter: nf_ct_expect: Change __nf_ct_expect_check() return value. · 4b86c459
      Jarno Rajahalme 提交于
      Commit 4dee62b1 ("netfilter: nf_ct_expect: nf_ct_expect_insert()
      returns void") inadvertently changed the successful return value of
      nf_ct_expect_related_report() from 0 to 1 due to
      __nf_ct_expect_check() returning 1 on success.  Prevent this
      regression in the future by changing the return value of
      __nf_ct_expect_check() to 0 on success.
      Signed-off-by: NJarno Rajahalme <jarno@ovn.org>
      Acked-by: NJoe Stringer <joe@ovn.org>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      4b86c459
    • J
      ipv4: mask tos for input route · 6e28099d
      Julian Anastasov 提交于
      Restore the lost masking of TOS in input route code to
      allow ip rules to match it properly.
      
      Problem [1] noticed by Shmulik Ladkani <shmulik.ladkani@gmail.com>
      
      [1] http://marc.info/?t=137331755300040&r=1&w=2
      
      Fixes: 89aef892 ("ipv4: Delete routing cache.")
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6e28099d
    • J
      ipv4: add missing initialization for flowi4_uid · 8bcfd092
      Julian Anastasov 提交于
      Avoid matching of random stack value for uid when rules
      are looked up on input route or when RP filter is used.
      Problem should affect only setups that use ip rules with
      uid range.
      
      Fixes: 622ec2c9 ("net: core: add UID to flows, rules, and routes")
      Signed-off-by: NJulian Anastasov <ja@ssi.bg>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8bcfd092
  4. 25 2月, 2017 4 次提交