1. 12 9月, 2018 4 次提交
  2. 11 9月, 2018 5 次提交
  3. 10 9月, 2018 1 次提交
    • T
      ip: frags: fix crash in ip_do_fragment() · 5d407b07
      Taehee Yoo 提交于
      A kernel crash occurrs when defragmented packet is fragmented
      in ip_do_fragment().
      In defragment routine, skb_orphan() is called and
      skb->ip_defrag_offset is set. but skb->sk and
      skb->ip_defrag_offset are same union member. so that
      frag->sk is not NULL.
      Hence crash occurrs in skb->sk check routine in ip_do_fragment() when
      defragmented packet is fragmented.
      
      test commands:
         %iptables -t nat -I POSTROUTING -j MASQUERADE
         %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
      
      splat looks like:
      [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
      [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
      [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
      [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
      [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
      [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
      [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
      [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
      [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
      [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
      [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
      [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
      [  261.198158] Call Trace:
      [  261.199018]  ? dst_output+0x180/0x180
      [  261.205011]  ? save_trace+0x300/0x300
      [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
      [  261.213034]  ? sched_clock_local+0xd4/0x140
      [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
      [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
      [  261.227014]  ? find_held_lock+0x39/0x1c0
      [  261.233008]  ip_finish_output+0x51d/0xb50
      [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
      [  261.250152]  ? rcu_is_watching+0x77/0x120
      [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
      [  261.261033]  ? nf_hook_slow+0xb1/0x160
      [  261.265007]  ip_output+0x1c7/0x710
      [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
      [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.282996]  ? nf_hook_slow+0xb1/0x160
      [  261.287007]  raw_sendmsg+0x21f9/0x4420
      [  261.291008]  ? dst_output+0x180/0x180
      [  261.297003]  ? sched_clock_cpu+0x126/0x170
      [  261.301003]  ? find_held_lock+0x39/0x1c0
      [  261.306155]  ? stop_critical_timings+0x420/0x420
      [  261.311004]  ? check_flags.part.36+0x450/0x450
      [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.326142]  ? cyc2ns_read_end+0x10/0x10
      [  261.330139]  ? raw_bind+0x280/0x280
      [  261.334138]  ? sched_clock_cpu+0x126/0x170
      [  261.338995]  ? check_flags.part.36+0x450/0x450
      [  261.342991]  ? __lock_acquire+0x4500/0x4500
      [  261.348994]  ? inet_sendmsg+0x11c/0x500
      [  261.352989]  ? dst_output+0x180/0x180
      [  261.357012]  inet_sendmsg+0x11c/0x500
      [ ... ]
      
      v2:
       - clear skb->sk at reassembly routine.(Eric Dumarzet)
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Suggested-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NTaehee Yoo <ap420073@gmail.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      5d407b07
  4. 09 9月, 2018 1 次提交
    • V
      net/tls: Set count of SG entries if sk_alloc_sg returns -ENOSPC · 52ea992c
      Vakul Garg 提交于
      tls_sw_sendmsg() allocates plaintext and encrypted SG entries using
      function sk_alloc_sg(). In case the number of SG entries hit
      MAX_SKB_FRAGS, sk_alloc_sg() returns -ENOSPC and sets the variable for
      current SG index to '0'. This leads to calling of function
      tls_push_record() with 'sg_encrypted_num_elem = 0' and later causes
      kernel crash. To fix this, set the number of SG elements to the number
      of elements in plaintext/encrypted SG arrays in case sk_alloc_sg()
      returns -ENOSPC.
      
      Fixes: 3c4d7559 ("tls: kernel TLS support")
      Signed-off-by: NVakul Garg <vakul.garg@nxp.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      52ea992c
  5. 08 9月, 2018 2 次提交
  6. 07 9月, 2018 1 次提交
    • C
      tipc: call start and done ops directly in __tipc_nl_compat_dumpit() · 8f5c5fcf
      Cong Wang 提交于
      __tipc_nl_compat_dumpit() uses a netlink_callback on stack,
      so the only way to align it with other ->dumpit() call path
      is calling tipc_dump_start() and tipc_dump_done() directly
      inside it. Otherwise ->dumpit() would always get NULL from
      cb->args[].
      
      But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve
      net pointer, the cb->skb here doesn't set skb->sk, the net pointer
      is saved in msg->net instead, so introduce a helper function
      __tipc_dump_start() to pass in msg->net.
      
      Ying pointed out cb->args[0...3] are already used by other
      callbacks on this call path, so we can't use cb->args[0] any
      more, use cb->args[4] instead.
      
      Fixes: 9a07efa9 ("tipc: switch to rhashtable iterator")
      Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      8f5c5fcf
  7. 06 9月, 2018 5 次提交
    • J
      net/iucv: declare iucv_path_table_empty() as static · b7f41565
      Julian Wiedmann 提交于
      Fixes a compile warning.
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b7f41565
    • J
      net/af_iucv: fix skb handling on HiperTransport xmit error · b2f54394
      Julian Wiedmann 提交于
      When sending an skb, afiucv_hs_send() bails out on various error
      conditions. But currently the caller has no way of telling whether the
      skb was freed or not - resulting in potentially either
      a) leaked skbs from iucv_send_ctrl(), or
      b) double-free's from iucv_sock_sendmsg().
      
      As dev_queue_xmit() will always consume the skb (even on error), be
      consistent and also free the skb from all other error paths. This way
      callers no longer need to care about managing the skb.
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: NUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b2f54394
    • J
      net/af_iucv: drop inbound packets with invalid flags · 22244099
      Julian Wiedmann 提交于
      Inbound packets may have any combination of flag bits set in their iucv
      header. If we don't know how to handle a specific combination, drop the
      skb instead of leaking it.
      
      To clarify what error is returned in this case, replace the hard-coded
      0 with the corresponding macro.
      Signed-off-by: NJulian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      22244099
    • D
      net/sched: fix memory leak in act_tunnel_key_init() · ee28bb56
      Davide Caratti 提交于
      If users try to install act_tunnel_key 'set' rules with duplicate values
      of 'index', the tunnel metadata are allocated, but never released. Then,
      kmemleak complains as follows:
      
       # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
       # echo clear > /sys/kernel/debug/kmemleak
       # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
       Error: TC IDR already exists.
       We have an error talking to the kernel
       # echo scan > /sys/kernel/debug/kmemleak
       # cat /sys/kernel/debug/kmemleak
       unreferenced object 0xffff8800574e6c80 (size 256):
         comm "tc", pid 5617, jiffies 4298118009 (age 57.990s)
         hex dump (first 32 bytes):
           00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff  ................
           81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00  .$..............
         backtrace:
           [<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key]
           [<000000007d98fccd>] tcf_action_init_1+0x698/0xac0
           [<0000000099b8f7cc>] tcf_action_init+0x15c/0x590
           [<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2
           [<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0
           [<000000000bfe7575>] netlink_rcv_skb+0x124/0x350
           [<00000000edab656f>] netlink_unicast+0x40f/0x5d0
           [<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0
           [<0000000063d9d490>] sock_sendmsg+0xb3/0xf0
           [<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960
           [<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170
           [<00000000ce72e4b0>] do_syscall_64+0xa5/0x470
           [<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
           [<00000000fac1b476>] 0xffffffffffffffff
      
      This problem theoretically happens also in case users attempt to setup a
      geneve rule having wrong configuration data, or when the kernel fails to
      allocate 'params_new'. Ensure that tunnel_key_init() releases the tunnel
      metadata also in the above conditions.
      
      Addresses-Coverity-ID: 1373974 ("Resource leak")
      Fixes: d0f6dd8a ("net/sched: Introduce act_tunnel_key")
      Fixes: 0ed5269f ("net/sched: add tunnel option support to act_tunnel_key")
      Signed-off-by: NDavide Caratti <dcaratti@redhat.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ee28bb56
    • C
      tipc: orphan sock in tipc_release() · 0a3b8b2b
      Cong Wang 提交于
      Before we unlock the sock in tipc_release(), we have to
      detach sk->sk_socket from sk, otherwise a parallel
      tipc_sk_fill_sock_diag() could stil read it after we
      free this socket.
      
      Fixes: c30b70de ("tipc: implement socket diagnostics for AF_TIPC")
      Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      0a3b8b2b
  8. 05 9月, 2018 2 次提交
    • V
      net: sched: action_ife: take reference to meta module · 84cb8eb2
      Vlad Buslov 提交于
      Recent refactoring of add_metainfo() caused use_all_metadata() to add
      metainfo to ife action metalist without taking reference to module. This
      causes warning in module_put called from ife action cleanup function.
      
      Implement add_metainfo_and_get_ops() function that returns with reference
      to module taken if metainfo was added successfully, and call it from
      use_all_metadata(), instead of calling __add_metainfo() directly.
      
      Example warning:
      
      [  646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
      [  646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
      [  646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
      [  646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  646.440595] RIP: 0010:module_put+0x1cb/0x230
      [  646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
      [  646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
      [  646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
      [  646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
      [  646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
      [  646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
      [  646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
      [  646.508213] FS:  00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
      [  646.516961] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
      [  646.530595] Call Trace:
      [  646.533408]  ? find_symbol_in_section+0x260/0x260
      [  646.538509]  tcf_ife_cleanup+0x11b/0x200 [act_ife]
      [  646.543695]  tcf_action_cleanup+0x29/0xa0
      [  646.548078]  __tcf_action_put+0x5a/0xb0
      [  646.552289]  ? nla_put+0x65/0xe0
      [  646.555889]  __tcf_idr_release+0x48/0x60
      [  646.560187]  tcf_generic_walker+0x448/0x6b0
      [  646.564764]  ? tcf_action_dump_1+0x450/0x450
      [  646.569411]  ? __lock_is_held+0x84/0x110
      [  646.573720]  ? tcf_ife_walker+0x10c/0x20f [act_ife]
      [  646.578982]  tca_action_gd+0x972/0xc40
      [  646.583129]  ? tca_get_fill.constprop.17+0x250/0x250
      [  646.588471]  ? mark_lock+0xcf/0x980
      [  646.592324]  ? check_chain_key+0x140/0x1f0
      [  646.596832]  ? debug_show_all_locks+0x240/0x240
      [  646.601839]  ? memset+0x1f/0x40
      [  646.605350]  ? nla_parse+0xca/0x1a0
      [  646.609217]  tc_ctl_action+0x215/0x230
      [  646.613339]  ? tcf_action_add+0x220/0x220
      [  646.617748]  rtnetlink_rcv_msg+0x56a/0x6d0
      [  646.622227]  ? rtnl_fdb_del+0x3f0/0x3f0
      [  646.626466]  netlink_rcv_skb+0x18d/0x200
      [  646.630752]  ? rtnl_fdb_del+0x3f0/0x3f0
      [  646.634959]  ? netlink_ack+0x500/0x500
      [  646.639106]  netlink_unicast+0x2d0/0x370
      [  646.643409]  ? netlink_attachskb+0x340/0x340
      [  646.648050]  ? _copy_from_iter_full+0xe9/0x3e0
      [  646.652870]  ? import_iovec+0x11e/0x1c0
      [  646.657083]  netlink_sendmsg+0x3b9/0x6a0
      [  646.661388]  ? netlink_unicast+0x370/0x370
      [  646.665877]  ? netlink_unicast+0x370/0x370
      [  646.670351]  sock_sendmsg+0x6b/0x80
      [  646.674212]  ___sys_sendmsg+0x4a1/0x520
      [  646.678443]  ? copy_msghdr_from_user+0x210/0x210
      [  646.683463]  ? lock_downgrade+0x320/0x320
      [  646.687849]  ? debug_show_all_locks+0x240/0x240
      [  646.692760]  ? do_raw_spin_unlock+0xa2/0x130
      [  646.697418]  ? _raw_spin_unlock+0x24/0x30
      [  646.701798]  ? __handle_mm_fault+0x1819/0x1c10
      [  646.706619]  ? __pmd_alloc+0x320/0x320
      [  646.710738]  ? debug_show_all_locks+0x240/0x240
      [  646.715649]  ? restore_nameidata+0x7b/0xa0
      [  646.720117]  ? check_chain_key+0x140/0x1f0
      [  646.724590]  ? check_chain_key+0x140/0x1f0
      [  646.729070]  ? __fget_light+0xbc/0xd0
      [  646.733121]  ? __sys_sendmsg+0xd7/0x150
      [  646.737329]  __sys_sendmsg+0xd7/0x150
      [  646.741359]  ? __ia32_sys_shutdown+0x30/0x30
      [  646.746003]  ? up_read+0x53/0x90
      [  646.749601]  ? __do_page_fault+0x484/0x780
      [  646.754105]  ? do_syscall_64+0x1e/0x2c0
      [  646.758320]  do_syscall_64+0x72/0x2c0
      [  646.762353]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  646.767776] RIP: 0033:0x7f4163872150
      [  646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
      [  646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
      [  646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
      [  646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
      [  646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
      [  646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
      [  646.837360] irq event stamp: 6083
      [  646.841043] hardirqs last  enabled at (6081): [<ffffffff8c220a7d>] __call_rcu+0x17d/0x500
      [  646.849882] hardirqs last disabled at (6083): [<ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
      [  646.859775] softirqs last  enabled at (5968): [<ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
      [  646.868784] softirqs last disabled at (6082): [<ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
      [  646.878845] ---[ end trace b1b8c12ffe51e657 ]---
      
      Fixes: 5ffe57da ("act_ife: fix a potential deadlock")
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      84cb8eb2
    • C
      act_ife: fix a potential use-after-free · 6d784f16
      Cong Wang 提交于
      Immediately after module_put(), user could delete this
      module, so e->ops could be already freed before we call
      e->ops->release().
      
      Fix this by moving module_put() after ops->release().
      
      Fixes: ef6980b6 ("introduce IFE action")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      6d784f16
  9. 04 9月, 2018 6 次提交
    • Z
      tipc: correct spelling errors for tipc_topsrv_queue_evt() comments · a484ef34
      Zhenbo Gao 提交于
      tipc_conn_queue_evt -> tipc_topsrv_queue_evt
      tipc_send_work -> tipc_conn_send_work
      tipc_send_to_sock -> tipc_conn_send_to_sock
      Signed-off-by: NZhenbo Gao <zhenbo.gao@windriver.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a484ef34
    • Z
      tipc: correct spelling errors for struct tipc_bc_base's comment · 9cc1bf39
      Zhenbo Gao 提交于
      Trivial fix for two spelling mistakes.
      Signed-off-by: NZhenbo Gao <zhenbo.gao@windriver.com>
      Reviewed-by: NYing Xue <ying.xue@windriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      9cc1bf39
    • X
      sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel · 741880e1
      Xin Long 提交于
      When users set params.spp_address and get a trans, ipv6_flowlabel flag
      should be applied into this trans. But even if this one is not an ipv6
      trans, it should not go to apply it into all other transes of the asoc
      but simply ignore it.
      
      Fixes: 0b0dce7a ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      741880e1
    • X
      sctp: fix invalid reference to the index variable of the iterator · af8a2b8b
      Xin Long 提交于
      Now in sctp_apply_peer_addr_params(), if SPP_IPV6_FLOWLABEL flag is set
      and trans is NULL, it would use trans as the index variable to traverse
      transport_addr_list, then trans is set as the last transport of it.
      
      Later, if SPP_DSCP flag is set, it would enter into the wrong branch as
      trans is actually an invalid reference.
      
      So fix it by using a new index variable to traverse transport_addr_list
      for both SPP_DSCP and SPP_IPV6_FLOWLABEL flags process.
      
      Fixes: 0b0dce7a ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
      Reported-by: NJulia Lawall <julia.lawall@lip6.fr>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      af8a2b8b
    • V
      net: sched: null actions array pointer before releasing action · c10bbfae
      Vlad Buslov 提交于
      Currently, tcf_action_delete() nulls actions array pointer after putting
      and deleting it. However, if tcf_idr_delete_index() returns an error,
      pointer to action is not set to null. That results it being released second
      time in error handling code of tca_action_gd().
      
      Kasan error:
      
      [  807.367755] ==================================================================
      [  807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
      [  807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732
      
      [  807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G        W         4.19.0-rc1+ #799
      [  807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
      [  807.407948] Call Trace:
      [  807.410763]  dump_stack+0x92/0xeb
      [  807.414456]  print_address_description+0x70/0x360
      [  807.419549]  kasan_report+0x14d/0x300
      [  807.423582]  ? tc_setup_cb_call+0x14e/0x250
      [  807.428150]  tc_setup_cb_call+0x14e/0x250
      [  807.432539]  ? nla_put+0x65/0xe0
      [  807.436146]  fl_dump+0x394/0x3f0 [cls_flower]
      [  807.440890]  ? fl_tmplt_dump+0x140/0x140 [cls_flower]
      [  807.446327]  ? lock_downgrade+0x320/0x320
      [  807.450702]  ? lock_acquire+0xe2/0x220
      [  807.454819]  ? is_bpf_text_address+0x5/0x140
      [  807.459475]  ? memcpy+0x34/0x50
      [  807.462980]  ? nla_put+0x65/0xe0
      [  807.466582]  tcf_fill_node+0x341/0x430
      [  807.470717]  ? tcf_block_put+0xe0/0xe0
      [  807.474859]  tcf_node_dump+0xdb/0xf0
      [  807.478821]  fl_walk+0x8e/0x170 [cls_flower]
      [  807.483474]  tcf_chain_dump+0x35a/0x4d0
      [  807.487703]  ? tfilter_notify+0x170/0x170
      [  807.492091]  ? tcf_fill_node+0x430/0x430
      [  807.496411]  tc_dump_tfilter+0x362/0x3f0
      [  807.500712]  ? tc_del_tfilter+0x850/0x850
      [  807.505104]  ? kasan_unpoison_shadow+0x30/0x40
      [  807.509940]  ? __mutex_unlock_slowpath+0xcf/0x410
      [  807.515031]  netlink_dump+0x263/0x4f0
      [  807.519077]  __netlink_dump_start+0x2a0/0x300
      [  807.523817]  ? tc_del_tfilter+0x850/0x850
      [  807.528198]  rtnetlink_rcv_msg+0x46a/0x6d0
      [  807.532671]  ? rtnl_fdb_del+0x3f0/0x3f0
      [  807.536878]  ? tc_del_tfilter+0x850/0x850
      [  807.541280]  netlink_rcv_skb+0x18d/0x200
      [  807.545570]  ? rtnl_fdb_del+0x3f0/0x3f0
      [  807.549773]  ? netlink_ack+0x500/0x500
      [  807.553913]  netlink_unicast+0x2d0/0x370
      [  807.558212]  ? netlink_attachskb+0x340/0x340
      [  807.562855]  ? _copy_from_iter_full+0xe9/0x3e0
      [  807.567677]  ? import_iovec+0x11e/0x1c0
      [  807.571890]  netlink_sendmsg+0x3b9/0x6a0
      [  807.576192]  ? netlink_unicast+0x370/0x370
      [  807.580684]  ? netlink_unicast+0x370/0x370
      [  807.585154]  sock_sendmsg+0x6b/0x80
      [  807.589015]  ___sys_sendmsg+0x4a1/0x520
      [  807.593230]  ? copy_msghdr_from_user+0x210/0x210
      [  807.598232]  ? do_wp_page+0x174/0x880
      [  807.602276]  ? __handle_mm_fault+0x749/0x1c10
      [  807.607021]  ? __handle_mm_fault+0x1046/0x1c10
      [  807.611849]  ? __pmd_alloc+0x320/0x320
      [  807.615973]  ? check_chain_key+0x140/0x1f0
      [  807.620450]  ? check_chain_key+0x140/0x1f0
      [  807.624929]  ? __fget_light+0xbc/0xd0
      [  807.628970]  ? __sys_sendmsg+0xd7/0x150
      [  807.633172]  __sys_sendmsg+0xd7/0x150
      [  807.637201]  ? __ia32_sys_shutdown+0x30/0x30
      [  807.641846]  ? up_read+0x53/0x90
      [  807.645442]  ? __do_page_fault+0x484/0x780
      [  807.649949]  ? do_syscall_64+0x1e/0x2c0
      [  807.654164]  do_syscall_64+0x72/0x2c0
      [  807.658198]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  807.663625] RIP: 0033:0x7f42e9870150
      [  807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
      [  807.687328] RSP: 002b:00007ffdbf595b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [  807.695564] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f42e9870150
      [  807.703083] RDX: 0000000000000000 RSI: 00007ffdbf595b80 RDI: 0000000000000003
      [  807.710605] RBP: 00007ffdbf599d90 R08: 0000000000679bc0 R09: 000000000000000f
      [  807.718127] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdbf599d88
      [  807.725651] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      
      [  807.735048] Allocated by task 2687:
      [  807.738902]  kasan_kmalloc+0xa0/0xd0
      [  807.742852]  __kmalloc+0x118/0x2d0
      [  807.746615]  tcf_idr_create+0x44/0x320
      [  807.750738]  tcf_nat_init+0x41e/0x530 [act_nat]
      [  807.755638]  tcf_action_init_1+0x4e0/0x650
      [  807.760104]  tcf_action_init+0x1ce/0x2d0
      [  807.764395]  tcf_exts_validate+0x1d8/0x200
      [  807.768861]  fl_change+0x55a/0x26b4 [cls_flower]
      [  807.773845]  tc_new_tfilter+0x748/0xa20
      [  807.778051]  rtnetlink_rcv_msg+0x56a/0x6d0
      [  807.782517]  netlink_rcv_skb+0x18d/0x200
      [  807.786804]  netlink_unicast+0x2d0/0x370
      [  807.791095]  netlink_sendmsg+0x3b9/0x6a0
      [  807.795387]  sock_sendmsg+0x6b/0x80
      [  807.799240]  ___sys_sendmsg+0x4a1/0x520
      [  807.803445]  __sys_sendmsg+0xd7/0x150
      [  807.807473]  do_syscall_64+0x72/0x2c0
      [  807.811506]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  807.818776] Freed by task 2728:
      [  807.822283]  __kasan_slab_free+0x122/0x180
      [  807.826752]  kfree+0xf4/0x2f0
      [  807.830080]  __tcf_action_put+0x5a/0xb0
      [  807.834281]  tcf_action_put_many+0x46/0x70
      [  807.838747]  tca_action_gd+0x232/0xc40
      [  807.842862]  tc_ctl_action+0x215/0x230
      [  807.846977]  rtnetlink_rcv_msg+0x56a/0x6d0
      [  807.851444]  netlink_rcv_skb+0x18d/0x200
      [  807.855731]  netlink_unicast+0x2d0/0x370
      [  807.860021]  netlink_sendmsg+0x3b9/0x6a0
      [  807.864312]  sock_sendmsg+0x6b/0x80
      [  807.868166]  ___sys_sendmsg+0x4a1/0x520
      [  807.872372]  __sys_sendmsg+0xd7/0x150
      [  807.876401]  do_syscall_64+0x72/0x2c0
      [  807.880431]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      [  807.887704] The buggy address belongs to the object at ffff88033e636000
                      which belongs to the cache kmalloc-256 of size 256
      [  807.900909] The buggy address is located 0 bytes inside of
                      256-byte region [ffff88033e636000, ffff88033e636100)
      [  807.913155] The buggy address belongs to the page:
      [  807.918322] page:ffffea000cf98d80 count:1 mapcount:0 mapping:ffff88036f80ee00 index:0x0 compound_mapcount: 0
      [  807.928831] flags: 0x5fff8000008100(slab|head)
      [  807.933647] raw: 005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00
      [  807.942050] raw: 0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
      [  807.950456] page dumped because: kasan: bad access detected
      
      [  807.958240] Memory state around the buggy address:
      [  807.963405]  ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
      [  807.971288]  ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
      [  807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  807.994882]                    ^
      [  807.998477]  ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  808.006352]  ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
      [  808.014230] ==================================================================
      [  808.022108] Disabling lock debugging due to kernel taint
      
      Fixes: edfaf94f ("net_sched: improve and refactor tcf_action_put_many()")
      Signed-off-by: NVlad Buslov <vladbu@mellanox.com>
      Acked-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c10bbfae
    • H
      ip6_tunnel: respect ttl inherit for ip6tnl · 36feaac3
      Hangbin Liu 提交于
      man ip-tunnel ttl section says:
      0 is a special value meaning that packets inherit the TTL value.
      
      IPv4 tunnel respect this in ip_tunnel_xmit(), but IPv6 tunnel has not
      implement it yet. To make IPv6 behave consistently with IP tunnel,
      add ipv6 tunnel inherit support.
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      36feaac3
  10. 03 9月, 2018 11 次提交
    • E
      mac80211: shorten the IBSS debug messages · c6e57b38
      Emmanuel Grumbach 提交于
      When tracing is enabled, all the debug messages are recorded and must
      not exceed MAX_MSG_LEN (100) columns. Longer debug messages grant the
      user with:
      
      WARNING: CPU: 3 PID: 32642 at /tmp/wifi-core-20180806094828/src/iwlwifi-stack-dev/net/mac80211/./trace_msg.h:32 trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
      Workqueue: phy1 ieee80211_iface_work [mac80211]
       RIP: 0010:trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
       Call Trace:
        __sdata_dbg+0xbd/0x120 [mac80211]
        ieee80211_ibss_rx_queued_mgmt+0x15f/0x510 [mac80211]
        ieee80211_iface_work+0x21d/0x320 [mac80211]
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      c6e57b38
    • E
      mac80211: don't Tx a deauth frame if the AP forbade Tx · 6c18b27d
      Emmanuel Grumbach 提交于
      If the driver fails to properly prepare for the channel
      switch, mac80211 will disconnect. If the CSA IE had mode
      set to 1, it means that the clients are not allowed to send
      any Tx on the current channel, and that includes the
      deauthentication frame.
      
      Make sure that we don't send the deauthentication frame in
      this case.
      
      In iwlwifi, this caused a failure to flush queues since the
      firmware already closed the queues after having parsed the
      CSA IE. Then mac80211 would wait until the deauthentication
      frame would go out (drv_flush(drop=false)) and that would
      never happen.
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      6c18b27d
    • I
      mac80211: Fix station bandwidth setting after channel switch · 0007e943
      Ilan Peer 提交于
      When performing a channel switch flow for a managed interface, the
      flow did not update the bandwidth of the AP station and the rate
      scale algorithm. In case of a channel width downgrade, this would
      result with the rate scale algorithm using a bandwidth that does not
      match the interface channel configuration.
      
      Fix this by updating the AP station bandwidth and rate scaling algorithm
      before the actual channel change in case of a bandwidth downgrade, or
      after the actual channel change in case of a bandwidth upgrade.
      Signed-off-by: NIlan Peer <ilan.peer@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      0007e943
    • E
      mac80211: fix a race between restart and CSA flows · f3ffb6c3
      Emmanuel Grumbach 提交于
      We hit a problem with iwlwifi that was caused by a bug in
      mac80211. A bug in iwlwifi caused the firwmare to crash in
      certain cases in channel switch. Because of that bug,
      drv_pre_channel_switch would fail and trigger the restart
      flow.
      Now we had the hw restart worker which runs on the system's
      workqueue and the csa_connection_drop_work worker that runs
      on mac80211's workqueue that can run together. This is
      obviously problematic since the restart work wants to
      reconfigure the connection, while the csa_connection_drop_work
      worker does the exact opposite: it tries to disconnect.
      
      Fix this by cancelling the csa_connection_drop_work worker
      in the restart worker.
      
      Note that this can sound racy: we could have:
      
      driver   iface_work   CSA_work   restart_work
      +++++++++++++++++++++++++++++++++++++++++++++
                    |
       <--drv_cs ---|
      <FW CRASH!>
      -CS FAILED-->
                    |                       |
                    |                 cancel_work(CSA)
                 schedule                   |
                 CSA work                   |
                               |            |
                              Race between those 2
      
      But this is not possible because we flush the workqueue
      in the restart worker before we cancel the CSA worker.
      That would be bullet proof if we could guarantee that
      we schedule the CSA worker only from the iface_work
      which runs on the workqueue (and not on the system's
      workqueue), but unfortunately we do have an instance
      in which we schedule the CSA work outside the context
      of the workqueue (ieee80211_chswitch_done).
      
      Note also that we should probably cancel other workers
      like beacon_connection_loss_work and possibly others
      for different types of interfaces, at the very least,
      IBSS should suffer from the exact same problem, but for
      now, do the minimum to fix the actual bug that was actually
      experienced and reproduced.
      Signed-off-by: NEmmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      f3ffb6c3
    • D
      mac80211: fix WMM TXOP calculation · abd76d25
      Dreyfuss, Haim 提交于
      In commit 9236c4523e5b ("mac80211: limit wmm params to comply
      with ETSI requirements"), we have limited the WMM parameters to
      comply with 802.11 and ETSI standard.  Mistakenly the TXOP value
      was caluclated wrong.  Fix it by taking the minimum between
      802.11 to ETSI to make sure we are not violating both.
      
      Fixes: e552af05 ("mac80211: limit wmm params to comply with ETSI requirements")
      Signed-off-by: NHaim Dreyfuss <haim.dreyfuss@intel.com>
      Signed-off-by: NLuca Coelho <luciano.coelho@intel.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      abd76d25
    • D
      cfg80211: fix a type issue in ieee80211_chandef_to_operating_class() · 8442938c
      Dan Carpenter 提交于
      The "chandef->center_freq1" variable is a u32 but "freq" is a u16 so we
      are truncating away the high bits.  I noticed this bug because in commit
      9cf0a0b4b64a ("cfg80211: Add support for 60GHz band channels 5 and 6")
      we made "freq <= 56160 + 2160 * 6" a valid requency when before it was
      only "freq <= 56160 + 2160 * 4" that was valid.  It introduces a static
      checker warning:
      
          net/wireless/util.c:1571 ieee80211_chandef_to_operating_class()
          warn: always true condition '(freq <= 56160 + 2160 * 6) => (0-u16max <= 69120)'
      
      But really we probably shouldn't have been truncating the high bits
      away to begin with.
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      8442938c
    • L
      mac80211: fix an off-by-one issue in A-MSDU max_subframe computation · 66eb02d8
      Lorenzo Bianconi 提交于
      Initialize 'n' to 2 in order to take into account also the first
      packet in the estimation of max_subframe limit for a given A-MSDU
      since frag_tail pointer is NULL when ieee80211_amsdu_aggregate
      routine analyzes the second frame.
      
      Fixes: 6e0456b5 ("mac80211: add A-MSDU tx support")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NJohannes Berg <johannes.berg@intel.com>
      66eb02d8
    • D
      net/ipv6: Only update MTU metric if it set · 15a81b41
      David Ahern 提交于
      Jan reported a regression after an update to 4.18.5. In this case ipv6
      default route is setup by systemd-networkd based on data from an RA. The
      RA contains an MTU of 1492 which is used when the route is first inserted
      but then systemd-networkd pushes down updates to the default route
      without the mtu set.
      
      Prior to the change to fib6_info, metrics such as MTU were held in the
      dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
      any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
      the value from the metrics is used if it is set and finally falling
      back to the idev value.
      
      After the fib6_info change metrics are contained in the fib6_info struct
      and there is no equivalent to rt6i_pmtu. To maintain consistency with
      the old behavior the new code should only reset the MTU in the metrics
      if the route update has it set.
      
      Fixes: d4ead6b3 ("net/ipv6: move metrics from dst to rt6_info")
      Reported-by: NJan Janssen <medhefgo@web.de>
      Signed-off-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      15a81b41
    • H
      igmp: fix incorrect unsolicit report count after link down and up · ff06525f
      Hangbin Liu 提交于
      After link down and up, i.e. when call ip_mc_up(), we doesn't init
      im->unsolicit_count. So after igmp_timer_expire(), we will not start
      timer again and only send one unsolicit report at last.
      
      Fix it by initializing im->unsolicit_count in igmp_group_added(), so
      we can respect igmp robustness value.
      
      Fixes: 24803f38 ("igmp: do not remove igmp souce list info when set link down")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      ff06525f
    • H
      igmp: fix incorrect unsolicit report count when join group · 4fb7253e
      Hangbin Liu 提交于
      We should not start timer if im->unsolicit_count equal to 0 after decrease.
      Or we will send one more unsolicit report message. i.e. 3 instead of 2 by
      default.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4fb7253e
    • T
      bpf: Fix bpf_msg_pull_data() · 9db39f4d
      Tushar Dave 提交于
      Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
      linearizing multiple scatterlist elements. Variable 'offset' is used
      to find first starting scatterlist element
          i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"
      
      Use different variable name while linearizing multiple scatterlist
      elements so that value contained in variable 'offset' won't get
      overwritten.
      
      Fixes: 015632bb ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
      Signed-off-by: NTushar Dave <tushar.n.dave@oracle.com>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      9db39f4d
  11. 02 9月, 2018 1 次提交
    • A
      ipv6: don't get lwtstate twice in ip6_rt_copy_init() · 93bbadd6
      Alexey Kodanev 提交于
      Commit 80f1a0f4 ("net/ipv6: Put lwtstate when destroying fib6_info")
      partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
      with ip6_rt_copy_init(), and it should be done only once there.
      
      rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
      ip6_rt_copy_init(), so there is no need to get it again at the end.
      
      With this patch, lwtstate also isn't copied from RTF_REJECT routes.
      
      [1]:
      unreferenced object 0xffff880b6aaa14e0 (size 64):
        comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s)
        hex dump (first 32 bytes):
          01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00  ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<0000000018664623>] lwtunnel_build_state+0x1bc/0x420
          [<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0
          [<00000000ee2c5d1f>] ip6_route_add+0x14/0x70
          [<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0
          [<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0
          [<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0
          [<000000004c893c76>] netlink_unicast+0x417/0x5a0
          [<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30
          [<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0
          [<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950
          [<000000001e7426c8>] __sys_sendmsg+0xde/0x170
          [<00000000fe411443>] do_syscall_64+0x9f/0x4a0
          [<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<000000006d21f353>] 0xffffffffffffffff
      
      Fixes: 6edb3c96 ("net/ipv6: Defer initialization of dst to data path")
      Signed-off-by: NAlexey Kodanev <alexey.kodanev@oracle.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      93bbadd6
  12. 01 9月, 2018 1 次提交
    • F
      tcp: do not restart timewait timer on rst reception · 63cc357f
      Florian Westphal 提交于
      RFC 1337 says:
       ''Ignore RST segments in TIME-WAIT state.
         If the 2 minute MSL is enforced, this fix avoids all three hazards.''
      
      So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
      expire rather than removing it instantly when a reset is received.
      
      However, Linux will also re-start the TIME-WAIT timer.
      
      This causes connect to fail when tying to re-use ports or very long
      delays (until syn retry interval exceeds MSL).
      
      packetdrill test case:
      // Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
      `sysctl net.ipv4.tcp_rfc1337=1`
      
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      0.000 bind(3, ..., ...) = 0
      0.000 listen(3, 1) = 0
      
      0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      0.200 < . 1:1(0) ack 1 win 257
      0.200 accept(3, ..., ...) = 4
      
      // Receive first segment
      0.310 < P. 1:1001(1000) ack 1 win 46
      
      // Send one ACK
      0.310 > . 1:1(0) ack 1001
      
      // read 1000 byte
      0.310 read(4, ..., 1000) = 1000
      
      // Application writes 100 bytes
      0.350 write(4, ..., 100) = 100
      0.350 > P. 1:101(100) ack 1001
      
      // ACK
      0.500 < . 1001:1001(0) ack 101 win 257
      
      // close the connection
      0.600 close(4) = 0
      0.600 > F. 101:101(0) ack 1001 win 244
      
      // Our side is in FIN_WAIT_1 & waits for ack to fin
      0.7 < . 1001:1001(0) ack 102 win 244
      
      // Our side is in FIN_WAIT_2 with no outstanding data.
      0.8 < F. 1001:1001(0) ack 102 win 244
      0.8 > . 102:102(0) ack 1002 win 244
      
      // Our side is now in TIME_WAIT state, send ack for fin.
      0.9 < F. 1002:1002(0) ack 102 win 244
      0.9 > . 102:102(0) ack 1002 win 244
      
      // Peer reopens with in-window SYN:
      1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      // Therefore, reply with ACK.
      1.000 > . 102:102(0) ack 1002 win 244
      
      // Peer sends RST for this ACK.  Normally this RST results
      // in tw socket removal, but rfc1337=1 setting prevents this.
      1.100 < R 1002:1002(0) win 244
      
      // second syn. Due to rfc1337=1 expect another pure ACK.
      31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      31.0 > . 102:102(0) ack 1002 win 244
      
      // .. and another RST from peer.
      31.1 < R 1002:1002(0) win 244
      31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`
      
      // third syn after one minute.  Time-Wait socket should have expired by now.
      63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      // so we expect a syn-ack & 3whs to proceed from here on.
      63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
      
      Without this patch, 'ss' shows restarts of tw timer and last packet is
      thus just another pure ack, more than one minute later.
      
      This restores the original code from commit 283fd6cf0be690a83
      ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
      
      For some reason the else branch was removed/lost in 1f28b683339f7
      ("Merge in TCP/UDP optimizations and [..]") and timer restart became
      unconditional.
      Reported-by: NMichal Tesar <mtesar@redhat.com>
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      63cc357f