1. 10 5月, 2019 1 次提交
  2. 05 5月, 2019 1 次提交
    • X
      sctp: avoid running the sctp state machine recursively · b563e9bb
      Xin Long 提交于
      [ Upstream commit fbd019737d71e405f86549fd738f81e2ff3dd073 ]
      
      Ying triggered a call trace when doing an asconf testing:
      
        BUG: scheduling while atomic: swapper/12/0/0x10000100
        Call Trace:
         <IRQ>  [<ffffffffa4375904>] dump_stack+0x19/0x1b
         [<ffffffffa436fcaf>] __schedule_bug+0x64/0x72
         [<ffffffffa437b93a>] __schedule+0x9ba/0xa00
         [<ffffffffa3cd5326>] __cond_resched+0x26/0x30
         [<ffffffffa437bc4a>] _cond_resched+0x3a/0x50
         [<ffffffffa3e22be8>] kmem_cache_alloc_node+0x38/0x200
         [<ffffffffa423512d>] __alloc_skb+0x5d/0x2d0
         [<ffffffffc0995320>] sctp_packet_transmit+0x610/0xa20 [sctp]
         [<ffffffffc098510e>] sctp_outq_flush+0x2ce/0xc00 [sctp]
         [<ffffffffc098646c>] sctp_outq_uncork+0x1c/0x20 [sctp]
         [<ffffffffc0977338>] sctp_cmd_interpreter.isra.22+0xc8/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc099443d>] sctp_primitive_ASCONF+0x3d/0x50 [sctp]
         [<ffffffffc0977384>] sctp_cmd_interpreter.isra.22+0x114/0x1460 [sctp]
         [<ffffffffc0976ad1>] sctp_do_sm+0xe1/0x350 [sctp]
         [<ffffffffc097b3a4>] sctp_assoc_bh_rcv+0xf4/0x1b0 [sctp]
         [<ffffffffc09840f1>] sctp_inq_push+0x51/0x70 [sctp]
         [<ffffffffc099732b>] sctp_rcv+0xa8b/0xbd0 [sctp]
      
      As it shows, the first sctp_do_sm() running under atomic context (NET_RX
      softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later,
      and this flag is supposed to be used in non-atomic context only. Besides,
      sctp_do_sm() was called recursively, which is not expected.
      
      Vlad tried to fix this recursive call in Commit c0786693 ("sctp: Fix
      oops when sending queued ASCONF chunks") by introducing a new command
      SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still
      used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will
      be called in this command again.
      
      To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF
      not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st
      sctp_do_sm() directly.
      Reported-by: NYing Xu <yinxu@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b563e9bb
  3. 04 5月, 2019 2 次提交
  4. 02 5月, 2019 6 次提交
    • Y
      net: netrom: Fix error cleanup path of nr_proto_init · e30203e4
      YueHaibing 提交于
      commit d3706566ae3d92677b932dd156157fd6c72534b1 upstream.
      
      Syzkaller report this:
      
      BUG: unable to handle kernel paging request at fffffbfff830524b
      PGD 237fe8067 P4D 237fe8067 PUD 237e64067 PMD 1c9716067 PTE 0
      Oops: 0000 [#1] SMP KASAN PTI
      CPU: 1 PID: 4465 Comm: syz-executor.0 Not tainted 5.0.0+ #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      RIP: 0010:__list_add_valid+0x21/0xe0 lib/list_debug.c:23
      Code: 8b 0c 24 e9 17 fd ff ff 90 55 48 89 fd 48 8d 7a 08 53 48 89 d3 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 0f 85 8b 00 00 00 48 8b 53 08 48 39 f2 75 35 48 89 f2
      RSP: 0018:ffff8881ea2278d0 EFLAGS: 00010282
      RAX: dffffc0000000000 RBX: ffffffffc1829250 RCX: 1ffff1103d444ef4
      RDX: 1ffffffff830524b RSI: ffffffff85659300 RDI: ffffffffc1829258
      RBP: ffffffffc1879250 R08: fffffbfff0acb269 R09: fffffbfff0acb269
      R10: ffff8881ea2278f0 R11: fffffbfff0acb268 R12: ffffffffc1829250
      R13: dffffc0000000000 R14: 0000000000000008 R15: ffffffffc187c830
      FS:  00007fe0361df700(0000) GS:ffff8881f7300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: fffffbfff830524b CR3: 00000001eb39a001 CR4: 00000000007606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      PKRU: 55555554
      Call Trace:
       __list_add include/linux/list.h:60 [inline]
       list_add include/linux/list.h:79 [inline]
       proto_register+0x444/0x8f0 net/core/sock.c:3375
       nr_proto_init+0x73/0x4b3 [netrom]
       ? 0xffffffffc1628000
       ? 0xffffffffc1628000
       do_one_initcall+0xbc/0x47d init/main.c:887
       do_init_module+0x1b5/0x547 kernel/module.c:3456
       load_module+0x6405/0x8c10 kernel/module.c:3804
       __do_sys_finit_module+0x162/0x190 kernel/module.c:3898
       do_syscall_64+0x9f/0x450 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462e99
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007fe0361dec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000462e99
      RDX: 0000000000000000 RSI: 0000000020000100 RDI: 0000000000000003
      RBP: 00007fe0361dec70 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe0361df6bc
      R13: 00000000004bcefa R14: 00000000006f6fb0 R15: 0000000000000004
      Modules linked in: netrom(+) ax25 fcrypt pcbc af_alg arizona_ldo1 v4l2_common videodev media v4l2_dv_timings hdlc ide_cd_mod snd_soc_sigmadsp_regmap snd_soc_sigmadsp intel_spi_platform intel_spi mtd spi_nor snd_usbmidi_lib usbcore lcd ti_ads7950 hi6421_regulator snd_soc_kbl_rt5663_max98927 snd_soc_hdac_hdmi snd_hda_ext_core snd_hda_core snd_soc_rt5663 snd_soc_core snd_pcm_dmaengine snd_compress snd_soc_rl6231 mac80211 rtc_rc5t583 spi_slave_time leds_pwm hid_gt683r hid industrialio_triggered_buffer kfifo_buf industrialio ir_kbd_i2c rc_core led_class_flash dwc_xlgmac snd_ymfpci gameport snd_mpu401_uart snd_rawmidi snd_ac97_codec snd_pcm ac97_bus snd_opl3_lib snd_timer snd_seq_device snd_hwdep snd soundcore iptable_security iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan
       bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun joydev mousedev ppdev tpm kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ide_pci_generic piix aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ide_core psmouse input_leds i2c_piix4 serio_raw intel_agp intel_gtt ata_generic agpgart pata_acpi parport_pc rtc_cmos parport floppy sch_fq_codel ip_tables x_tables sha1_ssse3 sha1_generic ipv6 [last unloaded: rxrpc]
      Dumping ftrace buffer:
         (ftrace buffer empty)
      CR2: fffffbfff830524b
      ---[ end trace 039ab24b305c4b19 ]---
      
      If nr_proto_init failed, it may forget to call proto_unregister,
      tiggering this issue.This patch rearrange code of nr_proto_init
      to avoid such issues.
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e30203e4
    • P
      netfilter: nf_tables: bogus EBUSY when deleting set after flush · e313d5da
      Pablo Neira Ayuso 提交于
      [ Upstream commit 273fe3f1006ea5ebc63d6729e43e8e45e32b256a ]
      
      Set deletion after flush coming in the same batch results in EBUSY. Add
      set use counter to track the number of references to this set from
      rules. We cannot rely on the list of bindings for this since such list
      is still populated from the preparation phase.
      Reported-by: NVáclav Zindulka <vaclav.zindulka@tlapnet.cz>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      e313d5da
    • P
      netfilter: nf_tables: fix set double-free in abort path · 25ddad73
      Pablo Neira Ayuso 提交于
      [ Upstream commit 40ba1d9b4d19796afc9b7ece872f5f3e8f5e2c13 ]
      
      The abort path can cause a double-free of an anonymous set.
      Added-and-to-be-aborted rule looks like this:
      
      udp dport { 137, 138 } drop
      
      The to-be-aborted transaction list looks like this:
      
      newset
      newsetelem
      newsetelem
      rule
      
      This gets walked in reverse order, so first pass disables the rule, the
      set elements, then the set.
      
      After synchronize_rcu(), we then destroy those in same order: rule, set
      element, set element, newset.
      
      Problem is that the anonymous set has already been bound to the rule, so
      the rule (lookup expression destructor) already frees the set, when then
      cause use-after-free when trying to delete the elements from this set,
      then try to free the set again when handling the newset expression.
      
      Rule releases the bound set in first place from the abort path, this
      causes the use-after-free on set element removal when undoing the new
      element transactions. To handle this, skip new element transaction if
      set is bound from the abort path.
      
      This is still causes the use-after-free on set element removal.  To
      handle this, remove transaction from the list when the set is already
      bound.
      
      Joint work with Florian Westphal.
      
      Fixes: f6ac85858976 ("netfilter: nf_tables: unbind set in rule from commit path")
      Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325Acked-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      25ddad73
    • P
      netfilter: nft_compat: use .release_ops and remove list of extension · 8906234c
      Pablo Neira Ayuso 提交于
      [ Upstream commit b8e204006340b7aaf32bd2b9806c692f6e0cb38a ]
      
      Add .release_ops, that is called in case of error at a later stage in
      the expression initialization path, ie. .select_ops() has been already
      set up operations and that needs to be undone. This allows us to unwind
      .select_ops from the error path, ie. release the dynamic operations for
      this extension.
      
      Moreover, allocate one single operation instead of recycling them, this
      comes at the cost of consuming a bit more memory per rule, but it
      simplifies the infrastructure.
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      8906234c
    • P
      netfilter: nf_tables: unbind set in rule from commit path · af26f3e2
      Pablo Neira Ayuso 提交于
      Anonymous sets that are bound to rules from the same transaction trigger
      a kernel splat from the abort path due to double set list removal and
      double free.
      
      This patch updates the logic to search for the transaction that is
      responsible for creating the set and disable the set list removal and
      release, given the rule is now responsible for this. Lookup is reverse
      since the transaction that adds the set is likely to be at the tail of
      the list.
      
      Moreover, this patch adds the unbind step to deliver the event from the
      commit path.  This should not be done from the worker thread, since we
      have no guarantees of in-order delivery to the listener.
      
      This patch removes the assumption that both activate and deactivate
      callbacks need to be provided.
      
      Fixes: cd5125d8f518 ("netfilter: nf_tables: split set destruction in deactivate and destroy phase")
      Reported-by: NMikhail Morfikov <mmorfikov@gmail.com>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      af26f3e2
    • F
      netfilter: nf_tables: split set destruction in deactivate and destroy phase · 3dbba8eb
      Florian Westphal 提交于
      [ Upstream commit cd5125d8f51882279f50506bb9c7e5e89dc9bef3 ]
      
      Splits unbind_set into destroy_set and unbinding operation.
      
      Unbinding removes set from lists (so new transaction would not
      find it anymore) but keeps memory allocated (so packet path continues
      to work).
      
      Rebind function is added to allow unrolling in case transaction
      that wants to remove set is aborted.
      
      Destroy function is added to free the memory, but this could occur
      outside of transaction in the future.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      3dbba8eb
  5. 27 4月, 2019 3 次提交
  6. 20 4月, 2019 2 次提交
    • C
      xfrm: destroy xfrm_state synchronously on net exit path · bbbe4746
      Cong Wang 提交于
      [ Upstream commit f75a2804da391571563c4b6b29e7797787332673 ]
      
      xfrm_state_put() moves struct xfrm_state to the GC list
      and schedules the GC work to clean it up. On net exit call
      path, xfrm_state_flush() is called to clean up and
      xfrm_flush_gc() is called to wait for the GC work to complete
      before exit.
      
      However, this doesn't work because one of the ->destructor(),
      ipcomp_destroy(), schedules the same GC work again inside
      the GC work. It is hard to wait for such a nested async
      callback. This is also why syzbot still reports the following
      warning:
      
       WARNING: CPU: 1 PID: 33 at net/ipv6/xfrm6_tunnel.c:351 xfrm6_tunnel_net_exit+0x2cb/0x500 net/ipv6/xfrm6_tunnel.c:351
       ...
        ops_exit_list.isra.0+0xb0/0x160 net/core/net_namespace.c:153
        cleanup_net+0x51d/0xb10 net/core/net_namespace.c:551
        process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
        worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
        kthread+0x357/0x430 kernel/kthread.c:246
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      In fact, it is perfectly fine to bypass GC and destroy xfrm_state
      synchronously on net exit call path, because it is in process context
      and doesn't need a work struct to do any blocking work.
      
      This patch introduces xfrm_state_put_sync() which simply bypasses
      GC, and lets its callers to decide whether to use this synchronous
      version. On net exit path, xfrm_state_fini() and
      xfrm6_tunnel_net_exit() use it. And, as ipcomp_destroy() itself is
      blocking, it can use xfrm_state_put_sync() directly too.
      
      Also rename xfrm_state_gc_destroy() to ___xfrm_state_destroy() to
      reflect this change.
      
      Fixes: b48c05ab ("xfrm: Fix warning in xfrm6_tunnel_net_exit.")
      Reported-and-tested-by: syzbot+e9aebef558e3ed673934@syzkaller.appspotmail.com
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      bbbe4746
    • M
      Bluetooth: Fix debugfs NULL pointer dereference · 09b6c080
      Matias Karhumaa 提交于
      [ Upstream commit 30d65e0804d58a03d1a8ea4e12c6fc07ed08218b ]
      
      Fix crash caused by NULL pointer dereference when debugfs functions
      le_max_key_read, le_max_key_size_write, le_min_key_size_read or
      le_min_key_size_write and Bluetooth adapter was powered off.
      
      Fix is to move max_key_size and min_key_size from smp_dev to hci_dev.
      At the same time they were renamed to le_max_key_size and
      le_min_key_size.
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002e8
      PGD 0 P4D 0
      Oops: 0000 [#24] SMP PTI
      CPU: 2 PID: 6255 Comm: cat Tainted: G      D    OE     4.18.9-200.fc28.x86_64 #1
      Hardware name: LENOVO 4286CTO/4286CTO, BIOS 8DET76WW (1.46 ) 06/21/2018
      RIP: 0010:le_max_key_size_read+0x45/0xb0 [bluetooth]
      Code: 00 00 00 48 83 ec 10 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 48 8b 87 c8 00 00 00 48 8d 7c 24 04 48 8b 80 48 0a 00 00 <48> 8b 80 e8 02 00 00 0f b6 48 52 e8 fb b6 b3 ed be 04 00 00 00 48
      RSP: 0018:ffffab23c3ff3df0 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00007f0b4ca2e000 RCX: ffffab23c3ff3f08
      RDX: ffffffffc0ddb033 RSI: 0000000000000004 RDI: ffffab23c3ff3df4
      RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffab23c3ff3ed8 R11: 0000000000000000 R12: ffffab23c3ff3f08
      R13: 00007f0b4ca2e000 R14: 0000000000020000 R15: ffffab23c3ff3f08
      FS:  00007f0b4ca0f540(0000) GS:ffff91bd5e280000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000002e8 CR3: 00000000629fa006 CR4: 00000000000606e0
      Call Trace:
       full_proxy_read+0x53/0x80
       __vfs_read+0x36/0x180
       vfs_read+0x8a/0x140
       ksys_read+0x4f/0xb0
       do_syscall_64+0x5b/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: NMatias Karhumaa <matias.karhumaa@gmail.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      09b6c080
  7. 17 4月, 2019 2 次提交
  8. 06 4月, 2019 1 次提交
    • F
      netfilter: physdev: relax br_netfilter dependency · 2e6bcc32
      Florian Westphal 提交于
      [ Upstream commit 8e2f311a68494a6677c1724bdcb10bada21af37c ]
      
      Following command:
        iptables -D FORWARD -m physdev ...
      causes connectivity loss in some setups.
      
      Reason is that iptables userspace will probe kernel for the module revision
      of the physdev patch, and physdev has an artificial dependency on
      br_netfilter (xt_physdev use makes no sense unless a br_netfilter module
      is loaded).
      
      This causes the "phydev" module to be loaded, which in turn enables the
      "call-iptables" infrastructure.
      
      bridged packets might then get dropped by the iptables ruleset.
      
      The better fix would be to change the "call-iptables" defaults to 0 and
      enforce explicit setting to 1, but that breaks backwards compatibility.
      
      This does the next best thing: add a request_module call to checkentry.
      This was a stray '-D ... -m physdev' won't activate br_netfilter
      anymore.
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      2e6bcc32
  9. 03 4月, 2019 2 次提交
    • X
      sctp: get sctphdr by offset in sctp_compute_cksum · 97265479
      Xin Long 提交于
      [ Upstream commit 273160ffc6b993c7c91627f5a84799c66dfe4dee ]
      
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be right value for sctphdr. It would cause to fail to check the
      checksum for sctp packets.
      
      So fix it by using offset, which is always right in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: NLi Shuang <shuali@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97265479
    • M
      packets: Always register packet sk in the same order · 69cea7cf
      Maxime Chevallier 提交于
      [ Upstream commit a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ]
      
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: NMaxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69cea7cf
  10. 24 3月, 2019 1 次提交
    • A
      phonet: fix building with clang · ee01ac61
      Arnd Bergmann 提交于
      [ Upstream commit 6321aa197547da397753757bd84c6ce64b3e3d89 ]
      
      clang warns about overflowing the data[] member in the struct pnpipehdr:
      
      net/phonet/pep.c:295:8: warning: array index 4 is past the end of the array (which contains 1 element) [-Warray-bounds]
                              if (hdr->data[4] == PEP_IND_READY)
                                  ^         ~
      include/net/phonet/pep.h:66:3: note: array 'data' declared here
                      u8              data[1];
      
      Using a flexible array member at the end of the struct avoids the
      warning, but since we cannot have a flexible array member inside
      of the union, each index now has to be moved back by one, which
      makes it a little uglier.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NRémi Denis-Courmont <remi@remlab.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      ee01ac61
  11. 10 3月, 2019 5 次提交
    • M
      Bluetooth: Fix locking in bt_accept_enqueue() for BH context · 8d368fc5
      Matthias Kaehlcke 提交于
      commit c4f5627f7eeecde1bb6b646d8c0907b96dc2b2a6 upstream.
      
      With commit e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket
      atomically") lock_sock[_nested]() is used to acquire the socket lock
      before manipulating the socket. lock_sock[_nested]() may block, which
      is problematic since bt_accept_enqueue() can be called in bottom half
      context (e.g. from rfcomm_connect_ind()):
      
      [<ffffff80080d81ec>] __might_sleep+0x4c/0x80
      [<ffffff800876c7b0>] lock_sock_nested+0x24/0x58
      [<ffffff8000d7c27c>] bt_accept_enqueue+0x48/0xd4 [bluetooth]
      [<ffffff8000e67d8c>] rfcomm_connect_ind+0x190/0x218 [rfcomm]
      
      Add a parameter to bt_accept_enqueue() to indicate whether the
      function is called from BH context, and acquire the socket lock
      with bh_lock_sock_nested() if that's the case.
      
      Also adapt all callers of bt_accept_enqueue() to pass the new
      parameter:
      
      - l2cap_sock_new_connection_cb()
        - uses lock_sock() to lock the parent socket => process context
      
      - rfcomm_connect_ind()
        - acquires the parent socket lock with bh_lock_sock() => BH
          context
      
      - __sco_chan_add()
        - called from sco_chan_add(), which is called from sco_connect().
          parent is NULL, hence bt_accept_enqueue() isn't called in this
          code path and we can ignore it
        - also called from sco_conn_ready(). uses bh_lock_sock() to acquire
          the parent lock => BH context
      
      Fixes: e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket atomically")
      Signed-off-by: NMatthias Kaehlcke <mka@chromium.org>
      Reviewed-by: NDouglas Anderson <dianders@chromium.org>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d368fc5
    • N
      net: avoid use IPCB in cipso_v4_error · 125bc1e6
      Nazarov Sergey 提交于
      [ Upstream commit 3da1ed7ac398f34fff1694017a07054d69c5f5c5 ]
      
      Extract IP options in cipso_v4_error and use __icmp_send.
      Signed-off-by: NSergey Nazarov <s-nazarov@yandex.ru>
      Acked-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      125bc1e6
    • N
      net: Add __icmp_send helper. · f2397468
      Nazarov Sergey 提交于
      [ Upstream commit 9ef6b42ad6fd7929dd1b6092cb02014e382c6a91 ]
      
      Add __icmp_send function having ip_options struct parameter
      Signed-off-by: NSergey Nazarov <s-nazarov@yandex.ru>
      Reviewed-by: NPaul Moore <paul@paul-moore.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2397468
    • H
      ipv4: Add ICMPv6 support when parse route ipproto · 99ed9458
      Hangbin Liu 提交于
      [ Upstream commit 5e1a99eae84999a2536f50a0beaf5d5262337f40 ]
      
      For ip rules, we need to use 'ipproto ipv6-icmp' to match ICMPv6 headers.
      But for ip -6 route, currently we only support tcp, udp and icmp.
      
      Add ICMPv6 support so we can match ipv6-icmp rules for route lookup.
      
      v2: As David Ahern and Sabrina Dubroca suggested, Add an argument to
      rtm_getroute_parse_ip_proto() to handle ICMP/ICMPv6 with different family.
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Fixes: eacb9384 ("ipv6: support sport, dport and ip_proto in RTM_GETROUTE")
      Signed-off-by: NHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99ed9458
    • E
      net: sched: put back q.qlen into a single location · 3043bfe0
      Eric Dumazet 提交于
      [ Upstream commit 46b1c18f9deb326a7e18348e668e4c7ab7c7458b ]
      
      In the series fc8b81a5 ("Merge branch 'lockless-qdisc-series'")
      John made the assumption that the data path had no need to read
      the qdisc qlen (number of packets in the qdisc).
      
      It is true when pfifo_fast is used as the root qdisc, or as direct MQ/MQPRIO
      children.
      
      But pfifo_fast can be used as leaf in class full qdiscs, and existing
      logic needs to access the child qlen in an efficient way.
      
      HTB breaks badly, since it uses cl->leaf.q->q.qlen in :
        htb_activate() -> WARN_ON()
        htb_dequeue_tree() to decide if a class can be htb_deactivated
        when it has no more packets.
      
      HFSC, DRR, CBQ, QFQ have similar issues, and some calls to
      qdisc_tree_reduce_backlog() also read q.qlen directly.
      
      Using qdisc_qlen_sum() (which iterates over all possible cpus)
      in the data path is a non starter.
      
      It seems we have to put back qlen in a central location,
      at least for stable kernels.
      
      For all qdisc but pfifo_fast, qlen is guarded by the qdisc lock,
      so the existing q.qlen{++|--} are correct.
      
      For 'lockless' qdisc (pfifo_fast so far), we need to use atomic_{inc|dec}()
      because the spinlock might be not held (for example from
      pfifo_fast_enqueue() and pfifo_fast_dequeue())
      
      This patch adds atomic_qlen (in the same location than qlen)
      and renames the following helpers, since we want to express
      they can be used without qdisc lock, and that qlen is no longer percpu.
      
      - qdisc_qstats_cpu_qlen_dec -> qdisc_qstats_atomic_qlen_dec()
      - qdisc_qstats_cpu_qlen_inc -> qdisc_qstats_atomic_qlen_inc()
      
      Later (net-next) we might revert this patch by tracking all these
      qlen uses and replace them by a more efficient method (not having
      to access a precise qlen, but an empty/non_empty status that might
      be less expensive to maintain/track).
      
      Another possibility is to have a legacy pfifo_fast version that would
      be used when used a a child qdisc, since the parent qdisc needs
      a spinlock anyway. But then, future lockless qdiscs would also
      have the same problem.
      
      Fixes: 7e66016f ("net: sched: helpers to sum qlen and qlen for per cpu logic")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3043bfe0
  12. 27 2月, 2019 1 次提交
    • W
      netfilter: nft_flow_offload: fix interaction with vrf slave device · 6d26c375
      wenxu 提交于
      [ Upstream commit 10f4e765879e514e1ce7f52ed26603047af196e2 ]
      
      In the forward chain, the iif is changed from slave device to master vrf
      device. Thus, flow offload does not find a match on the lower slave
      device.
      
      This patch uses the cached route, ie. dst->dev, to update the iif and
      oif fields in the flow entry.
      
      After this patch, the following example works fine:
      
       # ip addr add dev eth0 1.1.1.1/24
       # ip addr add dev eth1 10.0.0.1/24
       # ip link add user1 type vrf table 1
       # ip l set user1 up
       # ip l set dev eth0 master user1
       # ip l set dev eth1 master user1
      
       # nft add table firewall
       # nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
       # nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
       # nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
       # nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1
      Signed-off-by: Nwenxu <wenxu@ucloud.cn>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      6d26c375
  13. 23 2月, 2019 2 次提交
    • E
      ax25: fix possible use-after-free · eed11c69
      Eric Dumazet 提交于
      commit 63530aba7826a0f8e129874df9c4d264f9db3f9e upstream.
      
      syzbot found that ax25 routes where not properly protected
      against concurrent use [1].
      
      In this particular report the bug happened while
      copying ax25->digipeat.
      
      Fix this problem by making sure we call ax25_get_route()
      while ax25_route_lock is held, so that no modification
      could happen while using the route.
      
      The current two ax25_get_route() callers do not sleep,
      so this change should be fine.
      
      Once we do that, ax25_get_route() no longer needs to
      grab a reference on the found route.
      
      [1]
      ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
      BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline]
      BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113
      Read of size 66 at addr ffff888066641a80 by task syz-executor2/531
      
      ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
      CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       check_memory_region_inline mm/kasan/generic.c:185 [inline]
       check_memory_region+0x123/0x190 mm/kasan/generic.c:191
       memcpy+0x24/0x50 mm/kasan/common.c:130
       memcpy include/linux/string.h:352 [inline]
       kmemdup+0x42/0x60 mm/util.c:113
       kmemdup include/linux/string.h:425 [inline]
       ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424
       ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224
       __sys_connect+0x357/0x490 net/socket.c:1664
       __do_sys_connect net/socket.c:1675 [inline]
       __se_sys_connect net/socket.c:1672 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1672
       do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458099
      Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
      RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4
      R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff
      
      Allocated by task 526:
       save_stack+0x45/0xd0 mm/kasan/common.c:73
       set_track mm/kasan/common.c:85 [inline]
       __kasan_kmalloc mm/kasan/common.c:496 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504
      ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
       kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609
       kmalloc include/linux/slab.h:545 [inline]
       ax25_rt_add net/ax25/ax25_route.c:95 [inline]
       ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233
       ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
       sock_do_ioctl+0xe2/0x400 net/socket.c:950
       sock_ioctl+0x32f/0x6c0 net/socket.c:1074
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
       ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl fs/ioctl.c:718 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
       do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
      Freed by task 550:
       save_stack+0x45/0xd0 mm/kasan/common.c:73
       set_track mm/kasan/common.c:85 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
       __cache_free mm/slab.c:3487 [inline]
       kfree+0xcf/0x230 mm/slab.c:3806
       ax25_rt_add net/ax25/ax25_route.c:92 [inline]
       ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233
       ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
       sock_do_ioctl+0xe2/0x400 net/socket.c:950
       sock_ioctl+0x32f/0x6c0 net/socket.c:1074
       vfs_ioctl fs/ioctl.c:46 [inline]
       file_ioctl fs/ioctl.c:509 [inline]
       do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
       ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
       __do_sys_ioctl fs/ioctl.c:720 [inline]
       __se_sys_ioctl fs/ioctl.c:718 [inline]
       __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
       do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff888066641a80
       which belongs to the cache kmalloc-96 of size 96
      The buggy address is located 0 bytes inside of
       96-byte region [ffff888066641a80, ffff888066641ae0)
      The buggy address belongs to the page:
      page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0
      flags: 0x1fffc0000000200(slab)
      ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
      raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0
      raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc
      >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                         ^
       ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
       ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eed11c69
    • L
      net: ipv4: use a dedicated counter for icmp_v4 redirect packets · 1764111c
      Lorenzo Bianconi 提交于
      [ Upstream commit c09551c6ff7fe16a79a42133bcecba5fc2fc3291 ]
      
      According to the algorithm described in the comment block at the
      beginning of ip_rt_send_redirect, the host should try to send
      'ip_rt_redirect_number' ICMP redirect packets with an exponential
      backoff and then stop sending them at all assuming that the destination
      ignores redirects.
      If the device has previously sent some ICMP error packets that are
      rate-limited (e.g TTL expired) and continues to receive traffic,
      the redirect packets will never be transmitted. This happens since
      peer->rate_tokens will be typically greater than 'ip_rt_redirect_number'
      and so it will never be reset even if the redirect silence timeout
      (ip_rt_redirect_silence) has elapsed without receiving any packet
      requiring redirects.
      
      Fix it by using a dedicated counter for the number of ICMP redirect
      packets that has been sent by the host
      
      I have not been able to identify a given commit that introduced the
      issue since ip_rt_send_redirect implements the same rate-limiting
      algorithm from commit 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      1764111c
  14. 07 2月, 2019 1 次提交
    • D
      ipvlan, l3mdev: fix broken l3s mode wrt local routes · fcc9c69a
      Daniel Borkmann 提交于
      [ Upstream commit d5256083f62e2720f75bb3c5a928a0afe47d6bc3 ]
      
      While implementing ipvlan l3 and l3s mode for kubernetes CNI plugin,
      I ran into the issue that while l3 mode is working fine, l3s mode
      does not have any connectivity to kube-apiserver and hence all pods
      end up in Error state as well. The ipvlan master device sits on
      top of a bond device and hostns traffic to kube-apiserver (also running
      in hostns) is DNATed from 10.152.183.1:443 to 139.178.29.207:37573
      where the latter is the address of the bond0. While in l3 mode, a
      curl to https://10.152.183.1:443 or to https://139.178.29.207:37573
      works fine from hostns, neither of them do in case of l3s. In the
      latter only a curl to https://127.0.0.1:37573 appeared to work where
      for local addresses of bond0 I saw kernel suddenly starting to emit
      ARP requests to query HW address of bond0 which remained unanswered
      and neighbor entries in INCOMPLETE state. These ARP requests only
      happen while in l3s.
      
      Debugging this further, I found the issue is that l3s mode is piggy-
      backing on l3 master device, and in this case local routes are using
      l3mdev_master_dev_rcu(dev) instead of net->loopback_dev as per commit
      f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev
      if relevant") and 5f02ce24 ("net: l3mdev: Allow the l3mdev to be
      a loopback"). I found that reverting them back into using the
      net->loopback_dev fixed ipvlan l3s connectivity and got everything
      working for the CNI.
      
      Now judging from 4fbae7d8 ("ipvlan: Introduce l3s mode") and the
      l3mdev paper in [0] the only sole reason why ipvlan l3s is relying
      on l3 master device is to get the l3mdev_ip_rcv() receive hook for
      setting the dst entry of the input route without adding its own
      ipvlan specific hacks into the receive path, however, any l3 domain
      semantics beyond just that are breaking l3s operation. Note that
      ipvlan also has the ability to dynamically switch its internal
      operation from l3 to l3s for all ports via ipvlan_set_port_mode()
      at runtime. In any case, l3 vs l3s soley distinguishes itself by
      'de-confusing' netfilter through switching skb->dev to ipvlan slave
      device late in NF_INET_LOCAL_IN before handing the skb to L4.
      
      Minimal fix taken here is to add a IFF_L3MDEV_RX_HANDLER flag which,
      if set from ipvlan setup, gets us only the wanted l3mdev_l3_rcv() hook
      without any additional l3mdev semantics on top. This should also have
      minimal impact since dev->priv_flags is already hot in cache. With
      this set, l3s mode is working fine and I also get things like
      masquerading pod traffic on the ipvlan master properly working.
      
        [0] https://netdevconf.org/1.2/papers/ahern-what-is-l3mdev-paper.pdf
      
      Fixes: f5a0aab8 ("net: ipv4: dst for local input routes should use l3mdev if relevant")
      Fixes: 5f02ce24 ("net: l3mdev: Allow the l3mdev to be a loopback")
      Fixes: 4fbae7d8 ("ipvlan: Introduce l3s mode")
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Martynas Pumputis <m@lambda.lt>
      Acked-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcc9c69a
  15. 31 1月, 2019 1 次提交
    • I
      net: ipv4: Fix memory leak in network namespace dismantle · adbf7e58
      Ido Schimmel 提交于
      [ Upstream commit f97f4dd8b3bb9d0993d2491e0f22024c68109184 ]
      
      IPv4 routing tables are flushed in two cases:
      
      1. In response to events in the netdev and inetaddr notification chains
      2. When a network namespace is being dismantled
      
      In both cases only routes associated with a dead nexthop group are
      flushed. However, a nexthop group will only be marked as dead in case it
      is populated with actual nexthops using a nexthop device. This is not
      the case when the route in question is an error route (e.g.,
      'blackhole', 'unreachable').
      
      Therefore, when a network namespace is being dismantled such routes are
      not flushed and leaked [1].
      
      To reproduce:
      # ip netns add blue
      # ip -n blue route add unreachable 192.0.2.0/24
      # ip netns del blue
      
      Fix this by not skipping error routes that are not marked with
      RTNH_F_DEAD when flushing the routing tables.
      
      To prevent the flushing of such routes in case #1, add a parameter to
      fib_table_flush() that indicates if the table is flushed as part of
      namespace dismantle or not.
      
      Note that this problem does not exist in IPv6 since error routes are
      associated with the loopback device.
      
      [1]
      unreferenced object 0xffff888066650338 (size 56):
        comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 b0 1c 62 61 80 88 ff ff  ..........ba....
          e8 8b a1 64 80 88 ff ff 00 07 00 08 fe 00 00 00  ...d............
        backtrace:
          [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
          [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
          [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
          [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
          [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
          [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
          [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
          [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
          [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
          [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<000000003a8b605b>] 0xffffffffffffffff
      unreferenced object 0xffff888061621c88 (size 48):
        comm "ip", pid 1206, jiffies 4294786063 (age 26.235s)
        hex dump (first 32 bytes):
          6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
          6b 6b 6b 6b 6b 6b 6b 6b d8 8e 26 5f 80 88 ff ff  kkkkkkkk..&_....
        backtrace:
          [<00000000733609e3>] fib_table_insert+0x978/0x1500
          [<00000000856ed27d>] inet_rtm_newroute+0x129/0x220
          [<00000000fcdfc00a>] rtnetlink_rcv_msg+0x397/0xa20
          [<00000000cb85801a>] netlink_rcv_skb+0x132/0x380
          [<00000000ebc991d2>] netlink_unicast+0x4c0/0x690
          [<0000000014f62875>] netlink_sendmsg+0x929/0xe10
          [<00000000bac9d967>] sock_sendmsg+0xc8/0x110
          [<00000000223e6485>] ___sys_sendmsg+0x77a/0x8f0
          [<000000002e94f880>] __sys_sendmsg+0xf7/0x250
          [<00000000ccb1fa72>] do_syscall_64+0x14d/0x610
          [<00000000ffbe3dae>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<000000003a8b605b>] 0xffffffffffffffff
      
      Fixes: 8cced9ef ("[NETNS]: Enable routing configuration in non-initial namespace.")
      Signed-off-by: NIdo Schimmel <idosch@mellanox.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      adbf7e58
  16. 23 1月, 2019 2 次提交
  17. 10 1月, 2019 2 次提交
    • D
      sock: Make sock->sk_stamp thread-safe · 60f05ddd
      Deepa Dinamani 提交于
      [ Upstream commit 3a0ed3e9619738067214871e9cb826fa23b2ddb9 ]
      
      Al Viro mentioned (Message-ID
      <20170626041334.GZ10672@ZenIV.linux.org.uk>)
      that there is probably a race condition
      lurking in accesses of sk_stamp on 32-bit machines.
      
      sock->sk_stamp is of type ktime_t which is always an s64.
      On a 32 bit architecture, we might run into situations of
      unsafe access as the access to the field becomes non atomic.
      
      Use seqlocks for synchronization.
      This allows us to avoid using spinlocks for readers as
      readers do not need mutual exclusion.
      
      Another approach to solve this is to require sk_lock for all
      modifications of the timestamps. The current approach allows
      for timestamps to have their own lock: sk_stamp_lock.
      This allows for the patch to not compete with already
      existing critical sections, and side effects are limited
      to the paths in the patch.
      
      The addition of the new field maintains the data locality
      optimizations from
      commit 9115e8cd ("net: reorganize struct sock for better data
      locality")
      
      Note that all the instances of the sk_stamp accesses
      are either through the ioctl or the syscall recvmsg.
      Signed-off-by: NDeepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      60f05ddd
    • W
      ip: validate header length on virtual device xmit · cde81154
      Willem de Bruijn 提交于
      [ Upstream commit cb9f1b783850b14cbd7f87d061d784a666dfba1f ]
      
      KMSAN detected read beyond end of buffer in vti and sit devices when
      passing truncated packets with PF_PACKET. The issue affects additional
      ip tunnel devices.
      
      Extend commit 76c0ddd8 ("ip6_tunnel: be careful when accessing the
      inner header") and commit ccfec9e5 ("ip_tunnel: be careful when
      accessing the inner header").
      
      Move the check to a separate helper and call at the start of each
      ndo_start_xmit function in net/ipv4 and net/ipv6.
      
      Minor changes:
      - convert dev_kfree_skb to kfree_skb on error path,
        as dev_kfree_skb calls consume_skb which is not for error paths.
      - use pskb_network_may_pull even though that is pedantic here,
        as the same as pskb_may_pull for devices without llheaders.
      - do not cache ipv6 hdrs if used only once
        (unsafe across pskb_may_pull, was more relevant to earlier patch)
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cde81154
  18. 29 12月, 2018 1 次提交
  19. 17 12月, 2018 3 次提交
  20. 01 12月, 2018 1 次提交