1. 09 Mar, 2019 11 commits
  2. 08 Mar, 2019 12 commits
    • E
      net/hsr: fix possible crash in add_timer() · 1e027960
      Eric Dumazet committed
      syzbot found another add_timer() issue, this time in net/hsr [1]
      
      Let's use mod_timer() which is safe.
      
      [1]
      kernel BUG at kernel/time/timer.c:1136!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 15909 Comm: syz-executor.3 Not tainted 5.0.0+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      kobject: 'loop2' (00000000f5629718): kobject_uevent_env
      RIP: 0010:add_timer kernel/time/timer.c:1136 [inline]
      RIP: 0010:add_timer+0x654/0xbe0 kernel/time/timer.c:1134
      Code: 0f 94 c5 31 ff 44 89 ee e8 09 61 0f 00 45 84 ed 0f 84 77 fd ff ff e8 bb 5f 0f 00 e8 07 10 a0 ff e9 68 fd ff ff e8 ac 5f 0f 00 <0f> 0b e8 a5 5f 0f 00 0f 0b e8 9e 5f 0f 00 4c 89 b5 58 ff ff ff e9
      RSP: 0018:ffff8880656eeca0 EFLAGS: 00010246
      kobject: 'loop2' (00000000f5629718): fill_kobj_path: path = '/devices/virtual/block/loop2'
      RAX: 0000000000040000 RBX: 1ffff1100caddd9a RCX: ffffc9000c436000
      RDX: 0000000000040000 RSI: ffffffff816056c4 RDI: ffff88806a2f6cc8
      RBP: ffff8880656eed58 R08: ffff888067f4a300 R09: ffff888067f4abc8
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff88806a2f6cc0
      R13: dffffc0000000000 R14: 0000000000000001 R15: ffff8880656eed30
      FS:  00007fc2019bf700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000738000 CR3: 0000000067e8e000 CR4: 00000000001406f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       hsr_check_announce net/hsr/hsr_device.c:99 [inline]
       hsr_check_carrier_and_operstate+0x567/0x6f0 net/hsr/hsr_device.c:120
       hsr_netdev_notify+0x297/0xa00 net/hsr/hsr_main.c:51
       notifier_call_chain+0xc7/0x240 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2e/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1739
       call_netdevice_notifiers_extack net/core/dev.c:1751 [inline]
       call_netdevice_notifiers net/core/dev.c:1765 [inline]
       dev_open net/core/dev.c:1436 [inline]
       dev_open+0x143/0x160 net/core/dev.c:1424
       team_port_add drivers/net/team/team.c:1203 [inline]
       team_add_slave+0xa07/0x15d0 drivers/net/team/team.c:1933
       do_set_master net/core/rtnetlink.c:2358 [inline]
       do_set_master+0x1d4/0x230 net/core/rtnetlink.c:2332
       do_setlink+0x966/0x3510 net/core/rtnetlink.c:2493
       rtnl_setlink+0x271/0x3b0 net/core/rtnetlink.c:2747
       rtnetlink_rcv_msg+0x465/0xb00 net/core/rtnetlink.c:5192
       netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2485
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5210
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1925
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xdd/0x130 net/socket.c:632
       sock_write_iter+0x27c/0x3e0 net/socket.c:923
       call_write_iter include/linux/fs.h:1869 [inline]
       do_iter_readv_writev+0x5e0/0x8e0 fs/read_write.c:680
       do_iter_write fs/read_write.c:956 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:937
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1001
       do_writev+0xf6/0x290 fs/read_write.c:1036
       __do_sys_writev fs/read_write.c:1109 [inline]
       __se_sys_writev fs/read_write.c:1106 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1106
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457f29
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fc2019bec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457f29
      RDX: 0000000000000001 RSI: 00000000200000c0 RDI: 0000000000000003
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc2019bf6d4
      R13: 00000000004c4a60 R14: 00000000004dd218 R15: 00000000ffffffff
      
      Fixes: f421436a ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Arvid Brodin <arvid.brodin@alten.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      nfp: fix simple vNIC mailbox length · eaab2d2d
      Dirk van der Merwe committed
      The simple vNIC mailbox length should be 12 decimal and not 0x12.
      Using a decimal also makes it clear this is a length value and not
      another field within the simple mailbox defines.
      
      Found by code inspection, there are no known firmware configurations
      where this would cause issues.
      
      Fixes: 527d7d1b ("nfp: read mailbox address from TLV caps")
      Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
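The difference between the two literals can be checked with a minimal sketch. The macro and function names below are hypothetical and are not the driver's real defines; only the values 12 and 0x12 come from the commit message.

```c
/* Hypothetical names for illustration; not the nfp driver's real defines. */
#define DEMO_MBOX_LEN_HEX_TYPO 0x12 /* hexadecimal literal: 18 in decimal */
#define DEMO_MBOX_LEN_DECIMAL  12   /* the intended length */

/* Returns by how much the hex typo over-reports the length. */
int demo_mbox_len_excess(void)
{
    return DEMO_MBOX_LEN_HEX_TYPO - DEMO_MBOX_LEN_DECIMAL;
}
```

Writing the length in decimal also keeps it visually distinct from neighbouring defines that are written in hex.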
    • N
      net: atm: Add another IS_ENABLED(CONFIG_COMPAT) in atm_dev_ioctl · 0805a4b8
      Nathan Chancellor committed
      I removed compat's universal assignment to 0, which allows this if
      statement to fall through when compat is passed with a value other
      than 0.
      
      Fixes: f9d19a74 ("net: atm: Use IS_ENABLED in atm_dev_ioctl")
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • N
      net: stmmac: Avoid sometimes uninitialized Clang warnings · df103170
      Nathan Chancellor committed
      When building with -Wsometimes-uninitialized, Clang warns:
      
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:495:3: warning: variable 'ns' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:495:3: warning: variable 'ns' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:532:3: warning: variable 'ns' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:532:3: warning: variable 'ns' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:741:3: warning: variable 'sec_inc' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
      drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:741:3: warning: variable 'sec_inc' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
      
      Clang is concerned with the use of stmmac_do_void_callback (which
      stmmac_get_timestamp and stmmac_config_sub_second_increment wrap),
      as it may fail to initialize these values if the if condition was ever
      false (meaning the callbacks don't exist). It's not wrong because the
      callbacks (get_timestamp and config_sub_second_increment respectively)
      are the ones that initialize the variables. While it's unlikely that the
      callbacks are ever going to disappear and make that condition false, we
      can easily avoid this warning by zero-initializing the variables.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/384
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
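The pattern behind this warning can be sketched in plain C. This is an illustration under assumptions: the struct and function names are hypothetical, and the guarded call only mimics the shape of a wrapper that invokes a callback when it exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table: the callback may be absent (NULL). */
struct demo_ts_ops {
    void (*get_timestamp)(uint64_t *ns);
};

/* Hypothetical callback that fills in a fixed value. */
void demo_get_timestamp(uint64_t *ns)
{
    *ns = 1234;
}

/* Calls the callback only if it exists, like a guarded wrapper would. */
uint64_t demo_read_timestamp(const struct demo_ts_ops *ops)
{
    uint64_t ns = 0; /* zero-initialized, so it is defined on every path */

    if (ops && ops->get_timestamp)
        ops->get_timestamp(&ns);
    return ns;
}
```

Without the initializer, the path where the callback is missing would return an indeterminate value, which is the case the compiler reports.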
    • N
      net: atm: Use IS_ENABLED in atm_dev_ioctl · f9d19a74
      Nathan Chancellor committed
      When building with -Wsometimes-uninitialized, Clang warns:
      
      net/atm/resources.c:256:6: warning: variable 'number' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
      net/atm/resources.c:212:7: warning: variable 'iobuf_len' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
      
      Clang won't realize that compat is 0 when CONFIG_COMPAT is not set until
      the constant folding stage, which happens after this semantic analysis.
      Use IS_ENABLED instead so that the zero is present at the semantic
      analysis stage, which eliminates this warning.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/386
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      ethtool: reduce stack usage with clang · 3499e87e
      Arnd Bergmann committed
      clang inlines the dev_ethtool() more aggressively than gcc does, leading
      to a larger amount of used stack space:
      
      net/core/ethtool.c:2536:24: error: stack frame size of 1216 bytes in function 'dev_ethtool' [-Werror,-Wframe-larger-than=]
      
      Marking the sub-functions that require the most stack space as
      noinline_for_stack gives us reasonable behavior on all compilers.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • S
      qede: Fix internal loopback failure with jumbo mtu configuration · b89869da
      Sudarsana Reddy Kalluru committed
      The driver uses the port MTU as the packet size for loopback traffic. This
      patch limits the max packet size to 1.5K to avoid data being split over
      multiple buffer descriptors (BDs) in cases where MTU > PAGE_SIZE.
      Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
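The clamp described above can be sketched as follows. The names are hypothetical, and reading "1.5K" as 1500 bytes is an assumption made for this example; the actual constant used by the driver is not shown in the commit message.

```c
/* Assumption: "1.5K" is taken to mean 1500 bytes for this sketch. */
#define DEMO_LOOPBACK_MAX_PKT_SIZE 1500u

/* Picks the loopback test packet size: the MTU, capped at the maximum. */
unsigned int demo_loopback_pkt_size(unsigned int mtu)
{
    return mtu < DEMO_LOOPBACK_MAX_PKT_SIZE ? mtu : DEMO_LOOPBACK_MAX_PKT_SIZE;
}
```

With a jumbo MTU such as 9000 the test packet stays at the cap, so it fits in a single buffer.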
    • A
      enic: fix build warning without CONFIG_CPUMASK_OFFSTACK · 43d28166
      Arnd Bergmann committed
      The enic driver relies on the CONFIG_CPUMASK_OFFSTACK feature to
      dynamically allocate a struct member, but this is normally intended for
      local variables.
      
      Building with clang, I get a warning for a few locations that check the
      address of the cpumask_var_t:
      
      drivers/net/ethernet/cisco/enic/enic_main.c:122:22: error: address of array 'enic->msix[i].affinity_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
      
      As far as I can tell, the code is still correct, as the truth value of
      the pointer is what we need in this configuration. To get rid of
      the warning, use cpumask_available() instead of checking the
      pointer directly.
      
      Fixes: 322cf7e3 ("enic: assign affinity hint to interrupts")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      peak_usb: fix clang build warning · a2ae6da0
      Arnd Bergmann committed
      Clang points out undefined behavior when building the pcan_usb_pro driver:
      
      drivers/net/can/usb/peak_usb/pcan_usb_pro.c:136:15: error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]
      
      Changing the function prototype to avoid argument promotion in the
      varargs call avoids the warning, and should make this well-defined.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
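A small sketch of the well-defined form, assuming a hypothetical variadic helper: the last named parameter before the ellipsis is declared as `int`, a type that default argument promotion leaves unchanged, rather than a narrow type such as `unsigned char`.

```c
#include <stdarg.h>

/*
 * Hypothetical helper: sums `count` small values passed as varargs.
 * `count` is an int, so passing it to va_start() is well-defined.
 */
int demo_sum_bytes(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int); /* narrow arguments arrive promoted to int */
    va_end(ap);
    return total;
}
```

Callers can still pass `unsigned char` values; they are promoted to `int` at the call site and read back as `int` inside the function.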
    • M
      ravb: Decrease TxFIFO depth of Q3 and Q2 to one · ae9819e3
      Masaru Nagai committed
      Hardware has the CBS (Credit Based Shaper), which affects only Q3
      and Q2. When updating the CBS settings, even if the driver does so
      after waiting for Tx DMA to finish, frame data may still remain in
      the TxFIFO.
      
      To avoid this, decrease TxFIFO depth of Q3 and Q2 to one.
      
      This patch has been exercised using netperf TCP_MAERTS, TCP_STREAM
      and UDP_STREAM tests run on an Ebisu board. No performance change was
      detected, outside of noise in the tests, both in terms of throughput and
      CPU utilisation.
      
      Fixes: c156633f ("Renesas Ethernet AVB driver proper")
      Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com>
      Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
      [simon: updated changelog]
      Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      isdn: isdnloop: fix pointer dereference bug · 8a72b81e
      Arnd Bergmann committed
      clang has spotted an ancient code bug and warns about it with:
      
      drivers/isdn/isdnloop/isdnloop.c:573:12: error: address of array 'card->rcard' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
      
      This is an array of pointers, so we should check if a specific
      pointer exists in the array before using it, not whether the
      array itself exists.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
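The distinction can be sketched with a hypothetical struct that holds an array of pointers, loosely modelled on the `card->rcard` expression in the warning. The array member itself always has a non-NULL address, so only the individual element is worth testing.

```c
#include <stddef.h>

#define DEMO_MAX_CHANNELS 2 /* hypothetical array size */

struct demo_card {
    /* Array of pointers to peer cards; the array itself is never NULL. */
    struct demo_card *rcard[DEMO_MAX_CHANNELS];
};

/* Checks whether a specific slot in the array holds a peer. */
int demo_has_peer(const struct demo_card *card, int ch)
{
    return card->rcard[ch] != NULL;
}
```

Testing `card->rcard` instead of `card->rcard[ch]` would be true for every channel, including those with no peer attached.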
    • A
      davinci_emac: always build in CONFIG_OF code · f096ca63
      Arnd Bergmann committed
      clang warns about what seems to be an unintended use of an obscure C
      language feature where a forward declaration of an array remains usable
      when the final definition is never seen:
      
      drivers/net/ethernet/ti/davinci_emac.c:1694:34: error: tentative array definition assumed to have one element [-Werror]
      static const struct of_device_id davinci_emac_of_match[];
      
      There is no harm in always enabling the device tree matching code here,
      and it makes the code behave in a more conventional way aside from
      avoiding the warning.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 07 Mar, 2019 9 commits
    • S
      tcp: do not report TCP_CM_INQ of 0 for closed connections · 6466e715
      Soheil Hassas Yeganeh committed
      Returning 0 as inq to userspace indicates there is no more data to
      read, and the application needs to wait for EPOLLIN. For a connection
      that has received FIN from the remote peer, however, the application
      must continue reading until getting EOF (return value of 0
      from tcp_recvmsg) or an error, if edge-triggered epoll (EPOLLET) is
      being used. Otherwise, the application will never receive a new
      EPOLLIN, since there is no epoll edge after the FIN.
      
      Return 1 when there is no data left on the queue but the
      connection has received FIN, so that the applications continue
      reading.
      
      Fixes: b75eba76 ("tcp: send in-queue bytes in cmsg upon read")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
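The rule stated in this commit message can be written as a tiny decision function. This is a sketch with hypothetical names, not the kernel's implementation; it only encodes "return 1 when the queue is empty but FIN has been received".

```c
#include <stdbool.h>

/*
 * Hypothetical helper: the in-queue hint reported to userspace.
 * A hint of 0 tells the application to wait for EPOLLIN, so it must not
 * be reported once FIN has arrived and the application still has to read
 * until EOF.
 */
int demo_inq_hint(int queued_bytes, bool fin_received)
{
    if (queued_bytes == 0 && fin_received)
        return 1; /* keep the application reading until it sees EOF */
    return queued_bytes;
}
```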
    • M
      net: hsr: fix memory leak in hsr_dev_finalize() · 6caabe7f
      Mao Wenan committed
      If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) fails to add the port,
      hsr_dev_finalize() returns res directly without freeing the node that
      was allocated in hsr_create_self_node() and without deleting
      node->mac_list, which is linked into hsr->self_node_db.
      
      BUG: memory leak
      unreferenced object 0xffff8881cfa0c780 (size 64):
        comm "syz-executor.0", pid 2077, jiffies 4294717969 (age 2415.377s)
        hex dump (first 32 bytes):
          e0 c7 a0 cf 81 88 ff ff 00 02 00 00 00 00 ad de  ................
          00 e6 49 cd 81 88 ff ff c0 9b 87 d0 81 88 ff ff  ..I.............
        backtrace:
          [<00000000e2ff5070>] hsr_dev_finalize+0x736/0x960 [hsr]
          [<000000003ed2e597>] hsr_newlink+0x2b2/0x3e0 [hsr]
          [<000000003fa8c6b6>] __rtnl_newlink+0xf1f/0x1600 net/core/rtnetlink.c:3182
          [<000000001247a7ad>] rtnl_newlink+0x66/0x90 net/core/rtnetlink.c:3240
          [<00000000e7d1b61d>] rtnetlink_rcv_msg+0x54e/0xb90 net/core/rtnetlink.c:5130
          [<000000005556bd3a>] netlink_rcv_skb+0x129/0x340 net/netlink/af_netlink.c:2477
          [<00000000741d5ee6>] netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
          [<00000000741d5ee6>] netlink_unicast+0x49a/0x650 net/netlink/af_netlink.c:1336
          [<000000009d56f9b7>] netlink_sendmsg+0x88b/0xdf0 net/netlink/af_netlink.c:1917
          [<0000000046b35c59>] sock_sendmsg_nosec net/socket.c:621 [inline]
          [<0000000046b35c59>] sock_sendmsg+0xc3/0x100 net/socket.c:631
          [<00000000d208adc9>] __sys_sendto+0x33e/0x560 net/socket.c:1786
          [<00000000b582837a>] __do_sys_sendto net/socket.c:1798 [inline]
          [<00000000b582837a>] __se_sys_sendto net/socket.c:1794 [inline]
          [<00000000b582837a>] __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1794
          [<00000000c866801d>] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
          [<00000000fea382d9>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000e01dacb3>] 0xffffffffffffffff
      
      Fixes: c5a75911 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
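The shape of such a leak fix is the usual unwind-on-error pattern. The sketch below is a user-space illustration with hypothetical names; a counter stands in for the leak detector, and a flag simulates whether the later setup step succeeds.

```c
#include <stdlib.h>

int demo_live_nodes; /* allocations that have not been freed yet */

struct demo_node { int id; };

struct demo_node *demo_create_self_node(void)
{
    struct demo_node *n = malloc(sizeof(*n));

    if (n)
        demo_live_nodes++;
    return n;
}

void demo_destroy_self_node(struct demo_node *n)
{
    if (n) {
        free(n);
        demo_live_nodes--;
    }
}

/* add_port_ok simulates whether the step after the allocation succeeds. */
int demo_dev_finalize(int add_port_ok, struct demo_node **out)
{
    struct demo_node *n = demo_create_self_node();
    int res;

    if (!n)
        return -1;

    res = add_port_ok ? 0 : -2;
    if (res)
        goto err_free_node; /* undo the earlier allocation before returning */

    *out = n;
    return 0;

err_free_node:
    demo_destroy_self_node(n);
    return res;
}
```

Returning `res` directly at the failure point, without the jump to the cleanup label, is the leaking variant.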
    • V
      net: sched: flower: insert new filter to idr after setting its mask · ecb3dea4
      Vlad Buslov committed
      When adding a new filter to the flower classifier, fl_change() inserts it
      into handle_idr before initializing filter extensions and assigning it a mask.
      Normally this ordering doesn't matter because all flower classifier ops
      callbacks assume rtnl lock protection. However, when the filter has an action
      whose kernel module is not loaded, the rtnl lock is released before the
      call to request_module(). During this time the filter can be accessed by
      a concurrent task before its initialization is completed, which can lead to a
      crash.
      
      Example case of NULL pointer dereference in concurrent dump:
      
      Task 1                           Task 2
      
      tc_new_tfilter()
       fl_change()
        idr_alloc_u32(fnew)
        fl_set_parms()
         tcf_exts_validate()
          tcf_action_init()
           tcf_action_init_1()
            rtnl_unlock()
            request_module()
            ...                        rtnl_lock()
                                       tc_dump_tfilter()
                                        tcf_chain_dump()
                                         fl_walk()
                                          idr_get_next_ul()
                                          tcf_node_dump()
                                           tcf_fill_node()
                                            fl_dump()
                                             mask = &f->mask->key; <- NULL ptr
            rtnl_lock()
      
      Extension initialization and mask assignment don't depend on fnew->handle
      that is allocated by idr_alloc_u32(). Move idr allocation code after action
      creation and mask assignment in fl_change() to prevent concurrent access
      to not fully initialized filter when rtnl lock is released to load action
      module.
      
      Fixes: 01683a14 ("net: sched: refactor flower walk to iterate over idr")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
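The ordering principle, "finish initialization before making the object reachable", can be sketched in single-threaded C. Everything here is hypothetical: a fixed array stands in for the idr, and the walker plays the role of the concurrent dump that dereferences the mask.

```c
#include <stddef.h>

#define DEMO_TABLE_SIZE 4 /* hypothetical table size */

struct demo_mask { int key; };
struct demo_filter { const struct demo_mask *mask; };

/* Stand-in for the idr: anything stored here is visible to walkers. */
static struct demo_filter *demo_table[DEMO_TABLE_SIZE];

/* Finishes initializing the filter, then makes it visible. */
int demo_filter_publish(struct demo_filter *f, const struct demo_mask *m)
{
    f->mask = m; /* initialize first ... */
    for (int i = 0; i < DEMO_TABLE_SIZE; i++) {
        if (!demo_table[i]) {
            demo_table[i] = f; /* ... publish last */
            return i;
        }
    }
    return -1; /* table full */
}

/* A walker that dereferences the mask of every published filter. */
int demo_dump_key_sum(void)
{
    int sum = 0;

    for (int i = 0; i < DEMO_TABLE_SIZE; i++)
        if (demo_table[i])
            sum += demo_table[i]->mask->key;
    return sum;
}
```

If the filter were stored in the table before `mask` was assigned, a walker running in between would dereference a NULL `mask`, which mirrors the crash in the trace above.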
    • V
      tcp: detecting the misuse of .sendpage for Slab objects · a10674bf
      Vasily Averin committed
      sendpage was not designed to process slab pages; in some situations
      it can trigger a BUG_ON on the receiving side.
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      appletalk: Add atalk.h header files to MAINTAINERS file · 7b837623
      Arnd Bergmann committed
      Add the path names here so that git-send-email can pick up the
      netdev@vger.kernel.org Cc line automatically for a patch that
      only touches the headers.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • A
      appletalk: Fix compile regression · 27da0d2e
      Arnd Bergmann committed
      A bugfix just broke compilation of appletalk when CONFIG_SYSCTL
      is disabled:
      
      In file included from net/appletalk/ddp.c:65:
      net/appletalk/ddp.c: In function 'atalk_init':
      include/linux/atalk.h:164:34: error: expected expression before 'do'
       #define atalk_register_sysctl()  do { } while(0)
                                        ^~
      net/appletalk/ddp.c:1934:7: note: in expansion of macro 'atalk_register_sysctl'
        rc = atalk_register_sysctl();
      
      This is easier to avoid by using conventional inline functions
      as stubs rather than macros. The header already has inline
      functions for other purposes, so I'm changing over all the
      macros for consistency.
      
      Fixes: 6377f787 ("appletalk: Fix use-after-free in atalk_proc_exit")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
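Why an inline function works where the macro stub did not can be shown with a short sketch. The names and the `DEMO_HAVE_SYSCTL` switch are hypothetical stand-ins; the point is only that a `do { } while (0)` stub is a statement with no value, while an inline function can appear on the right-hand side of an assignment.

```c
/* Hypothetical switch standing in for a Kconfig option. */
#ifdef DEMO_HAVE_SYSCTL
int demo_register_sysctl(void);
#else
/* Stub as an inline function: it yields a value, so "rc = ..." compiles. */
static inline int demo_register_sysctl(void)
{
    return 0;
}
#endif

/* Caller that uses the return value, as an init function typically does. */
int demo_subsystem_init(void)
{
    int rc = demo_register_sysctl();

    return rc;
}
```

An inline stub also gets its arguments and return type checked by the compiler in both configurations.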
    • A
      iptunnel: NULL pointer deref for ip_md_tunnel_xmit · f4b3ec4e
      Alan Maguire committed
      Naresh Kamboju noted the following oops during execution of selftest
      tools/testing/selftests/bpf/test_tunnel.sh on x86_64:
      
      [  274.120445] BUG: unable to handle kernel NULL pointer dereference
      at 0000000000000000
      [  274.128285] #PF error: [INSTR]
      [  274.131351] PGD 8000000414a0e067 P4D 8000000414a0e067 PUD 3b6334067 PMD 0
      [  274.138241] Oops: 0010 [#1] SMP PTI
      [  274.141734] CPU: 1 PID: 11464 Comm: ping Not tainted
      5.0.0-rc4-next-20190129 #1
      [  274.149046] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
      2.0b 07/27/2017
      [  274.156526] RIP: 0010:          (null)
      [  274.160280] Code: Bad RIP value.
      [  274.163509] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
      [  274.168726] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
      [  274.175851] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
      [  274.182974] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
      [  274.190098] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
      [  274.197222] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
      [  274.204346] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
      knlGS:0000000000000000
      [  274.212424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  274.218162] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
      [  274.225292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  274.232416] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  274.239541] Call Trace:
      [  274.241988]  ? tnl_update_pmtu+0x296/0x3b0
      [  274.246085]  ip_md_tunnel_xmit+0x1bc/0x520
      [  274.250176]  gre_fb_xmit+0x330/0x390
      [  274.253754]  gre_tap_xmit+0x128/0x180
      [  274.257414]  dev_hard_start_xmit+0xb7/0x300
      [  274.261598]  sch_direct_xmit+0xf6/0x290
      [  274.265430]  __qdisc_run+0x15d/0x5e0
      [  274.269007]  __dev_queue_xmit+0x2c5/0xc00
      [  274.273011]  ? dev_queue_xmit+0x10/0x20
      [  274.276842]  ? eth_header+0x2b/0xc0
      [  274.280326]  dev_queue_xmit+0x10/0x20
      [  274.283984]  ? dev_queue_xmit+0x10/0x20
      [  274.287813]  arp_xmit+0x1a/0xf0
      [  274.290952]  arp_send_dst.part.19+0x46/0x60
      [  274.295138]  arp_solicit+0x177/0x6b0
      [  274.298708]  ? mod_timer+0x18e/0x440
      [  274.302281]  neigh_probe+0x57/0x70
      [  274.305684]  __neigh_event_send+0x197/0x2d0
      [  274.309862]  neigh_resolve_output+0x18c/0x210
      [  274.314212]  ip_finish_output2+0x257/0x690
      [  274.318304]  ip_finish_output+0x219/0x340
      [  274.322314]  ? ip_finish_output+0x219/0x340
      [  274.326493]  ip_output+0x76/0x240
      [  274.329805]  ? ip_fragment.constprop.53+0x80/0x80
      [  274.334510]  ip_local_out+0x3f/0x70
      [  274.337992]  ip_send_skb+0x19/0x40
      [  274.341391]  ip_push_pending_frames+0x33/0x40
      [  274.345740]  raw_sendmsg+0xc15/0x11d0
      [  274.349403]  ? __might_fault+0x85/0x90
      [  274.353151]  ? _copy_from_user+0x6b/0xa0
      [  274.357070]  ? rw_copy_check_uvector+0x54/0x130
      [  274.361604]  inet_sendmsg+0x42/0x1c0
      [  274.365179]  ? inet_sendmsg+0x42/0x1c0
      [  274.368937]  sock_sendmsg+0x3e/0x50
      [  274.372460]  ___sys_sendmsg+0x26f/0x2d0
      [  274.376293]  ? lock_acquire+0x95/0x190
      [  274.380043]  ? __handle_mm_fault+0x7ce/0xb70
      [  274.384307]  ? lock_acquire+0x95/0x190
      [  274.388053]  ? __audit_syscall_entry+0xdd/0x130
      [  274.392586]  ? ktime_get_coarse_real_ts64+0x64/0xc0
      [  274.397461]  ? __audit_syscall_entry+0xdd/0x130
      [  274.401989]  ? trace_hardirqs_on+0x4c/0x100
      [  274.406173]  __sys_sendmsg+0x63/0xa0
      [  274.409744]  ? __sys_sendmsg+0x63/0xa0
      [  274.413488]  __x64_sys_sendmsg+0x1f/0x30
      [  274.417405]  do_syscall_64+0x55/0x190
      [  274.421064]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  274.426113] RIP: 0033:0x7ff4ae0e6e87
      [  274.429686] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00
      00 00 00 8b 05 ca d9 2b 00 48 63 d2 48 63 ff 85 c0 75 10 b8 2e 00 00
      00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 53 48 89 f3 48 83 ec 10 48 89 7c
      24 08
      [  274.448422] RSP: 002b:00007ffcd9b76db8 EFLAGS: 00000246 ORIG_RAX:
      000000000000002e
      [  274.455978] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007ff4ae0e6e87
      [  274.463104] RDX: 0000000000000000 RSI: 00000000006092e0 RDI: 0000000000000003
      [  274.470228] RBP: 0000000000000000 R08: 00007ffcd9bc40a0 R09: 00007ffcd9bc4080
      [  274.477349] R10: 000000000000060a R11: 0000000000000246 R12: 0000000000000003
      [  274.484475] R13: 0000000000000016 R14: 00007ffcd9b77fa0 R15: 00007ffcd9b78da4
      [  274.491602] Modules linked in: cls_bpf sch_ingress iptable_filter
      ip_tables algif_hash af_alg x86_pkg_temp_thermal fuse [last unloaded:
      test_bpf]
      [  274.504634] CR2: 0000000000000000
      [  274.507976] ---[ end trace 196d18386545eae1 ]---
      [  274.512588] RIP: 0010:          (null)
      [  274.516334] Code: Bad RIP value.
      [  274.519557] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
      [  274.524775] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
      [  274.531921] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
      [  274.539082] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
      [  274.546205] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
      [  274.553329] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
      [  274.560456] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
      knlGS:0000000000000000
      [  274.568541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  274.574277] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
      [  274.581403] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  274.588535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  274.595658] Kernel panic - not syncing: Fatal exception in interrupt
      [  274.602046] Kernel Offset: 0x14400000 from 0xffffffff81000000
      (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [  274.612827] ---[ end Kernel panic - not syncing: Fatal exception in
      interrupt ]---
      [  274.620387] ------------[ cut here ]------------
      
      I'm also seeing the same failure on x86_64, and it reproduces
      consistently.
      
      From poking around it looks like the skb's dst entry is being used
      to calculate the mtu in:
      
      mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
      
      ...but because that dst_entry has an "ops" value set to md_dst_ops,
      the various ops (including mtu) are not set:
      
      crash> struct sk_buff._skb_refdst ffff928f87447700 -x
            _skb_refdst = 0xffffcd6fbf5ea590
      crash> struct dst_entry.ops 0xffffcd6fbf5ea590
        ops = 0xffffffffa0193800
      crash> struct dst_ops.mtu 0xffffffffa0193800
        mtu = 0x0
      crash>
      
      I confirmed that the dst entry also has dst->input set to
      dst_md_discard, so it looks like it's an entry that's been
      initialized via __metadata_dst_init alright.
      
      I think the fix here is to use skb_valid_dst(skb) - it checks
      for DST_METADATA also, and with that fix in place, the
      problem - which was previously 100% reproducible - disappears.
      
      The below patch resolves the panic and all bpf tunnel tests pass
      without incident.
      
      Fixes: c8b34e68 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Anders Roxell <anders.roxell@linaro.org>
      Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
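The failure mode and the guard can be modelled in user space. This is a sketch under assumptions: the structs, the flag value, and the function names are hypothetical simplifications, and the validity check only imitates the idea of rejecting metadata entries before calling through their ops table.

```c
#include <stddef.h>

#define DEMO_DST_METADATA 0x1u /* hypothetical flag value */

struct demo_dst_ops {
    unsigned int (*mtu)(void); /* NULL for metadata entries */
};

struct demo_dst {
    const struct demo_dst_ops *ops;
    unsigned int flags;
};

unsigned int demo_route_mtu(void)
{
    return 1400;
}

/* An entry is usable for MTU lookup only if it exists and is not metadata. */
int demo_dst_is_valid(const struct demo_dst *dst)
{
    return dst && !(dst->flags & DEMO_DST_METADATA);
}

/* Falls back to the device MTU instead of calling a NULL ops->mtu. */
unsigned int demo_pick_mtu(const struct demo_dst *dst, unsigned int dev_mtu)
{
    return demo_dst_is_valid(dst) ? dst->ops->mtu() : dev_mtu;
}
```

Checking only that the entry is non-NULL would let the metadata case through and call a NULL function pointer, which matches the "RIP: (null)" seen in the oops.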
    • P
      ipv4/route: fail early when inet dev is missing · 22c74764
      Paolo Abeni committed
      If a non-local multicast packet reaches ip_route_input_rcu() while
      the ingress device IPv4 private data (in_dev) is NULL, we end up
      doing a NULL pointer dereference in IN_DEV_MFORWARD().
      
      Since the later call to ip_route_input_mc() is going to fail if
      !in_dev, we can fail early in such a scenario and avoid the dangerous
      code path.
      
      v1 -> v2:
       - clarified the commit message, no code changes
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Fixes: e58e4159 ("net: Enable support for VRF with ipv4 multicast")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • D
      net: hns3: Fix a logical vs bitwise typo · f4772dee
      Dan Carpenter committed
      There were a couple logical ORs accidentally mixed in with the bitwise
      ORs.
      
      Fixes: e8149933 ("net: hns3: remove hnae3_get_bit in data path")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 06 Mar, 2019 8 commits
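The effect of this kind of typo is easy to demonstrate. The flag values and function names below are hypothetical; the sketch only contrasts what `|` and `||` produce for the same operands.

```c
#define DEMO_FLAG_A 0x2u /* hypothetical flag bits */
#define DEMO_FLAG_B 0x4u

/* Bitwise OR merges the bits of both operands. */
unsigned int demo_combine_bitwise(unsigned int a, unsigned int b)
{
    return a | b;
}

/* Logical OR collapses the result to 0 or 1, losing the flag bits. */
unsigned int demo_combine_logical(unsigned int a, unsigned int b)
{
    return a || b;
}
```

Both forms compile without complaint, which is why a stray `||` among `|` operators is easy to miss in review.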
    • W
      net/sched: act_tunnel_key: Fix double free dst_cache · 4177c5d9
      wenxu committed
      dst_cache_destroy will be called in dst_release
      
      dst_release-->dst_destroy_rcu-->dst_destroy-->metadata_dst_free
      -->dst_cache_destroy
      
      It should not call dst_cache_destroy before dst_release
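
      The ownership rule can be sketched in user space as follows. This is a
      simplified model under stated assumptions, not the kernel code: all
      *_sketch names are hypothetical, and a plain integer stands in for the
      kernel's reference counting and RCU-deferred destruction.

```c
#include <stdlib.h>

/* Counts cache teardown calls so the behaviour can be observed. */
static int cache_destroy_calls;

/* Hypothetical stand-ins for the dst cache and the metadata dst. */
struct cache_sketch {
    int *slots;
};

struct md_dst_sketch {
    int refcnt;
    struct cache_sketch cache;
};

static void cache_destroy_sketch(struct cache_sketch *c)
{
    /* A second call on the same cache would free slots twice. */
    free(c->slots);
    cache_destroy_calls++;
}

static struct md_dst_sketch *md_dst_alloc_sketch(void)
{
    struct md_dst_sketch *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    d->cache.slots = calloc(4, sizeof(int));
    if (!d->cache.slots) {
        free(d);
        return NULL;
    }
    d->refcnt = 1;
    return d;
}

/* Dropping the last reference tears down the cache, then the object. */
static void md_dst_release_sketch(struct md_dst_sketch *d)
{
    if (d && --d->refcnt == 0) {
        cache_destroy_sketch(&d->cache);
        free(d);
    }
}
```

      A caller therefore only calls md_dst_release_sketch(); calling
      cache_destroy_sketch() first would free the same memory twice.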
      
      Fixes: 41411e2f ("net/sched: act_tunnel_key: Add dst_cache support")
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4177c5d9
    • E
      tipc: fix RDM/DGRAM connect() regression · 0e632089
      Erik Hugne committed
      Fix regression bug introduced in
      commit 365ad353 ("tipc: reduce risk of user starvation during link
      congestion")
      
      Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
      sockaddr.
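
      The intended rule can be sketched as a destination lookup with a
      fallback. This is an illustrative model only: the *_sketch types and
      function are hypothetical and do not mirror the TIPC socket code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a destination address. */
struct peer_sketch {
    unsigned int type;
    unsigned int instance;
};

/* Hypothetical datagram socket that may cache a peer from connect(). */
struct dgram_sock_sketch {
    bool has_peer;
    struct peer_sketch peer;
};

/*
 * Picks the destination for a send: an explicit address wins, then the
 * cached peer; only when neither exists is -EDESTADDRREQ returned.
 */
static int pick_dest_sketch(const struct dgram_sock_sketch *sk,
                            const struct peer_sketch *dest,
                            struct peer_sketch *out)
{
    if (dest) {
        *out = *dest;
        return 0;
    }
    if (sk->has_peer) {
        *out = sk->peer;
        return 0;
    }
    return -EDESTADDRREQ;
}
```

      In this model, a send without an explicit address succeeds on a socket
      that has a cached peer and fails only on one that does not.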
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e632089
    • L
      Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · d9862cfb
      Linus Torvalds committed
      Pull MIPS updates from Paul Burton:
      
       - Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
         (GINVT) instructions, allowing for more efficient TLB maintenance
         when running on a CPU such as the I6500 that supports these.
      
       - Enable huge page support for MIPS64r6.
      
       - Optimize post-DMA cache sync by removing that code entirely for
         kernel configurations in which we know it won't be needed.
      
       - The number of pages allocated for interrupt stacks is now calculated
         correctly, where before we would wastefully allocate too much memory
         in some configurations.
      
       - The ath79 platform migrates to devicetree.
      
       - The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
      
       - The ingenic/jz4740 platform gains support for appended devicetrees.
      
       - The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
         cleanups as do various pieces of core architecture code.
      
      * tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
        MIPS: lantiq: Remove separate GPHY Firmware loader
        MIPS: ingenic: Add support for appended devicetree
        MIPS: SGI-IP27: rework HUB interrupts
        MIPS: SGI-IP27: do boot CPU init later
        MIPS: SGI-IP27: do xtalk scanning later
        MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
        MIPS: SGI-IP27: clean up bridge access and header files
        MIPS: SGI-IP27: get rid of volatile and hubreg_t
        MIPS: irq: Allocate accurate order pages for irq stack
        MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
        MIPS: eBPF: Remove REG_32BIT_ZERO_EX
        MIPS: eBPF: Always return sign extended 32b values
        MIPS: CM: Fix indentation
        MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
        MIPS: OCTEON: program rx/tx-delay always from DT
        MIPS: OCTEON: delete board-specific link status
        MIPS: OCTEON: don't lie about interface type of CN3005 board
        MIPS: OCTEON: warn if deprecated link status is being used
        MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
        MIPS: Delete unused flush_cache_sigtramp()
        ...
      d9862cfb
    • L
      Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 8feed3ef
      Linus Torvalds committed
      Pull parisc updates from Helge Deller:
       "The most important changes in this patch set are:
      
         - DMA-related cleanups for parisc with the aim to move anything not
           required by drivers out of <asm/dma-mapping.h>, by Christoph
           Hellwig
      
         - Switch to memblock_alloc(), by Mike Rapoport
      
         - Makefile cleanups by Masahiro Yamada
      
         - Switch to bust_spinlocks(), by Sergey Senozhatsky
      
         - Improved initial SMP affinity selection for IRQs
      
         - Added IPI- and rescheduling interrupts in /proc/interrupts output"
      
      * 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
        parisc: use memblock_alloc() instead of custom get_memblock()
        parisc: Add constants for various PDC firmware calls
        parisc: Add constant for PDC_PAT_COMPLEX firmware call
        parisc: Show machine product number during boot
        parisc: Add constants for PDC_RELOCATE PDC call
        parisc: Add PDC_CRASH_PREP PDC function number
        parisc: Use F_EXTEND() macro in iosapic code
        parisc: remove the HBA_DATA macro
        parisc/lba_pci: use container_of in LBA_DEV
        parisc/dino: use container_of in DINO_DEV
        parisc: properly type the return value of parisc_walk_tree
        parisc: properly type the iommu field in struct pci_hba_data
        parisc: turn GET_IOC into an inline function
        parisc: move internal implementation details out of <asm/dma-mapping.h>
        parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
        parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
        parisc: replace oops_in_progress manipulation with bust_spinlocks()
        parisc: Improve initial IRQ to CPU assignment
        parisc: Count IPI function call interrupts
        parisc: Show rescheduling interrupts on SMP machines only
        ...
      8feed3ef
    • L
      Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 3591b195
      Linus Torvalds committed
      Pull s390 updates from Martin Schwidefsky:
      
       - A copy of Arnd's compat wrapper generation series

       - Pass information about the KVM guest to the host in the form of the
         control program code and the control program version code
      
       - Map IOV resources to support PCI physical functions on s390
      
       - Add vector load and store alignment hints to improve performance
      
       - Use the "jdd" constraint with gcc 9 to make jump labels work again
      
       - Remove amode workaround for old z/VM releases from the DCSS code
      
       - Add support for in-kernel performance measurements using the CPU
         measurement counter facility
      
       - Introduce a new PMU device cpum_cf_diag to capture counters and store
         them as event raw data.
      
       - Bug fixes and cleanups
      
      * tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
        Revert "s390/cpum_cf: Add kernel message exaplanations"
        s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
        s390/suspend: fix prefix register reset in swsusp_arch_resume
        s390: warn about clearing als implied facilities
        s390: allow overriding facilities via command line
        s390: clean up redundant facilities list setup
        s390/als: remove duplicated in-place implementation of stfle
        s390/cio: Use cpa range elsewhere within vfio-ccw
        s390/cio: Fix vfio-ccw handling of recursive TICs
        s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
        s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
        s390/cpum_cf: Add kernel message exaplanations
        s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
        s390/cpum_cf: add ctr_stcctm() function
        s390/cpum_cf: move common functions into a separate file
        s390/cpum_cf: introduce kernel_cpumcf_avail() function
        s390/cpu_mf: replace stcctm5() with the stcctm() function
        s390/cpu_mf: add store cpu counter multiple instruction support
        s390/cpum_cf: Add minimal in-kernel interface for counter measurements
        s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
        ...
      3591b195
    • L
      Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · 45f5532a
      Linus Torvalds committed
      Pull m68k updates from Geert Uytterhoeven:
      
       - VLA removal
      
       - gcc-8.x build fixes
      
       - small improvements and cleanups
      
       - defconfig updates
      
      * tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Add -ffreestanding to CFLAGS
        m68k/apollo: Fix comment in Makefile
        dio: Fix buffer overflow in case of unknown board
        m68k/defconfig: Update defconfigs for v5.0-rc1
        m68k/atari: Avoid VLA use in atari_switches_setup()
        m68k: Avoid VLA use in mangle_kernel_stack()
        m68k/mac: Use '030 reset method on SE/30
        m68k/mac: Remove obsolete comment
        m68k/mac: Skip VIA port setup unless RTC is connected
        m68k/mac: Clean up unused timer definitions
        m68k/defconfig: Drop NET_VENDOR_<FOO>=n
      45f5532a
    • B
      x86: Deprecate a.out support · eac61655
      Borislav Petkov committed
      Linux has supported ELF binaries for ~25 years now.  a.out coredumping has
      bitrotten quite significantly and would need some fixing to get it into
      shape again, but considering that even the toolchains cannot create a.out
      executables in their default configuration, let's deprecate a.out support
      and remove it a couple of releases later instead.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: <linux-api@vger.kernel.org>
      Cc: <linux-fsdevel@vger.kernel.org>
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <x86@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eac61655
    • L
      a.out: remove core dumping support · 08300f44
      Linus Torvalds committed
      We're (finally) phasing out a.out support for good.  As Borislav Petkov
      points out, we've supported ELF binaries for about 25 years by now, and
      coredumping in particular has bitrotted over the years.
      
      None of the tool chains even support generating a.out binaries any more,
      and the plan is to deprecate a.out support entirely for the kernel.  But
      I want to start with just removing the core dumping code, because I can
      still imagine that somebody actually might want to support a.out as a
      simpler binary format.
      
      Particularly if you generate some random binaries on the fly, ELF is a
      much more complicated format (admittedly ELF also does have a lot of
      toolchain support, mitigating that complexity a lot and you really
      should have moved over in the last 25 years).
      
      So it's at least somewhat possible that somebody out there has some
      workflow that still involves generating and running a.out executables.
      
      In contrast, it's very unlikely that anybody depends on debugging any
      legacy a.out core files.  But regardless, I want this phase-out to be
      done in two steps, so that we can resurrect a.out support (if needed)
      without having to resurrect the core file dumping that is almost
      certainly not needed.
      
      Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
      cut at this had missed.
      
      And Alan Cox points out that the a.out binary loader _could_ be done in
      user space if somebody wants to, but we might keep just the loader in
      the kernel if somebody really wants it, since the loader isn't that big
      and has no really odd special cases like the core dumping does.
      Acked-by: Borislav Petkov <bp@alien8.de>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jannh@google.com>
      Cc: Richard Weinberger <richard@nod.at>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08300f44