1. 17 Jun, 2017 1 commit
  2. 16 Jun, 2017 2 commits
  3. 15 Jun, 2017 4 commits
  4. 14 Jun, 2017 2 commits
    • caif: Add sockaddr length check before accessing sa_family in connect handler · 20a3d5bf
      Mateusz Jurczyk committed
      Verify that the caller-provided sockaddr structure is large enough to
      contain the sa_family field, before accessing it in the connect()
      handler of the AF_CAIF socket. Since the syscall doesn't enforce a minimum
      size of the corresponding memory region, very short sockaddrs (zero or one
      byte long) result in operating on uninitialized memory while referencing
      sa_family.
      Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
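The length check described in this commit message can be illustrated in userspace as follows. This is a minimal sketch, not the kernel patch itself: `check_sa_family_len` and `demo_connect` are hypothetical names, and the handler only models the order of the checks (length first, then `sa_family`).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: 0 if addr_len covers the sa_family field, else -EINVAL. */
static int check_sa_family_len(const struct sockaddr *uaddr, int addr_len)
{
    /* Offset of the first byte past sa_family inside struct sockaddr. */
    const size_t need = offsetof(struct sockaddr, sa_family) +
                        sizeof(sa_family_t);

    if (uaddr == NULL || addr_len < 0 || (size_t)addr_len < need)
        return -EINVAL;
    return 0;
}

/* Hypothetical connect-style handler: validate the length, then the family. */
static int demo_connect(const struct sockaddr *uaddr, int addr_len,
                        int expected_family)
{
    int err = check_sa_family_len(uaddr, addr_len);

    if (err)
        return err;                 /* sa_family is never read in this case */
    if (uaddr->sa_family != expected_family)
        return -EAFNOSUPPORT;
    return 0;
}
```

With a zero- or one-byte address, the sketch returns -EINVAL without ever dereferencing `sa_family`.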
    • igmp: acquire pmc lock for ip_mc_clear_src() · c38b7d32
      WANG Cong committed
      Andrey reported a use-after-free in add_grec():
      
              for (psf = *psf_list; psf; psf = psf_next) {
                      ...
                      psf_next = psf->sf_next;
      
      where the struct ip_sf_list's were already freed by:
      
       kfree+0xe8/0x2b0 mm/slub.c:3882
       ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
       ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
       ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
       inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
       sock_release+0x8d/0x1e0 net/socket.c:597
       sock_close+0x16/0x20 net/socket.c:1072
      
      This happens because we don't hold pmc->lock in ip_mc_clear_src()
      and a parallel mr_ifc_timer timer could jump in and access them.
      
      The RCU lock is there, but it only protects pmc itself; taking this
      spinlock actually ensures we don't access them in parallel.
      
      Thanks to Eric and Long for discussion on this bug.
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 13 Jun, 2017 8 commits
  6. 12 Jun, 2017 2 commits
  7. 11 Jun, 2017 4 commits
    • net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse · 343eba69
      Jia-Ju Bai committed
      The kernel may sleep under an RCU read lock in tipc_msg_reverse, and the
      function call paths are:
      tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
        tipc_rcv
          tipc_sk_rcv
            tipc_msg_reverse
              pskb_expand_head(GFP_KERNEL) --> may sleep
      tipc_node_broadcast
        tipc_node_xmit_skb
          tipc_node_xmit
            tipc_sk_rcv
              tipc_msg_reverse
                pskb_expand_head(GFP_KERNEL) --> may sleep
      
      To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".
      Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx · f146e872
      Jia-Ju Bai committed
      The kernel may sleep under an RCU read lock in cfpkt_create_pfx, and the
      function call paths are:
      cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
        cfctrl_linkdown_req
          cfpkt_create
            cfpkt_create_pfx
              alloc_skb(GFP_KERNEL) --> may sleep
      cfserl_receive (acquire the lock by rcu_read_lock)
        cfpkt_split
          cfpkt_create_pfx
            alloc_skb(GFP_KERNEL) --> may sleep
      
      cfpkt_create_pfx uses "in_interrupt" to decide whether to use "GFP_KERNEL"
      or "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the
      function is called under an RCU read lock rather than in interrupt context.
      
      To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.
      Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: disable BH in sctp_for_each_endpoint · 581409da
      Xin Long committed
      sctp currently holds read_lock while iterating over sctp_ep_hashtable
      without disabling BH. If the CPU switches to another thread A at this
      moment, thread A may try to take the write_lock with BH disabled.

      Because BH is disabled, the CPU cannot schedule back to the thread
      holding the read_lock, while thread A keeps waiting for the read_lock
      to be released. This triggers a deadlock.

      This patch fixes the deadlock by calling read_lock_bh instead, so that
      BH is disabled while the read_lock is held in sctp_for_each_endpoint.
      
      Fixes: 626d16f5 ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l2tp: cast l2tp traffic counter to unsigned · 9b3dc0a1
      Dominik Heidler committed
      This fixes a counter problem on 32-bit systems:
      When the rx_bytes counter reached 2 GiB, it jumped to (2^64 - 2 GiB) bytes.
      
      rtnl_link_stats64 has __u64 type and atomic_long_read returns
      atomic_long_t which is signed. Due to the conversion
      we get an incorrect value on 32-bit systems if the MSB of
      the atomic_long_t value is set.
      
      CC: Tom Parkin <tparkin@katalix.com>
      Fixes: 7b7c0719 ("l2tp: avoid deadlock in l2tp stats update")
      Signed-off-by: Dominik Heidler <dheidler@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
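The sign-extension effect behind this counter jump can be reproduced in a few lines. In this sketch, `int32_t` stands in for a 32-bit `long` (an assumption made so the behaviour is the same on any host), and both function names are hypothetical.

```c
#include <stdint.h>

/* int32_t models a 32-bit signed long counter (assumption of this sketch). */

/* Direct widening: the value is sign-extended when the MSB is set. */
static uint64_t widen_signed(int32_t counter)
{
    return (uint64_t)counter;
}

/* Cast to the unsigned type of the same width first: zero-extension. */
static uint64_t widen_unsigned(int32_t counter)
{
    return (uint64_t)(uint32_t)counter;
}
```

For a counter whose bit pattern is 0x80000000 (2 GiB), the first function yields 2^64 - 2 GiB, while the second yields 2 GiB as intended.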
  8. 10 Jun, 2017 3 commits
    • mac80211: free netdev on dev_alloc_name() error · c7a61cba
      Johannes Berg committed
      The change to remove free_netdev() from ieee80211_if_free()
      erroneously didn't add the necessary free_netdev() for when
      ieee80211_if_free() is called directly in one place, rather
      than as the priv_destructor. Add the missing call.
      
      Fixes: cf124db5 ("net: Fix inconsistent teardown and release of private netdev state.")
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: rps: send out pending IPI's on CPU hotplug · 773fc8f6
      ashwanth@codeaurora.org committed
      IPI's from the victim cpu are not handled in dev_cpu_callback.
      So these pending IPI's would be sent to the remote cpu only when
      NET_RX is scheduled on the victim cpu and since this trigger is
      unpredictable it would result in packet latencies on the remote cpu.
      
      This patch adds support for sending the pending IPIs of the victim cpu.
      Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Fix an intermittent pr_emerg warning about lo becoming free. · f186ce61
      Krister Johansen committed
      It looks like this:
      
      Message from syslogd@flamingo at Apr 26 00:45:00 ...
       kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4
      
      They seem to coincide with net namespace teardown.
      
      The message is emitted by netdev_wait_allrefs().
      
      Forced a kdump in netdev_run_todo, but found that the refcount on the lo
      device was already 0 at the time we got to the panic.
      
      Used bcc to check the blocking in netdev_run_todo.  The only places
      where we're off cpu there are in the rcu_barrier() and msleep() calls.
      That behavior is expected.  The msleep time coincides with the amount of
      time we spend waiting for the refcount to reach zero; the rcu_barrier()
      wait times are not excessive.
      
      After looking through the list of callbacks that the netdevice notifiers
      invoke in this path, it appears that the dst_dev_event is the most
      interesting.  The dst_ifdown path places a hold on the loopback_dev as
      part of releasing the dev associated with the original dst cache entry.
      Most of our notifier callbacks are straight-forward, but this one a)
      looks complex, and b) places a hold on the network interface in
      question.
      
      I constructed a new bcc script that watches various events in the
      lifetime of a dst cache entry.  Note that dst_ifdown will take a hold on
      the loopback device until the invalidated dst entry gets freed.
      
      [      __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
          __dst_free
          rcu_nocb_kthread
          kthread
          ret_from_fork
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 09 Jun, 2017 3 commits
    • af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers · defbcf2d
      Mateusz Jurczyk committed
      Verify that the caller-provided sockaddr structure is large enough to
      contain the sa_family field, before accessing it in bind() and connect()
      handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
      size of the corresponding memory region, very short sockaddrs (zero or
      one byte long) result in operating on uninitialized memory while
      referencing .sa_family.
      Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • can: af_can: namespace support: fix lockdep splat: properly initialize spin_lock · 74b7b490
      Marc Kleine-Budde committed
      This patch uses spin_lock_init() instead of __SPIN_LOCK_UNLOCKED() to
      initialize the per namespace net->can.can_rcvlists_lock lock to fix this
      lockdep warning:
      
      | INFO: trying to register non-static key.
      | the code is fine but needs lockdep annotation.
      | turning off the locking correctness validator.
      | CPU: 0 PID: 186 Comm: candump Not tainted 4.12.0-rc3+ #47
      | Hardware name: Marvell Kirkwood (Flattened Device Tree)
      | [<c0016644>] (unwind_backtrace) from [<c00139a8>] (show_stack+0x18/0x1c)
      | [<c00139a8>] (show_stack) from [<c0058c8c>] (register_lock_class+0x1e4/0x55c)
      | [<c0058c8c>] (register_lock_class) from [<c005bdfc>] (__lock_acquire+0x148/0x1990)
      | [<c005bdfc>] (__lock_acquire) from [<c005deec>] (lock_acquire+0x174/0x210)
      | [<c005deec>] (lock_acquire) from [<c04a6780>] (_raw_spin_lock+0x50/0x88)
      | [<c04a6780>] (_raw_spin_lock) from [<bf02116c>] (can_rx_register+0x94/0x15c [can])
      | [<bf02116c>] (can_rx_register [can]) from [<bf02a868>] (raw_enable_filters+0x60/0xc0 [can_raw])
      | [<bf02a868>] (raw_enable_filters [can_raw]) from [<bf02ac14>] (raw_enable_allfilters+0x2c/0xa0 [can_raw])
      | [<bf02ac14>] (raw_enable_allfilters [can_raw]) from [<bf02ad38>] (raw_bind+0xb0/0x250 [can_raw])
      | [<bf02ad38>] (raw_bind [can_raw]) from [<c03b5fb8>] (SyS_bind+0x70/0xac)
      | [<c03b5fb8>] (SyS_bind) from [<c000f8c0>] (ret_fast_syscall+0x0/0x1c)
      
      Cc: Mario Kicherer <dev@kicherer.org>
      Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • ila_xlat: add missing hash secret initialization · 0db47e3d
      Arnd Bergmann committed
      While discussing the possible merits of clang warning about unused initialized
      functions, I found one function that was clearly meant to be called but
      never actually is.
      
      __ila_hash_secret_init() initializes the hash value for the ila locator,
      apparently this is intended to prevent hash collision attacks, but this ends
      up being a read-only zero constant since there is no caller. I could find
      no indication of why it was never called, the earliest patch submission
      for the module already was like this. If my interpretation is right, we
      certainly want to backport the patch to stable kernels as well.
      
      I considered adding it to the ila_xlat_init callback, but for best effect
      the random data is read as late as possible, just before it is first used.
      The underlying net_get_random_once() is already highly optimized to avoid
      overhead when called frequently.
      
      Fixes: 7f00feaf ("ila: Add generic ILA translation facility")
      Cc: stable@vger.kernel.org
      Link: https://www.spinics.net/lists/kernel/msg2527243.html
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. 08 Jun, 2017 7 commits
    • net: ipv6: Release route when device is unregistering · 8397ed36
      David Ahern committed
      Roopa reported that attempts to delete a bond device that is referenced
      in a multipath route hang:
      
      $ ifdown bond2    # ifupdown2 command that deletes virtual devices
      unregister_netdevice: waiting for bond2 to become free. Usage count = 2
      
      Steps to reproduce:
          echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
          ip link add dev bond12 type bond
          ip link add dev bond13 type bond
          ip addr add 2001:db8:2::0/64 dev bond12
          ip addr add 2001:db8:3::0/64 dev bond13
          ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
          ip link del dev bond12
          ip link del dev bond13
      
      The root cause is the recent change to keep routes on a linkdown. Update
      the check to detect when the device is unregistering and release the
      route for that case.
      
      Fixes: a1a22c12 ("net: ipv6: Keep nexthop of multipath route on admin down")
      Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Zero ifla_vf_info in rtnl_fill_vfinfo() · 0eed9cf5
      Mintz, Yuval committed
      Some of the structure's fields are not initialized by
      rtnetlink. If a driver doesn't set them in ndo_get_vf_config(),
      uninitialized kernel memory would leak to userspace.
      Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
      CC: Michal Schmidt <mschmidt@redhat.com>
      Reviewed-by: Greg Rose <gvrose8192@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
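The idea of zeroing a structure before a callback fills in only part of it can be sketched as below. `struct vf_info` is a simplified, hypothetical stand-in for ifla_vf_info, and `demo_get_vf_config` models a driver that sets a single field; neither is the kernel's actual definition.

```c
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for struct ifla_vf_info. */
struct vf_info {
    uint32_t vf;
    uint32_t vlan;
    uint32_t qos;
    uint32_t spoofchk;
};

/* Hypothetical driver callback that fills in only one field. */
static void demo_get_vf_config(struct vf_info *ivi)
{
    ivi->vlan = 100;
}

static void fill_vf_info(struct vf_info *ivi, uint32_t vf_num)
{
    /* Zero everything first so unset fields never carry stale memory. */
    memset(ivi, 0, sizeof(*ivi));
    ivi->vf = vf_num;
    demo_get_vf_config(ivi);
}
```

Any field the callback leaves untouched then reads as zero rather than as whatever happened to be in memory.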
    • decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb · dd0da17b
      Mateusz Jurczyk committed
      Verify that the length of the socket buffer is sufficient to cover the
      nlmsghdr structure before accessing the nlh->nlmsg_len field for further
      input sanitization. If the client only supplies 1-3 bytes of data in
      sk_buff, then nlh->nlmsg_len remains partially uninitialized and
      contains leftover memory from the corresponding kernel allocation.
      Operating on such data may result in indeterminate evaluation of the
      nlmsg_len < sizeof(*nlh) expression.
      
      The bug was discovered by a runtime instrumentation designed to detect
      use of uninitialized memory in the kernel. The patch prevents this and
      other similar tools (e.g. KMSAN) from flagging this behavior in the future.
      Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
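The ordering of the checks described here (buffer length against the header size first, then the header's own length field) can be sketched in userspace. `struct demo_hdr` is a hypothetical stand-in for nlmsghdr and `demo_receive` is an illustrative name, not the kernel function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct nlmsghdr; only the length field matters. */
struct demo_hdr {
    uint32_t len;    /* total message length, header included */
    uint16_t type;
    uint16_t flags;
};

static int demo_receive(const unsigned char *buf, size_t buf_len)
{
    struct demo_hdr hdr;

    /* The buffer must cover the whole header before any field is read. */
    if (buf_len < sizeof(hdr))
        return -EINVAL;

    memcpy(&hdr, buf, sizeof(hdr));

    /* Only now is hdr.len fully backed by caller-supplied bytes. */
    if (hdr.len < sizeof(hdr) || hdr.len > buf_len)
        return -EINVAL;
    return 0;
}
```

A 1-3 byte buffer is rejected by the first test, so the length field is never evaluated on partially supplied data.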
    • Revert "decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb" · c164772d
      David S. Miller committed
      This reverts commit 85eac2ba.
      
      There is an updated version of this fix which we should
      use instead.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb · 85eac2ba
      Mateusz Jurczyk committed
      Verify that the length of the socket buffer is sufficient to cover the
      entire nlh->nlmsg_len field before accessing that field for further
      input sanitization. If the client only supplies 1-3 bytes of data in
      sk_buff, then nlh->nlmsg_len remains partially uninitialized and
      contains leftover memory from the corresponding kernel allocation.
      Operating on such data may result in indeterminate evaluation of the
      nlmsg_len < sizeof(*nlh) expression.
      
      The bug was discovered by a runtime instrumentation designed to detect
      use of uninitialized memory in the kernel. The patch prevents this and
      other similar tools (e.g. KMSAN) from flagging this behavior in the future.
      Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix inconsistent teardown and release of private netdev state. · cf124db5
      David S. Miller committed
      Network devices can allocate resources and private memory using
      netdev_ops->ndo_init().  However, the release of these resources
      can occur in one of two different places.
      
      Either netdev_ops->ndo_uninit() or netdev->destructor().
      
      The decision of which operation frees the resources depends upon
      whether it is necessary for all netdev refs to be released before it
      is safe to perform the freeing.
      
      netdev_ops->ndo_uninit() presumably can occur right after the
      NETDEV_UNREGISTER notifier completes and the unicast and multicast
      address lists are flushed.
      
      netdev->destructor(), on the other hand, does not run until the
      netdev references all go away.
      
      Further complicating the situation is that netdev->destructor()
      almost universally also does a free_netdev().
      
      This creates a problem for the logic in register_netdevice().
      Because all callers of register_netdevice() manage the freeing
      of the netdev, and invoke free_netdev(dev) if register_netdevice()
      fails.
      
      If netdev_ops->ndo_init() succeeds, but something else fails inside
      of register_netdevice(), it does call ndo_ops->ndo_uninit().  But
      it is not able to invoke netdev->destructor().
      
      This is because netdev->destructor() will do a free_netdev() and
      then the caller of register_netdevice() will do the same.
      
      However, this means that the resources that would normally be released
      by netdev->destructor() will not be.
      
      Over the years drivers have added local hacks to deal with this, by
      invoking their destructor parts by hand when register_netdevice()
      fails.
      
      Many drivers do not try to deal with this, and instead we have leaks.
      
      Let's close this hole by formalizing the distinction between what
      private things need to be freed up by netdev->destructor() and whether
      the driver needs unregister_netdevice() to perform the free_netdev().
      
      netdev->priv_destructor() performs all actions to free up the private
      resources that used to be freed by netdev->destructor(), except for
      free_netdev().
      
      netdev->needs_free_netdev is a boolean that indicates whether
      free_netdev() should be done at the end of unregister_netdevice().
      
      Now, register_netdevice() can sanely release all resources after
      ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit()
      and netdev->priv_destructor().
      
      And at the end of unregister_netdevice(), we invoke
      netdev->priv_destructor() and optionally call free_netdev().
      Signed-off-by: David S. Miller <davem@davemloft.net>
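The split between releasing private state and freeing the device itself can be modelled in userspace roughly as follows. Everything here is a hypothetical model: `struct demo_netdev` and the `demo_*` functions only mirror the roles of priv_destructor and needs_free_netdev described above, and the counters exist purely to make the behaviour observable.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a device with separately managed private state. */
struct demo_netdev {
    void *priv;
    void (*priv_destructor)(struct demo_netdev *dev);
    bool needs_free_netdev;
};

static int demo_priv_freed;   /* number of private-state releases */
static int demo_dev_freed;    /* number of device frees */

static void demo_priv_destructor(struct demo_netdev *dev)
{
    free(dev->priv);
    dev->priv = NULL;
    demo_priv_freed++;
}

static void demo_free_netdev(struct demo_netdev *dev)
{
    free(dev);
    demo_dev_freed++;
}

/* Failed registration: release private state, leave the device to the caller. */
static int demo_register_fail(struct demo_netdev *dev)
{
    if (dev->priv_destructor)
        dev->priv_destructor(dev);
    return -1;
}

/* Unregistration: release private state, then free the device if requested. */
static void demo_unregister(struct demo_netdev *dev)
{
    if (dev->priv_destructor)
        dev->priv_destructor(dev);
    if (dev->needs_free_netdev)
        demo_free_netdev(dev);
}
```

Because the destructor no longer frees the device, the failure path can release private resources without racing the caller's own free of the device.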
    • net: don't call strlen on non-terminated string in dev_set_alias() · c28294b9
      Alexander Potapenko committed
      KMSAN reported a use of uninitialized memory in dev_set_alias(),
      which was caused by calling strlcpy() (which in turn called strlen())
      on the user-supplied non-terminated string.
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
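One way to avoid scanning a non-terminated input is to copy exactly the caller-supplied length and terminate the destination explicitly. The sketch below assumes that approach; `demo_set_alias` and `DEMO_ALIAS_MAX` are hypothetical names, not the kernel's.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ALIAS_MAX 16   /* hypothetical destination buffer size */

/* Copies len bytes from src, which need not be NUL-terminated. */
static int demo_set_alias(char *dst, const char *src, size_t len)
{
    if (len >= DEMO_ALIAS_MAX)
        return -EINVAL;

    memcpy(dst, src, len);   /* reads exactly len bytes, never looks for '\0' */
    dst[len] = '\0';         /* terminate the destination ourselves */
    return (int)len;
}
```

Unlike a strlcpy()-style copy, nothing here reads past the `len` bytes the caller actually supplied.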
  11. 07 Jun, 2017 2 commits
  12. 06 Jun, 2017 1 commit
  13. 05 Jun, 2017 1 commit