1. 13 Sep, 2018 3 commits
  2. 12 Sep, 2018 7 commits
    • rds: fix two RCU related problems · cc4dfb7f
      Cong Wang committed
      When an rds sock is bound, it is inserted into the bind_hash_table,
      which is protected by RCU. But when an rds sock is released, it is
      freed immediately after being removed from this hash table, without
      waiting for an RCU grace period. This can cause a use-after-free,
      as reported by syzbot.
      
      Mark the rds sock with SOCK_RCU_FREE before inserting it into the
      bind_hash_table, so that it is always freed after an RCU grace
      period.
      
      The other problem is in rds_find_bound(): the rds sock could be
      freed between rhashtable_lookup_fast() and rds_sock_addref(),
      so we need to extend RCU read lock protection in rds_find_bound()
      to close this race condition.
      
      Reported-and-tested-by: syzbot+8967084bcac563795dc6@syzkaller.appspotmail.com
      Reported-by: syzbot+93a5839deb355537440f@syzkaller.appspotmail.com
      Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: rds-devel@oss.oracle.com
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED · 6ad56901
      Kai-Heng Feng committed
      After system suspend, the r8169 sometimes doesn't work when the
      ethernet cable gets plugged in.
      
      This issue happens because rtl_reset_work() doesn't get called from
      rtl8169_runtime_resume() after system suspend.
      
      In rtl_task(), RTL_FLAG_TASK_* only gets cleared if this condition is
      met:
      if (!netif_running(dev) ||
          !test_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags))
          ...
      
      If RTL_FLAG_TASK_ENABLED was cleared during system suspend while
      RTL_FLAG_TASK_RESET_PENDING was set, the next rtl_schedule_task() won't
      schedule the task because the pending flag is still set.
      
      So in addition to clearing RTL_FLAG_TASK_ENABLED, also clear the other
      flags.
      
      Cc: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
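The stale-flag problem described in this commit can be modelled in userspace. The sketch below is a hypothetical model, not driver code: the `model_*` functions and `struct work_model` are invented for illustration, and it assumes the scheduling helper only queues work when the pending bit was previously clear, as the commit message implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag indices; they only mirror the names used in the commit. */
enum task_flag {
    TASK_ENABLED = 0,
    TASK_RESET_PENDING = 1
};

struct work_model {
    unsigned long flags; /* one bit per task_flag */
    int scheduled;       /* number of times work was queued */
};

/* Non-atomic stand-in for a test-and-set bit operation: returns the old bit. */
bool model_test_and_set(unsigned long *flags, int bit)
{
    bool old = (*flags >> bit) & 1UL;

    *flags |= 1UL << bit;
    return old;
}

/* Queue work only if the pending bit was previously clear. */
void model_schedule_task(struct work_model *w, enum task_flag flag)
{
    if (!model_test_and_set(&w->flags, flag))
        w->scheduled++;
}

/* Behaviour before the fix: only the ENABLED bit is cleared. */
void model_suspend_old(struct work_model *w)
{
    w->flags &= ~(1UL << TASK_ENABLED);
}

/* Behaviour after the fix: every flag is cleared, so no stale PENDING bit. */
void model_suspend_fixed(struct work_model *w)
{
    w->flags = 0;
}

/*
 * Request a reset, suspend before the work item handles it, resume, and
 * request a reset again. Returns how many work items were queued after resume.
 */
int model_resume_schedules(void (*suspend)(struct work_model *))
{
    struct work_model w = { .flags = 1UL << TASK_ENABLED, .scheduled = 0 };

    model_schedule_task(&w, TASK_RESET_PENDING); /* sets the pending bit */
    suspend(&w);                                 /* pending work never ran */
    w.scheduled = 0;
    w.flags |= 1UL << TASK_ENABLED;              /* resume re-enables the task */
    model_schedule_task(&w, TASK_RESET_PENDING);
    return w.scheduled;
}
```

In this model, `model_resume_schedules(model_suspend_old)` queues nothing after resume because the pending bit survived the suspend, while `model_resume_schedules(model_suspend_fixed)` queues the reset once.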
    • erspan: fix error handling for erspan tunnel · 51dc63e3
      Haishuang Yan committed
      When processing an ICMP unreachable message for an erspan tunnel, the
      tunnel id should be erspan_net_id instead of ipgre_net_id.
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • erspan: return PACKET_REJECT when the appropriate tunnel is not found · 5a64506b
      Haishuang Yan committed
      If the erspan tunnel hasn't been established, we should send an ICMP
      port unreachable message after receiving erspan packets.
      
      Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN")
      Cc: William Tu <u9012063@gmail.com>
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: rate limit synflood warnings further · 0297c1c2
      Willem de Bruijn committed
      Convert pr_info to net_info_ratelimited to limit the total number of
      synflood warnings.
      
      Commit 946cedcc ("tcp: Change possible SYN flooding messages")
      rate limits synflood warnings to one per listener.
      
      Workloads that open many listener sockets can still see a high rate of
      log messages. Syzkaller is one frequent example.
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
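A rough illustration of why a shared limiter bounds the total number of warnings where a once-per-listener check does not. This is a simplified, hypothetical interval/burst limiter written for userspace; it is not the kernel's net_info_ratelimited() implementation, and the `model_*` names, time units, and limits are assumptions chosen for the example.

```c
#include <assert.h>

/* Hypothetical limiter state shared by all callers. */
struct ratelimit_model {
    long interval; /* window length, in arbitrary time units */
    int burst;     /* messages allowed per window */
    long begin;    /* start time of the current window */
    int printed;   /* messages already emitted in this window */
};

/* Returns 1 if a message may be emitted at time `now`, 0 if it is suppressed. */
int model_ratelimit(struct ratelimit_model *rs, long now)
{
    if (now - rs->begin >= rs->interval) {
        /* A new window starts: forget what was printed before. */
        rs->begin = now;
        rs->printed = 0;
    }
    if (rs->printed < rs->burst) {
        rs->printed++;
        return 1;
    }
    return 0;
}

/*
 * Each of `listeners` sockets tries to warn once at time `now`.
 * Returns how many warnings actually get through the shared limiter.
 */
int model_count_warnings(struct ratelimit_model *rs, int listeners, long now)
{
    int emitted = 0;

    for (int i = 0; i < listeners; i++)
        emitted += model_ratelimit(rs, now);
    return emitted;
}
```

With a per-listener one-shot flag alone, 1000 listeners could produce 1000 messages. Routing them through one shared state caps the output at `burst` messages per `interval`, regardless of how many listeners exist.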
    • MIPS: lantiq: dma: add dev pointer · 2d946e5b
      Hauke Mehrtens committed
      dma_zalloc_coherent() now crashes if no dev pointer is given.
      Add a dev pointer to the ltq_dma_channel structure and fill it in the
      driver using it.
      
      This fixes a bug introduced in kernel 4.19.
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 4ecdf770
      David S. Miller committed
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree:
      
      1) Remove duplicated include at the end of UDP conntrack, from Yue Haibing.
      
      2) Restore conntrack dependency on xt_cluster, from Martin Willi.
      
      3) Fix splat with GSO skbs from the checksum target, from Florian Westphal.
      
      4) Rework ct timeout support. The template strategy to attach custom
         timeouts is not correct, since it does not work in conjunction with
         conntrack zones and we have a possible use-after-free when removing
         the rule due to missing refcounting. To fix these problems, do not
         use a conntrack template at all and set the custom timeout on the
         already valid conntrack object. This fix comes with a preparation
         patch to simplify timeout adjustment by initializing the first
         position of the timeout array for all of the existing trackers.
         Patchset from Florian Westphal.
      
      5) Fix missing dependency from the IPv4 chain NAT type, from Florian.
      
      6) Release chain reference counter from the flush path, from Taehee Yoo.
      
      7) After flushing an iptables ruleset, conntrack hooks are unregistered
         and entries are left stale to be cleaned up by the timeout garbage
         collector. No TCP tracking is done on established flows by this time.
         If ruleset is reloaded, then hooks are registered again and TCP
         tracking is restored, which considers packets to be invalid. Clear
         window tracking to exercise TCP flow pickup from the middle given that
         history is lost for us. Again from Florian.
      
      8) Fix crash from netlink interface with CONFIG_NF_CONNTRACK_TIMEOUT=y
         and CONFIG_NF_CT_NETLINK_TIMEOUT=n.
      
      9) Broken CT target due to returning incorrect type from
         ctnl_timeout_find_get().
      
      10) Solve conntrack clash on NF_REPEAT verdicts too, from Michal Vaner.
      
      11) Missing conversion of hashlimit sysctl interface to new API, from
          Cong Wang.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 11 Sep, 2018 7 commits
  4. 10 Sep, 2018 1 commit
    • ip: frags: fix crash in ip_do_fragment() · 5d407b07
      Taehee Yoo committed
      A kernel crash occurs when a defragmented packet is fragmented
      in ip_do_fragment().
      In the defragment routine, skb_orphan() is called and
      skb->ip_defrag_offset is set. But skb->sk and
      skb->ip_defrag_offset are members of the same union, so
      frag->sk is not NULL.
      Hence the crash occurs in the skb->sk check in ip_do_fragment() when
      the defragmented packet is fragmented.
      
      test commands:
         %iptables -t nat -I POSTROUTING -j MASQUERADE
         %hping3 192.168.4.2 -s 1000 -p 2000 -d 60000
      
      splat looks like:
      [  261.069429] kernel BUG at net/ipv4/ip_output.c:636!
      [  261.075753] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  261.083854] CPU: 1 PID: 1349 Comm: hping3 Not tainted 4.19.0-rc2+ #3
      [  261.100977] RIP: 0010:ip_do_fragment+0x1613/0x2600
      [  261.106945] Code: e8 e2 38 e3 fe 4c 8b 44 24 18 48 8b 74 24 08 e9 92 f6 ff ff 80 3c 02 00 0f 85 da 07 00 00 48 8b b5 d0 00 00 00 e9 25 f6 ff ff <0f> 0b 0f 0b 44 8b 54 24 58 4c 8b 4c 24 18 4c 8b 5c 24 60 4c 8b 6c
      [  261.127015] RSP: 0018:ffff8801031cf2c0 EFLAGS: 00010202
      [  261.134156] RAX: 1ffff1002297537b RBX: ffffed0020639e6e RCX: 0000000000000004
      [  261.142156] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880114ba9bd8
      [  261.150157] RBP: ffff880114ba8a40 R08: ffffed0022975395 R09: ffffed0022975395
      [  261.158157] R10: 0000000000000001 R11: ffffed0022975394 R12: ffff880114ba9ca4
      [  261.166159] R13: 0000000000000010 R14: ffff880114ba9bc0 R15: dffffc0000000000
      [  261.174169] FS:  00007fbae2199700(0000) GS:ffff88011b400000(0000) knlGS:0000000000000000
      [  261.183012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  261.189013] CR2: 00005579244fe000 CR3: 0000000119bf4000 CR4: 00000000001006e0
      [  261.198158] Call Trace:
      [  261.199018]  ? dst_output+0x180/0x180
      [  261.205011]  ? save_trace+0x300/0x300
      [  261.209018]  ? ip_copy_metadata+0xb00/0xb00
      [  261.213034]  ? sched_clock_local+0xd4/0x140
      [  261.218158]  ? kill_l4proto+0x120/0x120 [nf_conntrack]
      [  261.223014]  ? rt_cpu_seq_stop+0x10/0x10
      [  261.227014]  ? find_held_lock+0x39/0x1c0
      [  261.233008]  ip_finish_output+0x51d/0xb50
      [  261.237006]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.243011]  ? nf_ct_l4proto_register_one+0x5b0/0x5b0 [nf_conntrack]
      [  261.250152]  ? rcu_is_watching+0x77/0x120
      [  261.255010]  ? nf_nat_ipv4_out+0x1e/0x2b0 [nf_nat_ipv4]
      [  261.261033]  ? nf_hook_slow+0xb1/0x160
      [  261.265007]  ip_output+0x1c7/0x710
      [  261.269005]  ? ip_mc_output+0x13f0/0x13f0
      [  261.273002]  ? __local_bh_enable_ip+0xe9/0x1b0
      [  261.278152]  ? ip_fragment.constprop.56+0x220/0x220
      [  261.282996]  ? nf_hook_slow+0xb1/0x160
      [  261.287007]  raw_sendmsg+0x21f9/0x4420
      [  261.291008]  ? dst_output+0x180/0x180
      [  261.297003]  ? sched_clock_cpu+0x126/0x170
      [  261.301003]  ? find_held_lock+0x39/0x1c0
      [  261.306155]  ? stop_critical_timings+0x420/0x420
      [  261.311004]  ? check_flags.part.36+0x450/0x450
      [  261.315005]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.320995]  ? _raw_spin_unlock_irq+0x29/0x40
      [  261.326142]  ? cyc2ns_read_end+0x10/0x10
      [  261.330139]  ? raw_bind+0x280/0x280
      [  261.334138]  ? sched_clock_cpu+0x126/0x170
      [  261.338995]  ? check_flags.part.36+0x450/0x450
      [  261.342991]  ? __lock_acquire+0x4500/0x4500
      [  261.348994]  ? inet_sendmsg+0x11c/0x500
      [  261.352989]  ? dst_output+0x180/0x180
      [  261.357012]  inet_sendmsg+0x11c/0x500
      [ ... ]
      
      v2:
       - clear skb->sk in the reassembly routine. (Eric Dumazet)
      
      Fixes: fa0f5273 ("ip: use rb trees for IP frag queue.")
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
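The union aliasing behind this crash can be shown with a small userspace sketch. `struct skb_model` and the `model_*` helpers are hypothetical stand-ins, not the kernel's struct sk_buff or its API; the sketch also assumes a platform where a NULL pointer is stored as all-zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the relevant sk_buff fields. */
struct skb_model {
    union {
        void *sk;             /* owning socket, NULL once orphaned */
        int ip_defrag_offset; /* shares storage with sk during reassembly */
    };
};

/* Orphan the buffer, then record a fragment offset in the shared storage. */
void model_defrag_enqueue(struct skb_model *skb, int offset)
{
    skb->sk = NULL;
    skb->ip_defrag_offset = offset;
}

/* The approach named in the v2 note: clear sk again after reassembly. */
void model_reassembly_done(struct skb_model *skb)
{
    skb->sk = NULL;
}

/* Nonzero if the storage of the sk member holds any nonzero byte. */
int model_sk_is_set(const struct skb_model *skb)
{
    static const unsigned char zero[sizeof(void *)];

    return memcmp(&skb->sk, zero, sizeof(skb->sk)) != 0;
}
```

After `model_defrag_enqueue()` with a nonzero offset, the `sk` storage no longer reads as NULL even though no socket was ever attached, which is the state a later `sk` check would trip over. Calling `model_reassembly_done()` restores the all-zero state.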
  5. 09 Sep, 2018 9 commits
  6. 08 Sep, 2018 4 commits
  7. 07 Sep, 2018 1 commit
    • tipc: call start and done ops directly in __tipc_nl_compat_dumpit() · 8f5c5fcf
      Cong Wang committed
      __tipc_nl_compat_dumpit() uses a netlink_callback on the stack,
      so the only way to align it with the other ->dumpit() call paths
      is to call tipc_dump_start() and tipc_dump_done() directly
      inside it. Otherwise ->dumpit() would always get NULL from
      cb->args[].
      
      But tipc_dump_start() uses sock_net(cb->skb->sk) to retrieve the
      net pointer, and the cb->skb here doesn't set skb->sk; the net
      pointer is saved in msg->net instead. So introduce a helper function,
      __tipc_dump_start(), to pass in msg->net.
      
      Ying pointed out that cb->args[0...3] are already used by other
      callbacks on this call path, so we can't use cb->args[0] any
      more; use cb->args[4] instead.
      
      Fixes: 9a07efa9 ("tipc: switch to rhashtable iterator")
      Reported-and-tested-by: syzbot+e93a2c41f91b8e2c7d9b@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  8. 06 Sep, 2018 8 commits
    • Merge tag 'mlx5e-fixes-2018-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 6da410d9
      David S. Miller committed
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2018-09-05
      
      This pull request contains some fixes for the mlx5 ethernet netdevice
      and core driver.
      
      Please pull and let me know if there's any problem.
      
      For -stable v4.9:
      ('net/mlx5: Fix debugfs cleanup in the device init/remove flow')
      
      For -stable v4.12:
      ("net/mlx5: E-Switch, Fix memory leak when creating switchdev mode FDB tables")
      
      For -stable v4.13:
      ("net/mlx5: Fix use-after-free in self-healing flow")
      
      For -stable v4.14:
      ("net/mlx5: Check for error in mlx5_attach_interface")
      
      For -stable v4.15:
      ("net/mlx5: Fix not releasing read lock when adding flow rules")
      
      For -stable v4.17:
      ("net/mlx5: Fix possible deadlock from lockdep when adding fte to fg")
      
      For -stable v4.18:
      ("net/mlx5: Use u16 for Work Queue buffer fragment size")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'iucv-fixes' · fce471e3
      David S. Miller committed
      Julian Wiedmann says:
      
      ====================
      net/iucv: fixes 2018-09-05
      
      Please apply three straightforward fixes for iucv. One that prevents
      leaking the skb on malformed inbound packets, one to fix the error
      handling on transmit error, and one to get rid of a compile warning.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/iucv: declare iucv_path_table_empty() as static · b7f41565
      Julian Wiedmann committed
      Fixes a compile warning.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/af_iucv: fix skb handling on HiperTransport xmit error · b2f54394
      Julian Wiedmann committed
      When sending an skb, afiucv_hs_send() bails out on various error
      conditions. But currently the caller has no way of telling whether the
      skb was freed or not - resulting in potentially either
      a) leaked skbs from iucv_send_ctrl(), or
      b) double-free's from iucv_sock_sendmsg().
      
      As dev_queue_xmit() will always consume the skb (even on error), be
      consistent and also free the skb from all other error paths. This way
      callers no longer need to care about managing the skb.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
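The ownership rule this commit settles on, where the send helper consumes the buffer on every path so callers never free it, can be sketched as follows. All names here (`struct buf_model`, `model_send()`, the error conditions) are hypothetical and only illustrate the pattern; this is not the af_iucv code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical buffer type standing in for an skb. */
struct buf_model {
    int len;
};

/* Buffers allocated but not yet freed, used to detect leaks. */
int model_live_bufs;

struct buf_model *model_alloc(int len)
{
    struct buf_model *b = malloc(sizeof(*b));

    if (!b)
        return NULL;
    b->len = len;
    model_live_bufs++;
    return b;
}

void model_free(struct buf_model *b)
{
    if (!b)
        return;
    model_live_bufs--;
    free(b);
}

/* Stand-in for the lower layer, which consumes the buffer even on error. */
int model_lower_xmit(struct buf_model *b, int fail)
{
    model_free(b);
    return fail ? -ENOBUFS : 0;
}

/*
 * Takes ownership of `b` on every path, including the early error exits,
 * so the caller must not touch or free the buffer after this call.
 */
int model_send(struct buf_model *b, int link_up, int mtu, int lower_fail)
{
    int err;

    if (!link_up) {
        err = -ENODEV;
        goto err_free;
    }
    if (b->len > mtu) {
        err = -EMSGSIZE;
        goto err_free;
    }
    return model_lower_xmit(b, lower_fail);

err_free:
    model_free(b);
    return err;
}
```

Because every exit frees the buffer exactly once, a caller can treat any return value the same way: the buffer is gone. That removes both the leak and the double-free cases listed in the commit message.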
    • net/af_iucv: drop inbound packets with invalid flags · 22244099
      Julian Wiedmann committed
      Inbound packets may have any combination of flag bits set in their iucv
      header. If we don't know how to handle a specific combination, drop the
      skb instead of leaking it.
      
      To clarify what error is returned in this case, replace the hard-coded
      0 with the corresponding macro.
      Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: fix memory leak in act_tunnel_key_init() · ee28bb56
      Davide Caratti committed
      If users try to install act_tunnel_key 'set' rules with duplicate values
      of 'index', the tunnel metadata are allocated, but never released. Then,
      kmemleak complains as follows:
      
       # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
       # echo clear > /sys/kernel/debug/kmemleak
       # tc a a a tunnel_key set src_ip 1.1.1.1 dst_ip 2.2.2.2 id 42 index 111
       Error: TC IDR already exists.
       We have an error talking to the kernel
       # echo scan > /sys/kernel/debug/kmemleak
       # cat /sys/kernel/debug/kmemleak
       unreferenced object 0xffff8800574e6c80 (size 256):
         comm "tc", pid 5617, jiffies 4298118009 (age 57.990s)
         hex dump (first 32 bytes):
           00 00 00 00 00 00 00 00 00 1c e8 b0 ff ff ff ff  ................
           81 24 c2 ad ff ff ff ff 00 00 00 00 00 00 00 00  .$..............
         backtrace:
           [<00000000b7afbf4e>] tunnel_key_init+0x8a5/0x1800 [act_tunnel_key]
           [<000000007d98fccd>] tcf_action_init_1+0x698/0xac0
           [<0000000099b8f7cc>] tcf_action_init+0x15c/0x590
           [<00000000dc60eebe>] tc_ctl_action+0x336/0x5c2
           [<000000002f5a2f7d>] rtnetlink_rcv_msg+0x357/0x8e0
           [<000000000bfe7575>] netlink_rcv_skb+0x124/0x350
           [<00000000edab656f>] netlink_unicast+0x40f/0x5d0
           [<00000000b322cdcb>] netlink_sendmsg+0x6e8/0xba0
           [<0000000063d9d490>] sock_sendmsg+0xb3/0xf0
           [<00000000f0d3315a>] ___sys_sendmsg+0x654/0x960
           [<00000000c06cbd42>] __sys_sendmsg+0xd3/0x170
           [<00000000ce72e4b0>] do_syscall_64+0xa5/0x470
           [<000000005caa2d97>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
           [<00000000fac1b476>] 0xffffffffffffffff
      
      This problem can theoretically also happen when users attempt to set up
      a geneve rule with wrong configuration data, or when the kernel fails
      to allocate 'params_new'. Ensure that tunnel_key_init() releases the
      tunnel metadata in those cases as well.
      
      Addresses-Coverity-ID: 1373974 ("Resource leak")
      Fixes: d0f6dd8a ("net/sched: Introduce act_tunnel_key")
      Fixes: 0ed5269f ("net/sched: add tunnel option support to act_tunnel_key")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: orphan sock in tipc_release() · 0a3b8b2b
      Cong Wang committed
      Before we unlock the sock in tipc_release(), we have to
      detach sk->sk_socket from sk, otherwise a parallel
      tipc_sk_fill_sock_diag() could still read it after we
      free this socket.
      
      Fixes: c30b70de ("tipc: implement socket diagnostics for AF_TIPC")
      Reported-and-tested-by: syzbot+48804b87c16588ad491d@syzkaller.appspotmail.com
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Fix possible deadlock from lockdep when adding fte to fg · ad9421e3
      Roi Dayan committed
      This is a false positive report due to incorrect nested lock
      annotations as we lock multiple fgs with the same subclass.
      Instead of locking all fgs only lock the one being used as was
      done before.
      
      Fixes: bd71b08e ("net/mlx5: Support multiple updates of steering rules in parallel")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>