1. 06 5月, 2019 19 次提交
  2. 05 5月, 2019 4 次提交
  3. 04 5月, 2019 5 次提交
    • E
      net: openvswitch: return an error instead of doing BUG_ON() · a734d1f4
      Eelco Chaudron 提交于
      For all other error cases in queue_userspace_packet() the error is
      returned, so it makes sense to do the same for these two error cases.
      Reported-by: NDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: NEelco Chaudron <echaudro@redhat.com>
      Acked-by: NFlavio Leitner <fbl@sysclose.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a734d1f4
    • M
      genetlink: do not validate dump requests if there is no policy · 05d7f547
      Michal Kubecek 提交于
      Unlike do requests, dump genetlink requests now perform strict validation
      by default even if the genetlink family does not set policy and maxtype
      because it does validation and parsing on its own (e.g. because it wants to
      allow different message format for different commands). While the null
      policy will be ignored, maxtype (which would be zero) is still checked so
      that any attribute will fail validation.
      
      The solution is to only call __nla_validate() from genl_family_rcv_msg()
      if family->maxtype is set.
      
      Fixes: ef6243ac ("genetlink: optionally validate strictly/dumps")
      Signed-off-by: NMichal Kubecek <mkubecek@suse.cz>
      Reviewed-by: NJohannes Berg <johannes@sipsolutions.net>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      05d7f547
    • T
      tipc: fix missing Name entries due to half-failover · c0b14a08
      Tuong Lien 提交于
      TIPC link can temporarily fall into "half-establish" that only one of
      the link endpoints is ESTABLISHED and starts to send traffic, PROTOCOL
      messages, whereas the other link endpoint is not up (e.g. immediately
      when the endpoint receives ACTIVATE_MSG, the network interface goes
      down...).
      
      This is a normal situation and will be settled because the link
      endpoint will be eventually brought down after the link tolerance time.
      
      However, the situation will become worse when the second link is
      established before the first link endpoint goes down,
      For example:
      
         1. Both links <1A-2A>, <1B-2B> down
         2. Link endpoint 2A up, but 1A still down (e.g. due to network
            disturbance, wrong session, etc.)
         3. Link <1B-2B> up
         4. Link endpoint 2A down (e.g. due to link tolerance timeout)
         5. Node B starts failover onto link <1B-2B>
      
         ==> Node A does never start link failover.
      
      When the "half-failover" situation happens, two consequences have been
      observed:
      
      a) Peer link/node gets stuck in FAILINGOVER state;
      b) Traffic or user messages that peer node is trying to failover onto
      the second link can be partially or completely dropped by this node.
      
      The consequence a) was actually solved by commit c140eb16 ("tipc:
      fix failover problem"), but that commit didn't cover the b). It's due
      to the fact that the tunnel link endpoint has never been prepared for a
      failover, so the 'l->drop_point' (and the other data...) is not set
      correctly. When a TUNNEL_MSG from peer node arrives on the link,
      depending on the inner message's seqno and the current 'l->drop_point'
      value, the message can be dropped (- treated as a duplicate message) or
      processed.
      At this early stage, the traffic messages from peer are likely to be
      NAME_DISTRIBUTORs, this means some name table entries will be missed on
      the node forever!
      
      The commit resolves the issue by starting the FAILOVER process on this
      node as well. Another benefit from this solution is that we ensure the
      link will not be re-established until the failover ends.
      Acked-by: NJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: NTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      c0b14a08
    • G
      net: sched: cls_u32: use struct_size() helper · e512fcf0
      Gustavo A. R. Silva 提交于
      Make use of the struct_size() helper instead of an open-coded version
      in order to avoid any potential type mistakes, in particular in the
      context in which this code is being used.
      
      So, replace code of the following form:
      
      sizeof(*s) + s->nkeys*sizeof(struct tc_u32_key)
      
      with:
      
      struct_size(s, keys, s->nkeys)
      
      This code was detected with the help of Coccinelle.
      Signed-off-by: NGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      e512fcf0
    • C
      net: add a generic tracepoint for TX queue timeout · 141b6b2a
      Cong Wang 提交于
      Although devlink health report does a nice job on reporting TX
      timeout and other NIC errors, unfortunately it requires drivers
      to support it but currently only mlx5 has implemented it.
      Before other drivers could catch up, it is useful to have a
      generic tracepoint to monitor this kind of TX timeout. We have
      been suffering TX timeout with different drivers, we plan to
      start to monitor it with rasdaemon which just needs a new tracepoint.
      
      Sample output:
      
        ksoftirqd/1-16    [001] ..s2   144.043173: net_dev_xmit_timeout: dev=ens3 driver=e1000 queue=0
      
      Cc: Eran Ben Elisha <eranbe@mellanox.com>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NJiri Pirko <jiri@mellanox.com>
      Reviewed-by: NEran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      141b6b2a
  4. 02 5月, 2019 4 次提交
    • E
      udp: fix GRO packet of death · 4dd2b82d
      Eric Dumazet 提交于
      syzbot was able to crash host by sending UDP packets with a 0 payload.
      
      TCP does not have this issue since we do not aggregate packets without
      payload.
      
      Since dev_gro_receive() sets gso_size based on skb_gro_len(skb)
      it seems not worth trying to cope with padded packets.
      
      BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
      Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889
      
      CPU: 0 PID: 7889 Comm: syz-executor612 Not tainted 5.1.0-rc7+ #96
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       __asan_report_load16_noabort+0x14/0x20 mm/kasan/generic_report.c:133
       skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
       udp_gro_receive_segment net/ipv4/udp_offload.c:382 [inline]
       call_gro_receive include/linux/netdevice.h:2349 [inline]
       udp_gro_receive+0xb61/0xfd0 net/ipv4/udp_offload.c:414
       udp4_gro_receive+0x763/0xeb0 net/ipv4/udp_offload.c:478
       inet_gro_receive+0xe72/0x1110 net/ipv4/af_inet.c:1510
       dev_gro_receive+0x1cd0/0x23c0 net/core/dev.c:5581
       napi_gro_frags+0x36b/0xd10 net/core/dev.c:5843
       tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027
       call_write_iter include/linux/fs.h:1866 [inline]
       do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681
       do_iter_write fs/read_write.c:957 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:938
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002
       do_writev+0x15e/0x370 fs/read_write.c:1037
       __do_sys_writev fs/read_write.c:1110 [inline]
       __se_sys_writev fs/read_write.c:1107 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441cc0
      Code: 05 48 3d 01 f0 ff ff 0f 83 9d 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d 51 93 29 00 00 75 14 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 74 09 fc ff c3 48 83 ec 08 e8 ba 2b 00 00
      RSP: 002b:00007ffe8c716118 EFLAGS: 00000246 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 00007ffe8c716150 RCX: 0000000000441cc0
      RDX: 0000000000000001 RSI: 00007ffe8c716170 RDI: 00000000000000f0
      RBP: 0000000000000000 R08: 000000000000ffff R09: 0000000000a64668
      R10: 0000000020000040 R11: 0000000000000246 R12: 000000000000c2d9
      R13: 0000000000402b50 R14: 0000000000000000 R15: 0000000000000000
      
      Allocated by task 5143:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_kmalloc mm/kasan/common.c:497 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
       kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
       slab_post_alloc_hook mm/slab.h:437 [inline]
       slab_alloc mm/slab.c:3393 [inline]
       kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
       mm_alloc+0x1d/0xd0 kernel/fork.c:1030
       bprm_mm_init fs/exec.c:363 [inline]
       __do_execve_file.isra.0+0xaa3/0x23f0 fs/exec.c:1791
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 5351:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
       __cache_free mm/slab.c:3499 [inline]
       kmem_cache_free+0x86/0x260 mm/slab.c:3765
       __mmdrop+0x238/0x320 kernel/fork.c:677
       mmdrop include/linux/sched/mm.h:49 [inline]
       finish_task_switch+0x47b/0x780 kernel/sched/core.c:2746
       context_switch kernel/sched/core.c:2880 [inline]
       __schedule+0x81b/0x1cc0 kernel/sched/core.c:3518
       preempt_schedule_irq+0xb5/0x140 kernel/sched/core.c:3745
       retint_kernel+0x1b/0x2d
       arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline]
       kmem_cache_free+0xab/0x260 mm/slab.c:3766
       anon_vma_chain_free mm/rmap.c:134 [inline]
       unlink_anon_vmas+0x2ba/0x870 mm/rmap.c:401
       free_pgtables+0x1af/0x2f0 mm/memory.c:394
       exit_mmap+0x2d1/0x530 mm/mmap.c:3144
       __mmput kernel/fork.c:1046 [inline]
       mmput+0x15f/0x4c0 kernel/fork.c:1067
       exec_mmap fs/exec.c:1046 [inline]
       flush_old_exec+0x8d9/0x1c20 fs/exec.c:1279
       load_elf_binary+0x9bc/0x53f0 fs/binfmt_elf.c:864
       search_binary_handler fs/exec.c:1656 [inline]
       search_binary_handler+0x17f/0x570 fs/exec.c:1634
       exec_binprm fs/exec.c:1698 [inline]
       __do_execve_file.isra.0+0x1394/0x23f0 fs/exec.c:1818
       do_execveat_common fs/exec.c:1865 [inline]
       do_execve fs/exec.c:1882 [inline]
       __do_sys_execve fs/exec.c:1958 [inline]
       __se_sys_execve fs/exec.c:1953 [inline]
       __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff88808893f7c0
       which belongs to the cache mm_struct of size 1496
      The buggy address is located 600 bytes to the right of
       1496-byte region [ffff88808893f7c0, ffff88808893fd98)
      The buggy address belongs to the page:
      page:ffffea0002224f80 count:1 mapcount:0 mapping:ffff88821bc40ac0 index:0xffff88808893f7c0 compound_mapcount: 0
      flags: 0x1fffc0000010200(slab|head)
      raw: 01fffc0000010200 ffffea00025b4f08 ffffea00027b9d08 ffff88821bc40ac0
      raw: ffff88808893f7c0 ffff88808893e440 0000000100000001 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                                   ^
       ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4dd2b82d
    • M
      ipv6: A few fixes on dereferencing rt->from · 886b7a50
      Martin KaFai Lau 提交于
      It is a followup after the fix in
      commit 9c69a132 ("route: Avoid crash from dereferencing NULL rt->from")
      
      rt6_do_redirect():
      1. NULL checking is needed on rt->from because a parallel
         fib6_info delete could happen that sets rt->from to NULL.
         (e.g. rt6_remove_exception() and fib6_drop_pcpu_from()).
      
      2. fib6_info_hold() is not enough.  Same reason as (1).
         Meaning, holding dst->__refcnt cannot ensure
         rt->from is not NULL or rt->from->fib6_ref is not 0.
      
         Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc()
         is already doing, this patch chooses to extend the rcu section
         to keep "from" dereference-able after checking for NULL.
      
      inet6_rtm_getroute():
      1. NULL checking is also needed on rt->from for a similar reason.
         Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED.
      
      Fixes: a68886a6 ("net/ipv6: Make from in rt6_info rcu protected")
      Signed-off-by: NMartin KaFai Lau <kafai@fb.com>
      Acked-by: NWei Wang <weiwan@google.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Reviewed-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      886b7a50
    • N
      rds: ib: force endiannes annotation · f3505745
      Nicholas Mc Guire 提交于
      While the endiannes is being handled correctly as indicated by the comment
      above the offending line - sparse was unhappy with the missing annotation
      as be64_to_cpu() expects a __be64 argument. To mitigate this annotation
      all involved variables are changed to a consistent __le64 and the
       conversion to uint64_t delayed to the call to rds_cong_map_updated().
      Signed-off-by: NNicholas Mc Guire <hofrat@osadl.org>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f3505745
    • S
      ipv4: ip_do_fragment: Preserve skb_iif during fragmentation · d2f0c961
      Shmulik Ladkani 提交于
      Previously, during fragmentation after forwarding, skb->skb_iif isn't
      preserved, i.e. 'ip_copy_metadata' does not copy skb_iif from given
      'from' skb.
      
      As a result, ip_do_fragment's creates fragments with zero skb_iif,
      leading to inconsistent behavior.
      
      Assume for example an eBPF program attached at tc egress (post
      forwarding) that examines __sk_buff->ingress_ifindex:
       - the correct iif is observed if forwarding path does not involve
         fragmentation/refragmentation
       - a bogus iif is observed if forwarding path involves
         fragmentation/refragmentatiom
      
      Fix, by preserving skb_iif during 'ip_copy_metadata'.
      Signed-off-by: NShmulik Ladkani <shmulik.ladkani@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      d2f0c961
  5. 01 5月, 2019 8 次提交