1. 12 3月, 2020 3 次提交
  2. 11 3月, 2020 7 次提交
  3. 10 3月, 2020 1 次提交
    • D
      cgroup, netclassid: periodically release file_lock on classid updating · 018d26fc
      Dmitry Yakunin 提交于
      In our production environment we have faced with problem that updating
      classid in cgroup with heavy tasks cause long freeze of the file tables
      in this tasks. By heavy tasks we understand tasks with many threads and
      opened sockets (e.g. balancers). This freeze leads to an increase number
      of client timeouts.
      
      This patch implements following logic to fix this issue:
      аfter iterating 1000 file descriptors file table lock will be released
      thus providing a time gap for socket creation/deletion.
      
      Now update is non atomic and socket may be skipped using calls:
      
      dup2(oldfd, newfd);
      close(oldfd);
      
      But this case is not typical. Moreover before this patch skip is possible
      too by hiding socket fd in unix socket buffer.
      
      New sockets will be allocated with updated classid because cgroup state
      is updated before start of the file descriptors iteration.
      
      So in common cases this patch has no side effects.
      Signed-off-by: NDmitry Yakunin <zeil@yandex-team.ru>
      Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      018d26fc
  4. 09 3月, 2020 2 次提交
    • D
      inet_diag: return classid for all socket types · 83f73c5b
      Dmitry Yakunin 提交于
      In commit 1ec17dbd ("inet_diag: fix reporting cgroup classid and
      fallback to priority") croup classid reporting was fixed. But this works
      only for TCP sockets because for other socket types icsk parameter can
      be NULL and classid code path is skipped. This change moves classid
      handling to inet_diag_msg_attrs_fill() function.
      
      Also inet_diag_msg_attrs_size() helper was added and addends in
      nlmsg_new() were reordered to save order from inet_sk_diag_fill().
      
      Fixes: 1ec17dbd ("inet_diag: fix reporting cgroup classid and fallback to priority")
      Signed-off-by: NDmitry Yakunin <zeil@yandex-team.ru>
      Reviewed-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      83f73c5b
    • E
      gre: fix uninit-value in __iptunnel_pull_header · 17c25caf
      Eric Dumazet 提交于
      syzbot found an interesting case of the kernel reading
      an uninit-value [1]
      
      Problem is in the handling of ETH_P_WCCP in gre_parse_header()
      
      We look at the byte following GRE options to eventually decide
      if the options are four bytes longer.
      
      Use skb_header_pointer() to not pull bytes if we found
      that no more bytes were needed.
      
      All callers of gre_parse_header() are properly using pskb_may_pull()
      anyway before proceeding to next header.
      
      [1]
      BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline]
      BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
      CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       pskb_may_pull include/linux/skbuff.h:2303 [inline]
       __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94
       iptunnel_pull_header include/net/ip_tunnels.h:411 [inline]
       gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606
       ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432
       ip6_input_finish net/ipv6/ip6_input.c:473 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       ip6_input net/ipv6/ip6_input.c:482 [inline]
       ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576
       dst_input include/net/dst.h:442 [inline]
       ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline]
       NF_HOOK include/linux/netfilter.h:307 [inline]
       ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306
       __netif_receive_skb_one_core net/core/dev.c:5198 [inline]
       __netif_receive_skb net/core/dev.c:5312 [inline]
       netif_receive_skb_internal net/core/dev.c:5402 [inline]
       netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461
       tun_rx_batched include/linux/skbuff.h:4321 [inline]
       tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997
       tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write fs/read_write.c:483 [inline]
       __vfs_write+0xa5a/0xca0 fs/read_write.c:496
       vfs_write+0x44a/0x8f0 fs/read_write.c:558
       ksys_write+0x267/0x450 fs/read_write.c:611
       __do_sys_write fs/read_write.c:623 [inline]
       __se_sys_write fs/read_write.c:620 [inline]
       __ia32_sys_write+0xdb/0x120 fs/read_write.c:620
       do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
       do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
       entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
      RIP: 0023:0xf7f62d99
      Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
      RSP: 002b:00000000fffedb2c EFLAGS: 00000217 ORIG_RAX: 0000000000000004
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020002580
      RDX: 0000000000000fca RSI: 0000000000000036 RDI: 0000000000000004
      RBP: 0000000000008914 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2793 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401
       __kmalloc_reserve net/core/skbuff.c:142 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210
       alloc_skb include/linux/skbuff.h:1051 [inline]
       alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766
       sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242
       tun_alloc_skb drivers/net/tun.c:1529 [inline]
       tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843
       tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026
       call_write_iter include/linux/fs.h:1901 [inline]
       new_sync_write fs/read_write.c:483 [inline]
       __vfs_write+0xa5a/0xca0 fs/read_write.c:496
       vfs_write+0x44a/0x8f0 fs/read_write.c:558
       ksys_write+0x267/0x450 fs/read_write.c:611
       __do_sys_write fs/read_write.c:623 [inline]
       __se_sys_write fs/read_write.c:620 [inline]
       __ia32_sys_write+0xdb/0x120 fs/read_write.c:620
       do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline]
       do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410
       entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139
      
      Fixes: 95f5c64c ("gre: Move utility functions to common headers")
      Fixes: c5441932 ("GRE: Refactor GRE tunneling code.")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      17c25caf
  5. 07 3月, 2020 1 次提交
    • P
      netfilter: nft_chain_nat: inet family is missing module ownership · 6a42cefb
      Pablo Neira Ayuso 提交于
      Set owner to THIS_MODULE, otherwise the nft_chain_nat module might be
      removed while there are still inet/nat chains in place.
      
      [  117.942096] BUG: unable to handle page fault for address: ffffffffa0d5e040
      [  117.942101] #PF: supervisor read access in kernel mode
      [  117.942103] #PF: error_code(0x0000) - not-present page
      [  117.942106] PGD 200c067 P4D 200c067 PUD 200d063 PMD 3dc909067 PTE 0
      [  117.942113] Oops: 0000 [#1] PREEMPT SMP PTI
      [  117.942118] CPU: 3 PID: 27 Comm: kworker/3:0 Not tainted 5.6.0-rc3+ #348
      [  117.942133] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
      [  117.942145] RIP: 0010:nf_tables_chain_destroy.isra.0+0x94/0x15a [nf_tables]
      [  117.942149] Code: f6 45 54 01 0f 84 d1 00 00 00 80 3b 05 74 44 48 8b 75 e8 48 c7 c7 72 be de a0 e8 56 e6 2d e0 48 8b 45 e8 48 c7 c7 7f be de a0 <48> 8b 30 e8 43 e6 2d e0 48 8b 45 e8 48 8b 40 10 48 85 c0 74 5b 8b
      [  117.942152] RSP: 0018:ffffc9000015be10 EFLAGS: 00010292
      [  117.942155] RAX: ffffffffa0d5e040 RBX: ffff88840be87fc2 RCX: 0000000000000007
      [  117.942158] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffffffa0debe7f
      [  117.942160] RBP: ffff888403b54b50 R08: 0000000000001482 R09: 0000000000000004
      [  117.942162] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8883eda7e540
      [  117.942164] R13: dead000000000122 R14: dead000000000100 R15: ffff888403b3db80
      [  117.942167] FS:  0000000000000000(0000) GS:ffff88840e4c0000(0000) knlGS:0000000000000000
      [  117.942169] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  117.942172] CR2: ffffffffa0d5e040 CR3: 00000003e4c52002 CR4: 00000000001606e0
      [  117.942174] Call Trace:
      [  117.942188]  nf_tables_trans_destroy_work.cold+0xd/0x12 [nf_tables]
      [  117.942196]  process_one_work+0x1d6/0x3b0
      [  117.942200]  worker_thread+0x45/0x3c0
      [  117.942203]  ? process_one_work+0x3b0/0x3b0
      [  117.942210]  kthread+0x112/0x130
      [  117.942214]  ? kthread_create_worker_on_cpu+0x40/0x40
      [  117.942221]  ret_from_fork+0x35/0x40
      
      nf_tables_chain_destroy() crashes on module_put() because the module is
      gone.
      
      Fixes: d164385e ("netfilter: nat: add inet family nat support")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      6a42cefb
  6. 06 3月, 2020 2 次提交
  7. 05 3月, 2020 2 次提交
    • F
      netfilter: nf_tables: fix infinite loop when expr is not available · 1d305ba4
      Florian Westphal 提交于
      nft will loop forever if the kernel doesn't support an expression:
      
      1. nft_expr_type_get() appends the family specific name to the module list.
      2. -EAGAIN is returned to nfnetlink, nfnetlink calls abort path.
      3. abort path sets ->done to true and calls request_module for the
         expression.
      4. nfnetlink replays the batch, we end up in nft_expr_type_get() again.
      5. nft_expr_type_get attempts to append family-specific name. This
         one already exists on the list, so we continue
      6. nft_expr_type_get adds the generic expression name to the module
         list. -EAGAIN is returned, nfnetlink calls abort path.
      7. abort path encounters the family-specific expression which
         has 'done' set, so it gets removed.
      8. abort path requests the generic expression name, sets done to true.
      9. batch is replayed.
      
      If the expression could not be loaded, then we will end up back at 1),
      because the family-specific name got removed and the cycle starts again.
      
      Note that userspace can SIGKILL the nft process to stop the cycle, but
      the desired behaviour is to return an error after the generic expr name
      fails to load the expression.
      
      Fixes: eb014de4 ("netfilter: nf_tables: autoload modules from the abort path")
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      1d305ba4
    • P
      netfilter: nf_tables: dump NFTA_CHAIN_FLAGS attribute · d78008de
      Pablo Neira Ayuso 提交于
      Missing NFTA_CHAIN_FLAGS netlink attribute when dumping basechain
      definitions.
      
      Fixes: c9626a2c ("netfilter: nf_tables: add hardware offload support")
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      d78008de
  8. 04 3月, 2020 22 次提交