1. 17 1月, 2018 2 次提交
  2. 16 1月, 2018 11 次提交
  3. 11 1月, 2018 8 次提交
  4. 10 1月, 2018 7 次提交
    • S
      xfrm: Fix a race in the xdst pcpu cache. · 76a42011
      Steffen Klassert 提交于
      We need to run xfrm_resolve_and_create_bundle() with
      bottom halves off. Otherwise we may reuse an already
      released dst_enty when the xfrm lookup functions are
      called from process context.
      
      Fixes: c30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache")
      Reported-by: NDarius Ski <darius.ski@gmail.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      76a42011
    • S
      af_key: Fix memory leak in key_notify_policy. · 1e532d2b
      Steffen Klassert 提交于
      We leak the allocated out_skb in case
      pfkey_xfrm_policy2msg() fails. Fix this
      by freeing it on error.
      Reported-by: NDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      1e532d2b
    • A
      bpf: introduce BPF_JIT_ALWAYS_ON config · 290af866
      Alexei Starovoitov 提交于
      The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
      
      A quote from goolge project zero blog:
      "At this point, it would normally be necessary to locate gadgets in
      the host kernel code that can be used to actually leak data by reading
      from an attacker-controlled location, shifting and masking the result
      appropriately and then using the result of that as offset to an
      attacker-controlled address for a load. But piecing gadgets together
      and figuring out which ones work in a speculation context seems annoying.
      So instead, we decided to use the eBPF interpreter, which is built into
      the host kernel - while there is no legitimate way to invoke it from inside
      a VM, the presence of the code in the host kernel's text section is sufficient
      to make it usable for the attack, just like with ordinary ROP gadgets."
      
      To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
      option that removes interpreter from the kernel in favor of JIT-only mode.
      So far eBPF JIT is supported by:
      x64, arm64, arm32, sparc64, s390, powerpc64, mips64
      
      The start of JITed program is randomized and code page is marked as read-only.
      In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
      
      v2->v3:
      - move __bpf_prog_ret0 under ifdef (Daniel)
      
      v1->v2:
      - fix init order, test_bpf and cBPF (Daniel's feedback)
      - fix offloaded bpf (Jakub's feedback)
      - add 'return 0' dummy in case something can invoke prog->bpf_func
      - retarget bpf tree. For bpf-next the patch would need one extra hunk.
        It will be sent when the trees are merged back to net-next
      
      Considered doing:
        int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
      but it seems better to land the patch as-is and in bpf-next remove
      bpf_jit_enable global variable from all JITs, consolidate in one place
      and remove this jit_init() function.
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: NDaniel Borkmann <daniel@iogearbox.net>
      290af866
    • W
      ipv6: remove null_entry before adding default route · 4512c43e
      Wei Wang 提交于
      In the current code, when creating a new fib6 table, tb6_root.leaf gets
      initialized to net->ipv6.ip6_null_entry.
      If a default route is being added with rt->rt6i_metric = 0xffffffff,
      fib6_add() will add this route after net->ipv6.ip6_null_entry. As
      null_entry is shared, it could cause problem.
      
      In order to fix it, set fn->leaf to NULL before calling
      fib6_add_rt2node() when trying to add the first default route.
      And reset fn->leaf to null_entry when adding fails or when deleting the
      last default route.
      
      syzkaller reported the following issue which is fixed by this commit:
      
      WARNING: suspicious RCU usage
      4.15.0-rc5+ #171 Not tainted
      -----------------------------
      net/ipv6/ip6_fib.c:1702 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      4 locks held by swapper/0/0:
       #0:  ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
       #0:  ((&net->ipv6.ip6_fib_timer)){+.-.}, at: [<00000000d43f631b>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1310
       #1:  (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #1:  (&(&net->ipv6.fib6_gc_lock)->rlock){+.-.}, at: [<000000002ff9d65c>] fib6_run_gc+0x9d/0x3c0 net/ipv6/ip6_fib.c:2007
       #2:  (rcu_read_lock){....}, at: [<0000000091db762d>] __fib6_clean_all+0x0/0x3a0 net/ipv6/ip6_fib.c:1560
       #3:  (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #3:  (&(&tb->tb6_lock)->rlock){+.-.}, at: [<000000009e503581>] __fib6_clean_all+0x1d0/0x3a0 net/ipv6/ip6_fib.c:1948
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc5+ #171
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585
       fib6_del+0xcaa/0x11b0 net/ipv6/ip6_fib.c:1701
       fib6_clean_node+0x3aa/0x4f0 net/ipv6/ip6_fib.c:1892
       fib6_walk_continue+0x46c/0x8a0 net/ipv6/ip6_fib.c:1815
       fib6_walk+0x91/0xf0 net/ipv6/ip6_fib.c:1863
       fib6_clean_tree+0x1e6/0x340 net/ipv6/ip6_fib.c:1933
       __fib6_clean_all+0x1f4/0x3a0 net/ipv6/ip6_fib.c:1949
       fib6_clean_all net/ipv6/ip6_fib.c:1960 [inline]
       fib6_run_gc+0x16b/0x3c0 net/ipv6/ip6_fib.c:2016
       fib6_gc_timer_cb+0x20/0x30 net/ipv6/ip6_fib.c:2033
       call_timer_fn+0x228/0x820 kernel/time/timer.c:1320
       expire_timers kernel/time/timer.c:1357 [inline]
       __run_timers+0x7ee/0xb70 kernel/time/timer.c:1660
       run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
       __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1cc/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:540 [inline]
       smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:904
       </IRQ>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Fixes: 66f5d6ce ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
      Signed-off-by: NWei Wang <weiwan@google.com>
      Acked-by: NMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      4512c43e
    • N
      net: ipv4: emulate READ_ONCE() on ->hdrincl bit-field in raw_sendmsg() · 20b50d79
      Nicolai Stange 提交于
      Commit 8f659a03 ("net: ipv4: fix for a race condition in
      raw_sendmsg") fixed the issue of possibly inconsistent ->hdrincl handling
      due to concurrent updates by reading this bit-field member into a local
      variable and using the thus stabilized value in subsequent tests.
      
      However, aforementioned commit also adds the (correct) comment that
      
        /* hdrincl should be READ_ONCE(inet->hdrincl)
         * but READ_ONCE() doesn't work with bit fields
         */
      
      because as it stands, the compiler is free to shortcut or even eliminate
      the local variable at its will.
      
      Note that I have not seen anything like this happening in reality and thus,
      the concern is a theoretical one.
      
      However, in order to be on the safe side, emulate a READ_ONCE() on the
      bit-field by doing it on the local 'hdrincl' variable itself:
      
      	int hdrincl = inet->hdrincl;
      	hdrincl = READ_ONCE(hdrincl);
      
      This breaks the chain in the sense that the compiler is not allowed
      to replace subsequent reads from hdrincl with reloads from inet->hdrincl.
      
      Fixes: 8f659a03 ("net: ipv4: fix for a race condition in raw_sendmsg")
      Signed-off-by: NNicolai Stange <nstange@suse.de>
      Reviewed-by: NStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      20b50d79
    • X
      net: caif: use strlcpy() instead of strncpy() · 3dc2fa47
      Xiongfeng Wang 提交于
      gcc-8 reports
      
      net/caif/caif_dev.c: In function 'caif_enroll_dev':
      ./include/linux/string.h:245:9: warning: '__builtin_strncpy' output may
      be truncated copying 15 bytes from a string of length 15
      [-Wstringop-truncation]
      
      net/caif/cfctrl.c: In function 'cfctrl_linkup_request':
      ./include/linux/string.h:245:9: warning: '__builtin_strncpy' output may
      be truncated copying 15 bytes from a string of length 15
      [-Wstringop-truncation]
      
      net/caif/cfcnfg.c: In function 'caif_connect_client':
      ./include/linux/string.h:245:9: warning: '__builtin_strncpy' output may
      be truncated copying 15 bytes from a string of length 15
      [-Wstringop-truncation]
      
      The compiler require that the input param 'len' of strncpy() should be
      greater than the length of the src string, so that '\0' is copied as
      well. We can just use strlcpy() to avoid this warning.
      Signed-off-by: NXiongfeng Wang <xiongfeng.wang@linaro.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      3dc2fa47
    • A
      net: core: fix module type in sock_diag_bind · b8fd0823
      Andrii Vladyka 提交于
      Use AF_INET6 instead of AF_INET in IPv6-related code path
      Signed-off-by: NAndrii Vladyka <tulup@mail.ru>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b8fd0823
  5. 09 1月, 2018 3 次提交
    • S
      esp: Fix GRO when the headers not fully in the linear part of the skb. · 374d1b5a
      Steffen Klassert 提交于
      The GRO layer does not necessarily pull the complete headers
      into the linear part of the skb, a part may remain on the
      first page fragment. This can lead to a crash if we try to
      pull the headers, so make sure we have them on the linear
      part before pulling.
      
      Fixes: 7785bba2 ("esp: Add a software GRO codepath")
      Reported-by: syzbot+82bbd65569c49c6c0c4d@syzkaller.appspotmail.com
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      374d1b5a
    • M
      sctp: fix the handling of ICMP Frag Needed for too small MTUs · b6c5734d
      Marcelo Ricardo Leitner 提交于
      syzbot reported a hang involving SCTP, on which it kept flooding dmesg
      with the message:
      [  246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
      low, using default minimum of 512
      
      That happened because whenever SCTP hits an ICMP Frag Needed, it tries
      to adjust to the new MTU and triggers an immediate retransmission. But
      it didn't consider the fact that MTUs smaller than the SCTP minimum MTU
      allowed (512) would not cause the PMTU to change, and issued the
      retransmission anyway (thus leading to another ICMP Frag Needed, and so
      on).
      
      As IPv4 (ip_rt_min_pmtu=556) and IPv6 (IPV6_MIN_MTU=1280) minimum MTU
      are higher than that, sctp_transport_update_pmtu() is changed to
      re-fetch the PMTU that got set after our request, and with that, detect
      if there was an actual change or not.
      
      The fix, thus, skips the immediate retransmission if the received ICMP
      resulted in no change, in the hope that SCTP will select another path.
      
      Note: The value being used for the minimum MTU (512,
      SCTP_DEFAULT_MINSEGMENT) is not right and instead it should be (576,
      SCTP_MIN_PMTU), but such change belongs to another patch.
      
      Changes from v1:
      - do not disable PMTU discovery, in the light of commit
      06ad3919 ("[SCTP] Don't disable PMTU discovery when mtu is small")
      and as suggested by Xin Long.
      - changed the way to break the rtx loop by detecting if the icmp
        resulted in a change or not
      Changes from v2:
      none
      
      See-also: https://lkml.org/lkml/2017/12/22/811Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      b6c5734d
    • M
      sctp: do not retransmit upon FragNeeded if PMTU discovery is disabled · cc35c3d1
      Marcelo Ricardo Leitner 提交于
      Currently, if PMTU discovery is disabled on a given transport, but the
      configured value is higher than the actual PMTU, it is likely that we
      will get some icmp Frag Needed. The issue is, if PMTU discovery is
      disabled, we won't update the information and will issue a
      retransmission immediately, which may very well trigger another ICMP,
      and another retransmission, leading to a loop.
      
      The fix is to simply not trigger immediate retransmissions if PMTU
      discovery is disabled on the given transport.
      
      Changes from v2:
      - updated stale comment, noticed by Xin Long
      Signed-off-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      cc35c3d1
  6. 08 1月, 2018 2 次提交
  7. 06 1月, 2018 1 次提交
  8. 05 1月, 2018 4 次提交
    • H
      xfrm: Use __skb_queue_tail in xfrm_trans_queue · d16b46e4
      Herbert Xu 提交于
      We do not need locking in xfrm_trans_queue because it is designed
      to use per-CPU buffers.  However, the original code incorrectly
      used skb_queue_tail which takes the lock.  This patch switches
      it to __skb_queue_tail instead.
      Reported-and-tested-by: NArtem Savkov <asavkov@redhat.com>
      Fixes: acf568ee ("xfrm: Reinject transport-mode packets...")
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NSteffen Klassert <steffen.klassert@secunet.com>
      d16b46e4
    • W
      ipv6: fix general protection fault in fib6_add() · 7bbfe00e
      Wei Wang 提交于
      In fib6_add(), pn could be NULL if fib6_add_1() failed to return a fib6
      node. Checking pn != fn before accessing pn->leaf makes sure pn is not
      NULL.
      This fixes the following GPF reported by syzkaller:
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 3201 Comm: syzkaller001778 Not tainted 4.15.0-rc5+ #151
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:fib6_add+0x736/0x15a0 net/ipv6/ip6_fib.c:1244
      RSP: 0018:ffff8801c7626a70 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 0000000000000020 RCX: ffffffff84794465
      RDX: 0000000000000004 RSI: ffff8801d38935f0 RDI: 0000000000000282
      RBP: ffff8801c7626da0 R08: 1ffff10038ec4c35 R09: 0000000000000000
      R10: ffff8801c7626c68 R11: 0000000000000000 R12: 00000000fffffffe
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000009
      FS:  0000000000000000(0000) GS:ffff8801db200000(0063) knlGS:0000000009b70840
      CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      CR2: 0000000020be1000 CR3: 00000001d585a006 CR4: 00000000001606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       __ip6_ins_rt+0x6c/0x90 net/ipv6/route.c:1006
       ip6_route_multipath_add+0xd14/0x16c0 net/ipv6/route.c:3833
       inet6_rtm_newroute+0xdc/0x160 net/ipv6/route.c:3957
       rtnetlink_rcv_msg+0x733/0x1020 net/core/rtnetlink.c:4411
       netlink_rcv_skb+0x21e/0x460 net/netlink/af_netlink.c:2408
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4423
       netlink_unicast_kernel net/netlink/af_netlink.c:1275 [inline]
       netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1301
       netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1864
       sock_sendmsg_nosec net/socket.c:636 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:646
       sock_write_iter+0x31a/0x5d0 net/socket.c:915
       call_write_iter include/linux/fs.h:1772 [inline]
       do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653
       do_iter_write+0x154/0x540 fs/read_write.c:932
       compat_writev+0x225/0x420 fs/read_write.c:1246
       do_compat_writev+0x115/0x220 fs/read_write.c:1267
       C_SYSC_writev fs/read_write.c:1278 [inline]
       compat_SyS_writev+0x26/0x30 fs/read_write.c:1274
       do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
       do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
       entry_SYSENTER_compat+0x54/0x63 arch/x86/entry/entry_64_compat.S:125
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Fixes: 66f5d6ce ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
      Signed-off-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7bbfe00e
    • M
      RDS: null pointer dereference in rds_atomic_free_op · 7d11f77f
      Mohamed Ghannam 提交于
      set rm->atomic.op_active to 0 when rds_pin_pages() fails
      or the user supplied address is invalid,
      this prevents a NULL pointer usage in rds_atomic_free_op()
      Signed-off-by: NMohamed Ghannam <simo.ghannam@gmail.com>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      7d11f77f
    • A
      rtnetlink: give a user socket to get_target_net() · f428fe4a
      Andrei Vagin 提交于
      This function is used from two places: rtnl_dump_ifinfo and
      rtnl_getlink. In rtnl_getlink(), we give a request skb into
      get_target_net(), but in rtnl_dump_ifinfo, we give a response skb
      into get_target_net().
      The problem here is that NETLINK_CB() isn't initialized for the response
      skb. In both cases we can get a user socket and give it instead of skb
      into get_target_net().
      
      This bug was found by syzkaller with this call-trace:
      
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN
      Modules linked in:
      CPU: 1 PID: 3149 Comm: syzkaller140561 Not tainted 4.15.0-rc4-mm1+ #47
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:__netlink_ns_capable+0x8b/0x120 net/netlink/af_netlink.c:868
      RSP: 0018:ffff8801c880f348 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff8443f900
      RDX: 000000000000007b RSI: ffffffff86510f40 RDI: 00000000000003d8
      RBP: ffff8801c880f360 R08: 0000000000000000 R09: 1ffff10039101e4f
      R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff86510f40
      R13: 000000000000000c R14: 0000000000000004 R15: 0000000000000011
      FS:  0000000001a1a880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020151000 CR3: 00000001c9511005 CR4: 00000000001606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        netlink_ns_capable+0x26/0x30 net/netlink/af_netlink.c:886
        get_target_net+0x9d/0x120 net/core/rtnetlink.c:1765
        rtnl_dump_ifinfo+0x2e5/0xee0 net/core/rtnetlink.c:1806
        netlink_dump+0x48c/0xce0 net/netlink/af_netlink.c:2222
        __netlink_dump_start+0x4f0/0x6d0 net/netlink/af_netlink.c:2319
        netlink_dump_start include/linux/netlink.h:214 [inline]
        rtnetlink_rcv_msg+0x7f0/0xb10 net/core/rtnetlink.c:4485
        netlink_rcv_skb+0x21e/0x460 net/netlink/af_netlink.c:2441
        rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4540
        netlink_unicast_kernel net/netlink/af_netlink.c:1308 [inline]
        netlink_unicast+0x4be/0x6a0 net/netlink/af_netlink.c:1334
        netlink_sendmsg+0xa4a/0xe60 net/netlink/af_netlink.c:1897
      
      Cc: Jiri Benc <jbenc@redhat.com>
      Fixes: 79e1ad14 ("rtnetlink: use netnsid to query interface")
      Signed-off-by: NAndrei Vagin <avagin@openvz.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      f428fe4a
  9. 04 1月, 2018 2 次提交