1. 23 Jan, 2020 10 commits
    • M
      airo: Add missing CAP_NET_ADMIN check in AIROOLDIOCTL/SIOCDEVPRIVATE · 78f7a756
      Michael Ellerman committed
      The driver for Cisco Aironet 4500 and 4800 series cards (airo.c),
      implements AIROOLDIOCTL/SIOCDEVPRIVATE in airo_ioctl().
      
      The ioctl handler copies an aironet_ioctl struct from userspace, which
      includes a command. Some of the commands are handled in readrids(),
      where the user controlled command is converted into a driver-internal
      value called "ridcode".
      
      There are two command values, AIROGWEPKTMP and AIROGWEPKNV, which
      correspond to ridcode values of RID_WEP_TEMP and RID_WEP_PERM
      respectively. These commands both have checks that the user has
      CAP_NET_ADMIN, with the comment that "Only super-user can read WEP
      keys", otherwise they return -EPERM.
      
      However there is another command value, AIRORRID, that lets the user
      specify the ridcode value directly, with no other checks. This means
      the user can bypass the CAP_NET_ADMIN check on AIROGWEPKTMP and
      AIROGWEPKNV.
      
      Fix it by moving the CAP_NET_ADMIN check out of the command handling
      and instead do it later based on the ridcode. That way regardless of
      whether the ridcode is set via AIROGWEPKTMP or AIROGWEPKNV, or passed
      in using AIRORRID, we always do the CAP_NET_ADMIN check.
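A minimal user-space sketch of the approach described above, not the driver's actual code: the privilege check is keyed on the resolved ridcode instead of on the command, so a ridcode passed in directly cannot skip it. The numeric constants and the helpers `resolve_ridcode()` and `rid_read_allowed()` are hypothetical stand-ins; the real driver uses `capable(CAP_NET_ADMIN)` and its own definitions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical values for illustration; the real constants live in airo.c. */
enum { AIROGWEPKTMP = 1, AIROGWEPKNV = 2, AIRORRID = 3 };
enum { RID_WEP_TEMP = 0xFF15, RID_WEP_PERM = 0xFF16 };

/*
 * Map a user-supplied command to a ridcode. AIRORRID passes the
 * user-supplied ridcode straight through. Returns -1 for unknown commands.
 */
static int resolve_ridcode(int command, int user_ridcode)
{
	switch (command) {
	case AIROGWEPKTMP: return RID_WEP_TEMP;
	case AIROGWEPKNV:  return RID_WEP_PERM;
	case AIRORRID:     return user_ridcode;
	default:           return -1;
	}
}

/*
 * Permission check applied after the ridcode is known, regardless of which
 * command produced it. Returns 0 if allowed, -EPERM otherwise.
 * has_net_admin stands in for capable(CAP_NET_ADMIN).
 */
static int rid_read_allowed(int ridcode, bool has_net_admin)
{
	if ((ridcode == RID_WEP_TEMP || ridcode == RID_WEP_PERM) && !has_net_admin)
		return -EPERM;
	return 0;
}
```

With this shape, an unprivileged caller that selects the WEP ridcode through AIRORRID is rejected the same way as one using AIROGWEPKTMP or AIROGWEPKNV.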
      
      Found by Ilja by code inspection, not tested as I don't have the
      required hardware.
      Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78f7a756
    • M
      airo: Fix possible info leak in AIROOLDIOCTL/SIOCDEVPRIVATE · d6bce213
      Michael Ellerman committed
      The driver for Cisco Aironet 4500 and 4800 series cards (airo.c),
      implements AIROOLDIOCTL/SIOCDEVPRIVATE in airo_ioctl().
      
      The ioctl handler copies an aironet_ioctl struct from userspace, which
      includes a command and a length. Some of the commands are handled in
      readrids(), which kmalloc()'s a buffer of RIDSIZE (2048) bytes.
      
      That buffer is then passed to PC4500_readrid(), which has two cases.
      The else case does some setup and then reads up to RIDSIZE bytes from
      the hardware into the kmalloc()'ed buffer.
      
      Here len == RIDSIZE, pBuf is the kmalloc()'ed buffer:
      
      	// read the rid length field
      	bap_read(ai, pBuf, 2, BAP1);
      	// length for remaining part of rid
      	len = min(len, (int)le16_to_cpu(*(__le16*)pBuf)) - 2;
      	...
      	// read remainder of the rid
      	rc = bap_read(ai, ((__le16*)pBuf)+1, len, BAP1);
      
      PC4500_readrid() then returns to readrids() which does:
      
      	len = comp->len;
      	if (copy_to_user(comp->data, iobuf, min(len, (int)RIDSIZE))) {
      
      Where comp->len is the user controlled length field.
      
      So if the "rid length field" returned by the hardware is < 2048, and
      the user requests 2048 bytes in comp->len, we will leak the previous
      contents of the kmalloc()'ed buffer to userspace.
      
      Fix it by kzalloc()'ing the buffer.
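The effect of the fix can be modelled in user space. This is only a sketch under stated assumptions: `calloc()` stands in for `kzalloc()`, `memcpy()` for `copy_to_user()`, and `fake_readrid()` is a hypothetical stand-in for the hardware read that fills only the first `hw_len` bytes.

```c
#include <stdlib.h>
#include <string.h>

#define RIDSIZE 2048

/* Hypothetical stand-in for the hardware read: fills only hw_len bytes. */
static void fake_readrid(unsigned char *buf, size_t hw_len)
{
	memset(buf, 0xAB, hw_len < RIDSIZE ? hw_len : RIDSIZE);
}

/*
 * Mimic the readrids() flow with a zeroed buffer: allocate, read the rid,
 * then copy min(user_len, RIDSIZE) bytes to the caller. Bytes beyond what
 * the "hardware" wrote are zero instead of stale heap contents.
 * Returns the number of bytes copied, or 0 on allocation failure.
 */
static size_t read_rid_zeroed(unsigned char *out, size_t user_len, size_t hw_len)
{
	unsigned char *iobuf = calloc(1, RIDSIZE);
	size_t n = user_len < RIDSIZE ? user_len : RIDSIZE;

	if (!iobuf)
		return 0;
	fake_readrid(iobuf, hw_len);
	memcpy(out, iobuf, n);
	free(iobuf);
	return n;
}
```

If the caller asks for all 2048 bytes while the device returns a short rid, the tail of the copied data is guaranteed to be zero.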
      
      Found by Ilja by code inspection, not tested as I don't have the
      required hardware.
      Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d6bce213
    • A
      MAINTAINERS: Make Russell King designated reviewer of phylib · 3adb4eaa
      Andrew Lunn committed
      phylink and phylib are interconnected. It makes sense for phylib and
      phy driver patches to be also reviewed by the phylink maintainer.
      So add Russell King as a designated reviewer of phylib.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3adb4eaa
    • W
      net, ip6_tunnel: fix namespaces move · 5311a69a
      William Dauchy committed
      In the same manner as commit d0f41851 ("net, ip_tunnel: fix
      namespaces move"), fix namespace moving as it was broken since commit
      8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnel"), but for
      ipv6 this time; there is no reason to keep it for ip6_tunnel.
      
      Fixes: 8d79266b ("ip6_tunnel: add collect_md mode to IPv6 tunnel")
      Signed-off-by: William Dauchy <w.dauchy@criteo.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5311a69a
    • E
      net_sched: use validated TCA_KIND attribute in tc_new_tfilter() · 36d79af7
      Eric Dumazet committed
      syzbot found another issue in tc_new_tfilter().
      We probably should use @name which contains the sanitized
      version of TCA_KIND.
      
      BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:608 [inline]
      BUG: KMSAN: uninit-value in string+0x522/0x690 lib/vsprintf.c:689
      CPU: 1 PID: 10753 Comm: syz-executor.1 Not tainted 5.5.0-rc5-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       string_nocheck lib/vsprintf.c:608 [inline]
       string+0x522/0x690 lib/vsprintf.c:689
       vsnprintf+0x207d/0x31b0 lib/vsprintf.c:2574
       __request_module+0x2ad/0x11c0 kernel/kmod.c:143
       tcf_proto_lookup_ops+0x241/0x720 net/sched/cls_api.c:139
       tcf_proto_create net/sched/cls_api.c:262 [inline]
       tc_new_tfilter+0x2a4e/0x5010 net/sched/cls_api.c:2058
       rtnetlink_rcv_msg+0xcb7/0x1570 net/core/rtnetlink.c:5415
       netlink_rcv_skb+0x451/0x650 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5442
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0xf9e/0x1100 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x1248/0x14d0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45b349
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f88b3948c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f88b39496d4 RCX: 000000000045b349
      RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003
      RBP: 000000000075bfc8 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 000000000000099f R14: 00000000004cb163 R15: 000000000075bfd4
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2774 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4382
       __kmalloc_reserve net/core/skbuff.c:141 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:209
       alloc_skb include/linux/skbuff.h:1049 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1174 [inline]
       netlink_sendmsg+0x7d3/0x14d0 net/netlink/af_netlink.c:1892
       sock_sendmsg_nosec net/socket.c:639 [inline]
       sock_sendmsg net/socket.c:659 [inline]
       ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2330
       ___sys_sendmsg net/socket.c:2384 [inline]
       __sys_sendmsg+0x451/0x5f0 net/socket.c:2417
       __do_sys_sendmsg net/socket.c:2426 [inline]
       __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424
       __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 6f96c3c6 ("net_sched: fix backward compatibility for TCA_KIND")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      36d79af7
    • P
      Revert "udp: do rmem bulk free even if the rx sk queue is empty" · d39ca259
      Paolo Abeni committed
      This reverts commit 0d4a6608.
      
      Willem reported that after commit 0d4a6608 ("udp: do rmem bulk
      free even if the rx sk queue is empty") the memory allocated by
      an almost idle system with many UDP sockets can grow a lot.
      
      For stable kernels, keep the solution as simple as possible and revert
      the offending commit.
      Reported-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
      Diagnosed-by: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 0d4a6608 ("udp: do rmem bulk free even if the rx sk queue is empty")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d39ca259
    • M
      net: Fix packet reordering caused by GRO and listified RX cooperation · c8079432
      Maxim Mikityanskiy committed
      Commit 323ebb61 ("net: use listified RX for handling GRO_NORMAL
      skbs") introduces batching of GRO_NORMAL packets in napi_frags_finish,
      and commit 6570bc79 ("net: core: use listified Rx for GRO_NORMAL in
      napi_gro_receive()") adds the same to napi_skb_finish. However,
      dev_gro_receive (that is called just before napi_{frags,skb}_finish) can
      also pass skbs to the networking stack: e.g., when the GRO session is
      flushed, napi_gro_complete is called, which passes pp directly to
      netif_receive_skb_internal, skipping napi->rx_list. It means that the
      packet stored in pp will be handled by the stack earlier than the
      packets that arrived before, but are still waiting in napi->rx_list. It
      leads to TCP reorderings that can be observed in the TCPOFOQueue counter
      in netstat.
      
      This commit fixes the reordering issue by making napi_gro_complete also
      use napi->rx_list, so that all packets going through GRO will keep their
      order. In order to keep napi_gro_flush working properly, gro_normal_list
      calls are moved after the flush to clear napi->rx_list.
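The ordering problem can be illustrated with a toy model. This is a sketch with hypothetical names (`struct napi_model`, `queue_normal()`, `gro_complete()`, `flush_rx_list()`), not the kernel's data structures: batched packets wait in a list, and a completed GRO packet either bypasses that list (old behaviour) or is appended to it (fixed behaviour).

```c
#include <stddef.h>

#define MAX_PKTS 8

/* Toy model of one NAPI context; all names here are hypothetical. */
struct napi_model {
	int rx_list[MAX_PKTS];   /* batched packets awaiting delivery */
	size_t rx_count;
	int delivered[MAX_PKTS]; /* order in which the stack saw packets */
	size_t delivered_count;
};

/* Hand one packet to the (modelled) network stack. */
static void deliver(struct napi_model *n, int id)
{
	if (n->delivered_count < MAX_PKTS)
		n->delivered[n->delivered_count++] = id;
}

/* A GRO_NORMAL packet is batched instead of being delivered at once. */
static void queue_normal(struct napi_model *n, int id)
{
	if (n->rx_count < MAX_PKTS)
		n->rx_list[n->rx_count++] = id;
}

/*
 * A GRO session completes. With use_rx_list == 0 the packet goes straight
 * to the stack and overtakes anything still batched; with use_rx_list != 0
 * it joins the batch and keeps its arrival order.
 */
static void gro_complete(struct napi_model *n, int id, int use_rx_list)
{
	if (use_rx_list)
		queue_normal(n, id);
	else
		deliver(n, id);
}

/* Deliver everything batched so far, in order, and empty the batch. */
static void flush_rx_list(struct napi_model *n)
{
	for (size_t i = 0; i < n->rx_count; i++)
		deliver(n, n->rx_list[i]);
	n->rx_count = 0;
}
```

In the model, packet 1 is batched first and packet 2 completes a GRO session afterwards. Without the list, the stack sees 2 before 1; with it, the order stays 1, 2.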
      
      iwlwifi calls napi_gro_flush directly and does the same thing that is
      done by gro_normal_list, so the same change is applied there:
      napi_gro_flush is moved to be before the flush of napi->rx_list.
      
      A few other drivers also use napi_gro_flush (brocade/bna/bnad.c,
      cortina/gemini.c, hisilicon/hns3/hns3_enet.c). The first two also use
      napi_complete_done afterwards, which performs the gro_normal_list flush,
      so they are fine. The latter calls napi_gro_receive right after
      napi_gro_flush, so it can end up with non-empty napi->rx_list anyway.
      
      Fixes: 323ebb61 ("net: use listified RX for handling GRO_NORMAL skbs")
      Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
      Cc: Alexander Lobakin <alobakin@dlink.ru>
      Cc: Edward Cree <ecree@solarflare.com>
      Acked-by: Alexander Lobakin <alobakin@dlink.ru>
      Acked-by: Saeed Mahameed <saeedm@mellanox.com>
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8079432
    • R
      can, slip: Protect tty->disc_data in write_wakeup and close with RCU · 0ace17d5
      Richard Palethorpe committed
      write_wakeup can happen in parallel with close/hangup where tty->disc_data
      is set to NULL and the netdevice is freed thus also freeing
      disc_data. write_wakeup accesses disc_data so we must prevent close from
      freeing the netdev while write_wakeup has a non-NULL view of
      tty->disc_data.
      
      We also need to make sure that accesses to disc_data are atomic. All of
      this can be done with RCU.
      
      This problem was found by Syzkaller on SLCAN, but the same issue is
      reproducible with the SLIP line discipline using an LTP test based on the
      Syzkaller reproducer.
      
      A fix which didn't use RCU was posted by Hillf Danton.
      
      Fixes: 661f7fda ("slip: Fix deadlock in write_wakeup")
      Fixes: a8e83b17 ("slcan: Port write_wakeup deadlock fix from slip")
      Reported-by: syzbot+017e491ae13c0068598a@syzkaller.appspotmail.com
      Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Tyler Hall <tylerwhall@gmail.com>
      Cc: linux-can@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Cc: syzkaller@googlegroups.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ace17d5
    • J
      net, sk_msg: Don't check if sock is locked when tearing down psock · 58c8db92
      Jakub Sitnicki committed
      As John Fastabend reports [0], psock state tear-down can happen on receive
      path *after* unlocking the socket, if the only other psock user, that is
      sockmap or sockhash, releases its psock reference before tcp_bpf_recvmsg
      does so:
      
       tcp_bpf_recvmsg()
        psock = sk_psock_get(sk)                         <- refcnt 2
        lock_sock(sk);
        ...
                                        sock_map_free()  <- refcnt 1
        release_sock(sk)
        sk_psock_put()                                   <- refcnt 0
      
      Remove the lockdep check for socket lock in psock tear-down that got
      introduced in 7e81a353 ("bpf: Sockmap, ensure sock lock held during
      tear down").
      
      [0] https://lore.kernel.org/netdev/5e25dc995d7d_74082aaee6e465b441@john-XPS-13-9370.notmuch/
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Reported-by: syzbot+d73682fcf7fee6982fe3@syzkaller.appspotmail.com
      Suggested-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Acked-by: John Fastabend <john.fastabend@gmail.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      58c8db92
  2. 21 Jan, 2020 7 commits
    • W
      net, ip_tunnel: fix namespaces move · d0f41851
      William Dauchy committed
      In the same manner as commit 690afc16 ("net: ip6_gre: fix moving
      ip6gre between namespaces"), fix namespace moving as it was broken since
      commit 2e15ea39 ("ip_gre: Add support to collect tunnel metadata.").
      Indeed, the ip6_gre commit removed the local flag for collect_md
      condition, so there is no reason to keep it for ip_gre/ip_tunnel.
      
      This patch fixes both the ip_tunnel and ip_gre modules.
      
      Fixes: 2e15ea39 ("ip_gre: Add support to collect tunnel metadata.")
      Signed-off-by: William Dauchy <w.dauchy@criteo.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d0f41851
    • T
      tcp: remove redundant assignment to snd_cwnd · bfe02b9f
      Theodore Dubois committed
      Not sure how this got in here. git blame says the second assignment was
      added in 3a9a57f6, but that commit also removed the first assignment.
      Signed-off-by: Theodore Dubois <tblodt@icloud.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfe02b9f
    • J
      net: usb: lan78xx: Add .ndo_features_check · ce896476
      James Hughes committed
      As reported by Eric Dumazet, there are still some outstanding
      cases where the driver does not handle TSO correctly when skb's
      are over a certain size. Most cases have been fixed, this patch
      should ensure that forwarded SKB's that are greater than
      MAX_SINGLE_PACKET_SIZE - TX_OVERHEAD are software segmented
      and handled correctly.
      Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce896476
    • W
      tcp_bbr: improve arithmetic division in bbr_update_bw() · 5b2f1f30
      Wen Yang committed
      do_div() does a 64-by-32 division. Use div64_long() instead of it
      if the divisor is long, to avoid truncation to 32-bit.
      And as a nice side effect also cleans up the function a bit.
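The truncation can be reproduced in plain C. This is a user-space sketch, not the kernel helpers themselves: `div_64_by_32()` and `div_64_by_64()` are hypothetical names that model a division whose divisor is cut to 32 bits versus one that keeps the full 64-bit divisor.

```c
#include <stdint.h>

/*
 * Models a 64-by-32 division: the divisor is truncated to its low 32 bits
 * before dividing. Returns 0 if the truncated divisor is zero, to keep the
 * sketch free of undefined behaviour.
 */
static uint64_t div_64_by_32(uint64_t dividend, int64_t divisor)
{
	uint32_t d = (uint32_t)divisor;

	if (d == 0)
		return 0;
	return dividend / d;
}

/*
 * Models a full-width signed division, which keeps every bit of the
 * divisor. The caller must pass a non-zero divisor.
 */
static int64_t div_64_by_64(int64_t dividend, int64_t divisor)
{
	return dividend / divisor;
}
```

For a divisor of 2^32 + 2 and a dividend of 2^40, the truncated version divides by 2 and returns 2^39, while the full-width version returns 255.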
      Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5b2f1f30
    • J
      net-sysfs: Fix reference count leak · cb626bf5
      Jouni Hogander committed
      netdev_register_kobject() calls device_initialize(). On error, the
      reference taken by device_initialize() is not given up.
      
      Drivers are supposed to call free_netdev() on error. In the non-error
      case the last reference is given up there and the device release
      sequence is triggered. In the error case this reference is kept and
      the release sequence is never started.
      
      Fix this by setting reg_state as NETREG_UNREGISTERED if registering
      fails.
      
      This is the root cause of a couple of memory leaks reported by Syzkaller:
      
      BUG: memory leak
      unreferenced object 0xffff8880675ca008 (size 256):
        comm "netdev_register", pid 281, jiffies 4294696663 (age 6.808s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
        backtrace:
          [<0000000058ca4711>] kmem_cache_alloc_trace+0x167/0x280
          [<000000002340019b>] device_add+0x882/0x1750
          [<000000001d588c3a>] netdev_register_kobject+0x128/0x380
          [<0000000011ef5535>] register_netdevice+0xa1b/0xf00
          [<000000007fcf1c99>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000006a5b7b2b>] tun_chr_ioctl+0x2f/0x40
          [<00000000f30f834a>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000fba062ea>] ksys_ioctl+0x99/0xb0
          [<00000000b1c1b8d2>] __x64_sys_ioctl+0x78/0xb0
          [<00000000984cabb9>] do_syscall_64+0x16f/0x580
          [<000000000bde033d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000e6ca2d9f>] 0xffffffffffffffff
      
      BUG: memory leak
      unreferenced object 0xffff8880668ba588 (size 8):
        comm "kobject_set_nam", pid 286, jiffies 4294725297 (age 9.871s)
        hex dump (first 8 bytes):
          6e 72 30 00 cc be df 2b                          nr0....+
        backtrace:
          [<00000000a322332a>] __kmalloc_track_caller+0x16e/0x290
          [<00000000236fd26b>] kstrdup+0x3e/0x70
          [<00000000dd4a2815>] kstrdup_const+0x3e/0x50
          [<0000000049a377fc>] kvasprintf_const+0x10e/0x160
          [<00000000627fc711>] kobject_set_name_vargs+0x5b/0x140
          [<0000000019eeab06>] dev_set_name+0xc0/0xf0
          [<0000000069cb12bc>] netdev_register_kobject+0xc8/0x320
          [<00000000f2e83732>] register_netdevice+0xa1b/0xf00
          [<000000009e1f57cc>] __tun_chr_ioctl+0x20d5/0x3dd0
          [<000000009c560784>] tun_chr_ioctl+0x2f/0x40
          [<000000000d759e02>] do_vfs_ioctl+0x1c7/0x1510
          [<00000000351d7c31>] ksys_ioctl+0x99/0xb0
          [<000000008390040a>] __x64_sys_ioctl+0x78/0xb0
          [<0000000052d196b7>] do_syscall_64+0x16f/0x580
          [<0000000019af9236>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
          [<00000000bc384531>] 0xffffffffffffffff
      
      v3 -> v4:
        Set reg_state to NETREG_UNREGISTERED if registering fails
      
      v2 -> v3:
      * Replaced BUG_ON with WARN_ON in free_netdev and netdev_release
      
      v1 -> v2:
      * Relying on driver calling free_netdev rather than calling
        put_device directly in error path
      
      Reported-by: syzbot+ad8ca40ecd77896d51e2@syzkaller.appspotmail.com
      Cc: David Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb626bf5
    • Y
      ipv6: sr: remove SKB_GSO_IPXIP6 on End.D* actions · 62ebaeae
      Yuki Taguchi committed
      After LRO/GRO is applied, SRv6 encapsulated packets have the
      SKB_GSO_IPXIP6 feature flag, and this flag must be removed right after
      the decapsulation procedure.
      
      Currently, the SKB_GSO_IPXIP6 flag is not removed on End.D* actions,
      which creates an inconsistent packet state: normal TCP/IP packets
      carry the SKB_GSO_IPXIP6 flag. This behavior can cause an unexpected
      fallback to GSO when routing to netdevices that do not support
      SKB_GSO_IPXIP6. For example, on inter-VRF forwarding, decapsulated
      packets are split into small packets by GSO because VRF devices do not
      support TSO for packets with the SKB_GSO_IPXIP6 flag, and this degrades
      forwarding performance.
      
      This patch removes encapsulation related GSO flags from the skb right
      after the End.D* action is applied.
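Clearing the flags amounts to a bitmask operation. The sketch below assumes hypothetical bit positions and a hypothetical helper name, `strip_encap_gso_flags()`; the real flags are defined in the kernel headers, and the exact set of flags the patch clears is not spelled out in the message above.

```c
#include <stdint.h>

/* Hypothetical bit values; the real flags are defined by the kernel. */
#define GSO_TCPV4   (1u << 0)
#define GSO_IPXIP4  (1u << 1)
#define GSO_IPXIP6  (1u << 2)

/*
 * Drop the tunnel-related GSO bits from a packet's GSO type after
 * decapsulation, leaving all other bits untouched. Which bits count as
 * "encapsulation related" is an assumption of this sketch.
 */
static uint32_t strip_encap_gso_flags(uint32_t gso_type)
{
	return gso_type & ~(uint32_t)(GSO_IPXIP4 | GSO_IPXIP6);
}
```

After the call, an inner TCP packet carries only its own GSO type, so a device without tunnel offload support no longer sees a reason to fall back to software segmentation.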
      
      Fixes: d7a669dd ("ipv6: sr: add helper functions for seg6local")
      Signed-off-by: Yuki Taguchi <tagyounit@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      62ebaeae
    • D
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 9c5ed2f8
      David S. Miller committed
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2020-01-21
      
      1) Fix packet tx through bpf_redirect() for xfrm and vti
         interfaces. From Nicolas Dichtel.
      
      2) Do not confirm neighbor when doing a pmtu update on a virtual
         xfrm interface. From Xu Wang.
      
      3) Support output_mark for offload ESP packets, this was
         forgotten when the output_mark was added initially.
         From Ulrich Weber.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9c5ed2f8
  3. 20 Jan, 2020 5 commits
    • X
      hsr: Fix a compilation error · 80892772
      xiaofeng.yan committed
      A compilation error happens when building branch 5.5-rc7:
      
      In file included from net/hsr/hsr_main.c:12:0:
      net/hsr/hsr_main.h:194:20: error: two or more data types in declaration specifiers
       static inline void void hsr_debugfs_rename(struct net_device *dev)
      
      So remove one void.
      
      Fixes: 4c2d5e33 ("hsr: rename debugfs file when interface name is changed")
      Signed-off-by: xiaofeng.yan <yanxiaofeng7@jd.com>
      Acked-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      80892772
    • N
      net: ip6_gre: fix moving ip6gre between namespaces · 690afc16
      Niko Kortstrom committed
      Support for moving IPv4 GRE tunnels between namespaces was added in
      commit b57708ad ("gre: add x-netns support"). The respective change
      for IPv6 tunnels, commit 22f08069 ("ip6gre: add x-netns support")
      did not drop NETIF_F_NETNS_LOCAL flag so moving them from one netns to
      another is still denied in IPv6 case. Drop NETIF_F_NETNS_LOCAL flag from
      ip6gre tunnels to allow moving ip6gre tunnel endpoints between network
      namespaces.
      Signed-off-by: Niko Kortstrom <niko.kortstrom@nokia.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      690afc16
    • L
      Merge tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 7008ee12
      Linus Torvalds committed
      Pull RISC-V fixes from Paul Walmsley:
       "Three fixes for RISC-V:
      
         - Don't free and reuse memory containing the code that CPUs parked at
           boot reside in.
      
         - Fix rv64 build problems for ubsan and some modules by adding
           logical and arithmetic shift helpers for 128-bit values. These are
           from libgcc and are similar to what's present for ARM64.
      
         - Fix vDSO builds to clean up their own temporary files"
      
      * tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Less inefficient gcc tishift helpers (and export their symbols)
        riscv: delete temporary files
        riscv: make sure the cores stay looping in .Lsecondary_park
      7008ee12
    • L
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 11a82729
      Linus Torvalds committed
      Pull networking fixes from David Miller:
      
       1) Fix non-blocking connect() in x25, from Martin Schiller.
      
       2) Fix spurious decryption errors in kTLS, from Jakub Kicinski.
      
       3) Netfilter use-after-free in mtype_destroy(), from Cong Wang.
      
       4) Limit size of TSO packets properly in lan78xx driver, from Eric
          Dumazet.
      
       5) r8152 probe needs an endpoint sanity check, from Johan Hovold.
      
       6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from
          John Fastabend.
      
       7) hns3 needs short frames padded on transmit, from Yunsheng Lin.
      
       8) Fix netfilter ICMP header corruption, from Eyal Birger.
      
       9) Fix soft lockup when low on memory in hns3, from Yonglong Liu.
      
      10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan.
      
      11) Fix memory leak in act_ctinfo, from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
        cxgb4: reject overlapped queues in TC-MQPRIO offload
        cxgb4: fix Tx multi channel port rate limit
        net: sched: act_ctinfo: fix memory leak
        bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.
        bnxt_en: Fix ipv6 RFS filter matching logic.
        bnxt_en: Fix NTUPLE firmware command failures.
        net: systemport: Fixed queue mapping in internal ring map
        net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec
        net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
        net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset
        net: hns: fix soft lockup when there is not enough memory
        net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()
        net/sched: act_ife: initalize ife->metalist earlier
        netfilter: nat: fix ICMP header corruption on ICMP errors
        net: wan: lapbether.c: Use built-in RCU list checking
        netfilter: nf_tables: fix flowtable list del corruption
        netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()
        netfilter: nf_tables: remove WARN and add NLA_STRING upper limits
        netfilter: nft_tunnel: ERSPAN_VERSION must not be null
        netfilter: nft_tunnel: fix null-attribute check
        ...
      11a82729
    • L
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 5f436443
      Linus Torvalds committed
      Pull i2c fixes from Wolfram Sang:
       "Two runtime PM fixes and one leak fix"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: iop3xx: Fix memory leak in probe error path
        i2c: tegra: Properly disable runtime PM on driver's probe error
        i2c: tegra: Fix suspending in active runtime PM state
      5f436443
  4. 19 Jan, 2020 18 commits
    • R
      cxgb4: reject overlapped queues in TC-MQPRIO offload · b2383ad9
      Rahul Lakkireddy committed
      A queue can't belong to multiple traffic classes. So, reject
      any such configuration that results in overlapped queues for a
      traffic class.
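A check of this kind can be sketched as a pairwise range-overlap test. The structure `struct tc_queues` and the function `tc_queues_overlap()` are hypothetical and only assume that each traffic class owns a contiguous range of queues described by an offset and a count; the driver's real validation may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one traffic class: queues [offset, offset + count). */
struct tc_queues {
	unsigned int offset;
	unsigned int count;
};

/*
 * Return true if any two traffic classes share at least one queue.
 * Classes with no queues are ignored.
 */
static bool tc_queues_overlap(const struct tc_queues *tc, size_t num_tc)
{
	for (size_t i = 0; i < num_tc; i++) {
		for (size_t j = i + 1; j < num_tc; j++) {
			unsigned int a_end = tc[i].offset + tc[i].count;
			unsigned int b_end = tc[j].offset + tc[j].count;

			if (!tc[i].count || !tc[j].count)
				continue;
			/* Two half-open ranges overlap when each starts before the other ends. */
			if (tc[i].offset < b_end && tc[j].offset < a_end)
				return true;
		}
	}
	return false;
}
```

A configuration with classes on queues 0-3 and 4-7 passes, while one with classes on queues 0-3 and 2-5 would be rejected.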
      
      Fixes: b1396c2b ("cxgb4: parse and configure TC-MQPRIO offload")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b2383ad9
    • R
      cxgb4: fix Tx multi channel port rate limit · c856e2b6
      Rahul Lakkireddy committed
      T6 can support 2 egress traffic management channels per port to
      double the total number of traffic classes that can be configured.
      In this configuration, if the class belongs to the other channel,
      then all the queues must be bound again explicitly to the new class,
      for the rate limit parameters on the other channel to take effect.
      
      So, always explicitly bind all queues to the port rate limit traffic
      class, regardless of the traffic management channel that it belongs
      to. Also, only bind queues to port rate limit traffic class, if all
      the queues don't already belong to an existing different traffic
      class.
      
      Fixes: 4ec4762d ("cxgb4: add TC-MATCHALL classifier egress offload")
      Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c856e2b6
    • E
      net: sched: act_ctinfo: fix memory leak · 09d4f10a
      Eric Dumazet committed
      Implement a cleanup method to properly free ci->params
      
      BUG: memory leak
      unreferenced object 0xffff88811746e2c0 (size 64):
        comm "syz-executor617", pid 7106, jiffies 4294943055 (age 14.250s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          c0 34 60 84 ff ff ff ff 00 00 00 00 00 00 00 00  .4`.............
        backtrace:
          [<0000000015aa236f>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
          [<0000000015aa236f>] slab_post_alloc_hook mm/slab.h:586 [inline]
          [<0000000015aa236f>] slab_alloc mm/slab.c:3320 [inline]
          [<0000000015aa236f>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549
          [<000000002c946bd1>] kmalloc include/linux/slab.h:556 [inline]
          [<000000002c946bd1>] kzalloc include/linux/slab.h:670 [inline]
          [<000000002c946bd1>] tcf_ctinfo_init+0x21a/0x530 net/sched/act_ctinfo.c:236
          [<0000000086952cca>] tcf_action_init_1+0x400/0x5b0 net/sched/act_api.c:944
          [<000000005ab29bf8>] tcf_action_init+0x135/0x1c0 net/sched/act_api.c:1000
          [<00000000392f56f9>] tcf_action_add+0x9a/0x200 net/sched/act_api.c:1410
          [<0000000088f3c5dd>] tc_ctl_action+0x14d/0x1bb net/sched/act_api.c:1465
          [<000000006b39d986>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424
          [<00000000fd6ecace>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
          [<0000000047493d02>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
          [<00000000bdcf8286>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
          [<00000000bdcf8286>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328
          [<00000000fc5b92d9>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917
          [<00000000da84d076>] sock_sendmsg_nosec net/socket.c:639 [inline]
          [<00000000da84d076>] sock_sendmsg+0x54/0x70 net/socket.c:659
          [<0000000042fb2eee>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330
          [<000000008f23f67e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384
          [<00000000d838e4f6>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417
          [<00000000289a9cb1>] __do_sys_sendmsg net/socket.c:2426 [inline]
          [<00000000289a9cb1>] __se_sys_sendmsg net/socket.c:2424 [inline]
          [<00000000289a9cb1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424
      
      Fixes: 24ec483c ("net: sched: Introduce act_ctinfo action")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Toke Høiland-Jørgensen <toke@redhat.com>
      Acked-by: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      09d4f10a
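
The leak pattern described above (a parameter block allocated at init time with no cleanup hook to free it) can be modelled in plain userspace C. This is only a sketch under stated assumptions: `demo_action`, `demo_params` and the `live_allocs` counter are hypothetical names, and the actual kernel fix may differ in detail (for example in how it defers the free).

```c
#include <stdlib.h>

/* Hypothetical stand-ins for an action and its parameter block. */
struct demo_params {
    int mode;
};

struct demo_action {
    struct demo_params *params;
};

/* Crude allocation counter, used only to make a leak visible in the sketch. */
static int live_allocs;

/* Allocate the parameter block, as an init method would. */
int demo_action_init(struct demo_action *a, int mode)
{
    struct demo_params *p = calloc(1, sizeof(*p));

    if (!p)
        return -1;
    p->mode = mode;
    a->params = p;
    live_allocs++;
    return 0;
}

/* The cleanup method: without it, the block allocated in init is never freed. */
void demo_action_cleanup(struct demo_action *a)
{
    if (a->params) {
        free(a->params);
        a->params = NULL;
        live_allocs--;
    }
}

/* Number of parameter blocks currently allocated. */
int demo_live_allocs(void)
{
    return live_allocs;
}
```

Calling `demo_action_init()` and then dropping the action without `demo_action_cleanup()` leaves `demo_live_allocs()` at 1, which is the userspace analogue of the unreferenced object in the report.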
    • O
      riscv: Less inefficient gcc tishift helpers (and export their symbols) · fc585d4a
      Olof Johansson committed
      The existing __lshrti3 was really inefficient, and the other two helpers
      are also needed to compile some modules.
      
      Add the missing versions, and export all of the symbols like arm64
      already does.
      
      This code is based on the assembly generated by libgcc builds.
      
      This fixes a build break triggered by ubsan:
      
      riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2':
      ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3'
      riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3'
      Signed-off-by: Olof Johansson <olof@lixom.net>
      [paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of
       ENTRY/ENDPROC; note libgcc origin]
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      fc585d4a
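
For readers unfamiliar with the tishift helpers: they implement shifts of 128-bit integers on a 64-bit machine. The sketch below shows the general idea in portable C for the logical right shift and the left shift. It is not the kernel's implementation (the commit uses assembly derived from libgcc); `u128_parts`, `lshr128` and `shl128` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word view of a 128-bit value. */
struct u128_parts {
    uint64_t lo;
    uint64_t hi;
};

/* Logical shift right by `shift` bits, 0 <= shift < 128. */
struct u128_parts lshr128(struct u128_parts v, unsigned int shift)
{
    struct u128_parts r;

    assert(shift < 128);
    if (shift == 0) {
        /* Handled separately: shifting a 64-bit word by 64 is undefined. */
        r = v;
    } else if (shift < 64) {
        /* Bits leaving the high word enter the top of the low word. */
        r.lo = (v.lo >> shift) | (v.hi << (64 - shift));
        r.hi = v.hi >> shift;
    } else {
        /* The whole low word is shifted out. */
        r.lo = v.hi >> (shift - 64);
        r.hi = 0;
    }
    return r;
}

/* Shift left by `shift` bits, 0 <= shift < 128. */
struct u128_parts shl128(struct u128_parts v, unsigned int shift)
{
    struct u128_parts r;

    assert(shift < 128);
    if (shift == 0) {
        r = v;
    } else if (shift < 64) {
        /* Bits leaving the low word enter the bottom of the high word. */
        r.hi = (v.hi << shift) | (v.lo >> (64 - shift));
        r.lo = v.lo << shift;
    } else {
        /* The whole high word is shifted out. */
        r.hi = v.lo << (shift - 64);
        r.lo = 0;
    }
    return r;
}
```

The arithmetic right shift (`__ashrti3` in the build log above) follows the same structure, except that vacated high bits are filled with the sign bit rather than zero.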
    • L
      Merge tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 8f8972a3
      Linus Torvalds committed
      Pull MTD fixes from Miquel Raynal:
       "Raw NAND:
         - GPMI: Fix the suspend/resume
      
        SPI-NOR:
         - Fix quad enable on Spansion like flashes
         - Fix selection of 4-byte addressing opcodes on Spansion"
      
      * tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
        mtd: rawnand: gpmi: Restore nfc timing setup after suspend/resume
        mtd: rawnand: gpmi: Fix suspend/resume problem
        mtd: spi-nor: Fix quad enable for Spansion like flashes
        mtd: spi-nor: Fix selection of 4-byte addressing opcodes on Spansion
      8f8972a3
    • L
      Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm · 244dc268
      Linus Torvalds committed
      Pull drm fixes from Dave Airlie:
       "Back from LCA2020. Fixes weren't too busy last week and seem to have
        quietened down appropriately: some amdgpu, i915, then a core mst fix,
        one fix for virtio-gpu and one for rockchip:
      
        core mst:
         - serialize down messages and clear timeslots on unplug
      
        amdgpu:
         - Update golden settings for renoir
         - eDP fix
      
        i915:
         - uAPI fix: Remove dash and colon from PMU names to comply with
           tools/perf
         - Fix for include file that was indirectly included
         - Two fixes to make sure VMA are marked active for error capture
      
        virtio:
         - maintain obj reservation lock when submitting cmds
      
        rockchip:
         - increase link rate var size to accommodate rates"
      
      * tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm:
        drm/amd/display: Reorder detect_edp_sink_caps before link settings read.
        drm/amdgpu: update goldensetting for renoir
        drm/dp_mst: Have DP_Tx send one msg at a time
        drm/dp_mst: clear time slots for ports invalid
        drm/i915/pmu: Do not use colons or dashes in PMU names
        drm/rockchip: fix integer type used for storing dp data rate
        drm/i915/gt: Mark ring->vma as active while pinned
        drm/i915/gt: Mark context->state vma as active while pinned
        drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings
        drm/i915: Add missing include file <linux/math64.h>
        drm/virtio: add missing virtio_gpu_array_lock_resv call
      244dc268
    • I
      riscv: delete temporary files · 95f4d9cc
      Ilie Halip committed
      Temporary files used in the VDSO build process linger on even after make
      mrproper: vdso-dummy.o.tmp, vdso.so.dbg.tmp.
      
      Delete them once they're no longer needed.
      Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
      Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
      95f4d9cc
    • L
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0cc2682d
      Linus Torvalds committed
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes:
      
         - a resctrl fix for uninitialized objects found by debugobjects
      
         - a resctrl memory leak fix
      
         - fix the unintended re-enabling of the SME and SEV CPU flags if
           memory encryption was disabled at bootup via the MSR space"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
        x86/resctrl: Fix potential memory leak
        x86/resctrl: Fix an imbalance in domain_remove_cpu()
      0cc2682d
    • L
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7ff15cd0
      Linus Torvalds committed
      Pull timer fixes from Ingo Molnar:
       "Three fixes: fix link failure on Alpha, fix a Sparse warning and
        annotate/robustify a lockless access in the NOHZ code"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick/sched: Annotate lockless access to last_jiffies_update
        lib/vdso: Make __cvdso_clock_getres() static
        time/posix-stubs: Provide compat itimer supoprt for alpha
      7ff15cd0
    • L
      Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9e79c523
      Linus Torvalds committed
      Pull cpu/SMT fix from Ingo Molnar:
       "Fix a build bug on CONFIG_HOTPLUG_SMT=y && !CONFIG_SYSFS kernels"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        cpu/SMT: Fix x86 link error without CONFIG_SYSFS
      9e79c523
    • L
      Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a186c112
      Linus Torvalds committed
      Pull x86 RAS fix from Ingo Molnar:
       "Fix a thermal throttling race that can result in easy to trigger boot
        crashes on certain Ice Lake platforms"
      
      * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce/therm_throt: Do not access uninitialized therm_work
      a186c112
    • L
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b07b9e8d
      Linus Torvalds committed
      Pull perf fixes from Ingo Molnar:
       "Tooling fixes, three Intel uncore driver fixes, plus an AUX events fix
        uncovered by the perf fuzzer"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Remove PCIe3 unit for SNR
        perf/x86/intel/uncore: Fix missing marker for snr_uncore_imc_freerunning_events
        perf/x86/intel/uncore: Add PCI ID of IMC for Xeon E3 V5 Family
        perf: Correctly handle failed perf_get_aux_event()
        perf hists: Fix variable name's inconsistency in hists__for_each() macro
        perf map: Set kmap->kmaps backpointer for main kernel map chunks
        perf report: Fix incorrectly added dimensions as switch perf data file
        tools lib traceevent: Fix memory leakage in filter_event
      b07b9e8d
    • L
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 124b5547
      Linus Torvalds committed
      Pull locking fixes from Ingo Molnar:
       "Three fixes:
      
          - Fix an rwsem spin-on-owner crash, introduced in v5.4
      
          - Fix a lockdep bug when running out of stack_trace entries,
            introduced in v5.4
      
          - Docbook fix"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem: Fix kernel crash when spinning on RWSEM_OWNER_UNKNOWN
        futex: Fix kernel-doc notation warning
        locking/lockdep: Fix buffer overrun problem in stack_trace[]
      124b5547
    • L
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1c6f87e
      Linus Torvalds committed
      Pull irq fix from Ingo Molnar:
       "Fix a recent regression in the Ingenic SoCs irqchip driver that floods
        the syslog"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/ingenic: Get rid of the legacy IRQ domain
      a1c6f87e
    • L
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e2f73d1e
      Linus Torvalds committed
      Pull EFI fixes from Ingo Molnar:
       "Three EFI fixes:
      
         - Fix a slow-boot-scrolling regression by making sure we use WC for
           EFI earlycon framebuffer mappings on x86
      
         - Fix a mixed EFI mode boot crash
      
         - Disable paging explicitly before entering startup_32() in mixed
           mode bootup"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/efistub: Disable paging at mixed mode entry
        efi/libstub/random: Initialize pointer variables to zero for mixed mode
        efi/earlycon: Fix write-combine mapping on x86
      e2f73d1e
    • L
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ba0f4722
      Linus Torvalds committed
      Pull rseq fixes from Ingo Molnar:
       "Two rseq bugfixes:
      
         - CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end
           up corrupting the TLS of the parent. Technically a change in the
           ABI but the previous behavior couldn't reasonably have been relied
           on by applications so this looks like a valid exception to the ABI
           rule.
      
         - Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the
           handling of other flags. This is not thought to impact any
           applications either"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        rseq: Unregister rseq for clone CLONE_VM
        rseq: Reject unknown flags on rseq unregister
      ba0f4722
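
The second rseq item above is an instance of a common ABI-hardening pattern: reject any flag bits the call does not understand instead of silently ignoring them. A minimal sketch follows; `DEMO_FLAG_UNREGISTER` and `demo_check_unregister_flags` are hypothetical names, and the real rseq syscall performs further validation that is not shown here.

```c
#include <errno.h>

/* Hypothetical flag bit standing in for an "unregister" request. */
#define DEMO_FLAG_UNREGISTER (1U << 0)

/*
 * Validate the flags word of an unregister-style call.
 * Any bit outside the known set is an error, so that new flags
 * can be introduced later without changing the meaning of old calls.
 */
int demo_check_unregister_flags(unsigned int flags)
{
    if (flags & ~DEMO_FLAG_UNREGISTER)
        return -EINVAL;
    return 0;
}
```

With this check, a caller passing the unregister bit together with an unknown bit gets `-EINVAL` rather than a successful unregister.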
    • L
      Merge tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 8cac8990
      Linus Torvalds committed
      Pull thread fixes from Christian Brauner:
       "Here is an urgent fix for ptrace_may_access() permission checking.
      
        Commit 69f594a3 ("ptrace: do not audit capability check when
        outputing /proc/pid/stat") introduced the ability to opt out of audit
        messages for accesses to various proc files since they are not
        violations of policy.
      
        While doing so it switched the check from ns_capable() to
        has_ns_capability{_noaudit}(). That means it switched from checking
        the subjective credentials (ktask->cred) of the task to using the
        objective credentials (ktask->real_cred). This appears to be wrong.
        ptrace_has_cap() is currently only used in ptrace_may_access() and is
        used to check whether the calling task (subject) has the
        CAP_SYS_PTRACE capability in the provided user namespace to operate on
        the target task (object). According to the cred.h comments this means
        the subjective credentials of the calling task need to be used.
      
        With this fix we switch ptrace_has_cap() to use security_capable() and
        thus back to using the subjective credentials.
      
        As one example where this might be particularly problematic, Jann
        pointed out that in combination with the upcoming IORING_OP_OPENAT{2}
        feature, this bug might allow unprivileged users to bypass the
        capability checks while asynchronously opening files like /proc/*/mem,
        because the capability checks for this would be performed against
        kernel credentials.
      
        To illustrate the former point about this being exploitable: When
        io_uring creates a new context it records the subjective credentials
        of the caller. Later on, when it starts to do work it creates a kernel
        thread and registers a callback. The callback runs with kernel creds
        for ktask->real_cred and ktask->cred.
      
        To prevent this from becoming a full-blown 0-day, io_uring will call
        override_cred() and override ktask->cred with the subjective
        credentials of the creator of the io_uring instance. With
        ptrace_has_cap() currently looking at ktask->real_cred this override
        will be ineffective and the caller will be able to open arbitrary proc
        files as mentioned above.
      
        Luckily, this is currently not exploitable but would become so once
        IORING_OP_OPENAT{2} lands in v5.6. Let's fix it now.
      
        To minimize potential regressions I successfully ran the criu
        testsuite. criu makes heavy use of ptrace() and extensively hits
        ptrace_may_access() codepaths and has a good chance of detecting any
        regressions.
      
        Additionally, I successfully ran the ptrace and seccomp kernel tests"
      
      * tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()
      8cac8990
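
The difference between the two credential pointers discussed above can be modelled in a few lines of userspace C. This is a simplified sketch, not kernel code: `demo_cred`, `demo_task`, `demo_override_creds` and the two check functions are hypothetical, and a single boolean stands in for the CAP_SYS_PTRACE capability.

```c
#include <stdbool.h>

/* Hypothetical credential set: one capability bit is enough for the sketch. */
struct demo_cred {
    bool cap_sys_ptrace;
};

/* A task carries objective (real_cred) and subjective (cred) credentials. */
struct demo_task {
    const struct demo_cred *real_cred; /* objective: who the task is */
    const struct demo_cred *cred;      /* subjective: who it acts as */
};

/* Temporarily act on behalf of someone else; returns the previous creds. */
const struct demo_cred *demo_override_creds(struct demo_task *t,
                                            const struct demo_cred *new_cred)
{
    const struct demo_cred *old = t->cred;

    t->cred = new_cred;
    return old;
}

/* Check that ignores the override (the behavior described as the bug). */
bool demo_has_cap_objective(const struct demo_task *t)
{
    return t->real_cred->cap_sys_ptrace;
}

/* Check that honors the override (the behavior the fix restores). */
bool demo_has_cap_subjective(const struct demo_task *t)
{
    return t->cred->cap_sys_ptrace;
}
```

For a worker that starts with privileged credentials in both fields and then overrides its subjective credentials with those of an unprivileged creator, the objective check still reports the capability while the subjective check does not. That mismatch is what would let the override be bypassed.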
    • L
      Merge tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 2324de6f
      Linus Torvalds committed
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix printing of a misleading Secure-IPL enabled message when it is not.
      
       - Fix a race condition between host ap bus and guest ap bus doing
         device reset in crypto code.
      
       - Fix sanity check in CCA cipher key function (CCA AES cipher key
         support), which fails otherwise.
      
      * tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/setup: Fix secure ipl message
        s390/zcrypt: move ap device reset from bus to driver code
        s390/zcrypt: Fix CCA cipher key gen with clear key value function
      2324de6f