1. 23 Jan, 2019 36 commits
    • crypto: authenc - fix parsing key with misaligned rta_len · 44c67402
      Eric Biggers committed
      commit 8f9c469348487844328e162db57112f7d347c49f upstream.
      
      Keys for "authenc" AEADs are formatted as an rtattr containing a 4-byte
      'enckeylen', followed by an authentication key and an encryption key.
      crypto_authenc_extractkeys() parses the key to find the inner keys.
      
      However, it fails to consider the case where the rtattr's payload is
      longer than 4 bytes but not 4-byte aligned, and where the key ends
      before the next 4-byte aligned boundary.  In this case, 'keylen -=
      RTA_ALIGN(rta->rta_len);' underflows to a value near UINT_MAX.  This
      causes a buffer overread and crash during crypto_ahash_setkey().
      
      Fix it by restricting the rtattr payload to the expected size.
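The length arithmetic described above can be modelled in user space. This is a minimal sketch, not the kernel code: `ATTR_HDR_LEN`, `PARAM_LEN`, `ALIGN4`, `remaining_old` and `remaining_checked` are hypothetical names, and the header size of 4 bytes is an assumption about the attribute layout.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the attribute layout (assumed sizes). */
#define ATTR_HDR_LEN 4u               /* 2-byte length + 2-byte type */
#define PARAM_LEN    4u               /* the 4-byte 'enckeylen' field */
#define ALIGN4(n)    (((n) + 3u) & ~3u)

/* Old arithmetic: subtract the aligned attribute length from keylen.
 * Wraps around when ALIGN4(rta_len) is larger than keylen. */
unsigned int remaining_old(unsigned int keylen, unsigned int rta_len)
{
	return keylen - ALIGN4(rta_len);
}

/* Stricter check: accept only an attribute whose payload is exactly the
 * 4-byte parameter. Returns 0 and stores the remaining key length on
 * success, -1 on a malformed key. */
int remaining_checked(unsigned int keylen, unsigned int rta_len,
		      unsigned int *out)
{
	assert(out != NULL);
	if (rta_len < ATTR_HDR_LEN || rta_len > keylen)
		return -1;
	if (rta_len - ATTR_HDR_LEN != PARAM_LEN)
		return -1;
	*out = keylen - rta_len;      /* rta_len is 8 here, already aligned */
	return 0;
}
```

With a 9-byte key whose attribute length is also 9 (the shape of the reproducer's key), the old arithmetic wraps to a value near UINT_MAX, while the stricter check rejects the key.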
      
      Reproducer using AF_ALG:
      
      	#include <linux/if_alg.h>
      	#include <linux/rtnetlink.h>
      	#include <sys/socket.h>
      
      	int main()
      	{
      		int fd;
      		struct sockaddr_alg addr = {
      			.salg_type = "aead",
      			.salg_name = "authenc(hmac(sha256),cbc(aes))",
      		};
      		struct {
      			struct rtattr attr;
      			__be32 enckeylen;
      			char keys[1];
      		} __attribute__((packed)) key = {
      			.attr.rta_len = sizeof(key),
      			.attr.rta_type = 1 /* CRYPTO_AUTHENC_KEYA_PARAM */,
      		};
      
      		fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
      		bind(fd, (void *)&addr, sizeof(addr));
      		setsockopt(fd, SOL_ALG, ALG_SET_KEY, &key, sizeof(key));
      	}
      
      It caused:
      
      	BUG: unable to handle kernel paging request at ffff88007ffdc000
      	PGD 2e01067 P4D 2e01067 PUD 2e04067 PMD 2e05067 PTE 0
      	Oops: 0000 [#1] SMP
      	CPU: 0 PID: 883 Comm: authenc Not tainted 4.20.0-rc1-00108-g00c9fe37a7f27 #13
      	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
      	RIP: 0010:sha256_ni_transform+0xb3/0x330 arch/x86/crypto/sha256_ni_asm.S:155
      	[...]
      	Call Trace:
      	 sha256_ni_finup+0x10/0x20 arch/x86/crypto/sha256_ssse3_glue.c:321
      	 crypto_shash_finup+0x1a/0x30 crypto/shash.c:178
      	 shash_digest_unaligned+0x45/0x60 crypto/shash.c:186
      	 crypto_shash_digest+0x24/0x40 crypto/shash.c:202
      	 hmac_setkey+0x135/0x1e0 crypto/hmac.c:66
      	 crypto_shash_setkey+0x2b/0xb0 crypto/shash.c:66
      	 shash_async_setkey+0x10/0x20 crypto/shash.c:223
      	 crypto_ahash_setkey+0x2d/0xa0 crypto/ahash.c:202
      	 crypto_authenc_setkey+0x68/0x100 crypto/authenc.c:96
      	 crypto_aead_setkey+0x2a/0xc0 crypto/aead.c:62
      	 aead_setkey+0xc/0x10 crypto/algif_aead.c:526
      	 alg_setkey crypto/af_alg.c:223 [inline]
      	 alg_setsockopt+0xfe/0x130 crypto/af_alg.c:256
      	 __sys_setsockopt+0x6d/0xd0 net/socket.c:1902
      	 __do_sys_setsockopt net/socket.c:1913 [inline]
      	 __se_sys_setsockopt net/socket.c:1910 [inline]
      	 __x64_sys_setsockopt+0x1f/0x30 net/socket.c:1910
      	 do_syscall_64+0x4a/0x180 arch/x86/entry/common.c:290
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: e236d4a8 ("[CRYPTO] authenc: Move enckeylen into key itself")
      Cc: <stable@vger.kernel.org> # v2.6.25+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: bcm - convert to use crypto_authenc_extractkeys() · 97a6662b
      Eric Biggers committed
      commit ab57b33525c3221afaebd391458fa0cbcd56903d upstream.
      
      Convert the bcm crypto driver to use crypto_authenc_extractkeys() so
      that it picks up the fix for broken validation of rtattr::rta_len.
      
      This also fixes the DES weak key check to actually be done on the right
      key. (It was checking the authentication key, not the encryption key...)
      
      Fixes: 9d12ba86 ("crypto: brcm - Add Broadcom SPU driver")
      Cc: <stable@vger.kernel.org> # v4.11+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: ccree - convert to use crypto_authenc_extractkeys() · 93242fa0
      Eric Biggers committed
      commit dc95b5350a8f07d73d6bde3a79ef87289698451d upstream.
      
      Convert the ccree crypto driver to use crypto_authenc_extractkeys() so
      that it picks up the fix for broken validation of rtattr::rta_len.
      
      Fixes: ff27e85a ("crypto: ccree - add AEAD support")
      Cc: <stable@vger.kernel.org> # v4.17+
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: authencesn - Avoid twice completion call in decrypt path · 65908037
      Harsh Jain committed
      commit a7773363624b034ab198c738661253d20a8055c2 upstream.
      
      The authencesn template's decrypt path unconditionally calls
      aead_request_complete() after ahash_verify, which leads to the following
      kernel panic after decryption.
      
      [  338.539800] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  338.548372] PGD 0 P4D 0
      [  338.551157] Oops: 0000 [#1] SMP PTI
      [  338.554919] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G        W I       4.19.7+ #13
      [  338.564431] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
      [  338.572212] RIP: 0010:esp_input_done2+0x350/0x410 [esp4]
      [  338.578030] Code: ff 0f b6 68 10 48 8b 83 c8 00 00 00 e9 8e fe ff ff 8b 04 25 04 00 00 00 83 e8 01 48 98 48 8b 3c c5 10 00 00 00 e9 f7 fd ff ff <8b> 04 25 04 00 00 00 83 e8 01 48 98 4c 8b 24 c5 10 00 00 00 e9 3b
      [  338.598547] RSP: 0018:ffff911c97803c00 EFLAGS: 00010246
      [  338.604268] RAX: 0000000000000002 RBX: ffff911c4469ee00 RCX: 0000000000000000
      [  338.612090] RDX: 0000000000000000 RSI: 0000000000000130 RDI: ffff911b87c20400
      [  338.619874] RBP: 0000000000000000 R08: ffff911b87c20498 R09: 000000000000000a
      [  338.627610] R10: 0000000000000001 R11: 0000000000000004 R12: 0000000000000000
      [  338.635402] R13: ffff911c89590000 R14: ffff911c91730000 R15: 0000000000000000
      [  338.643234] FS:  0000000000000000(0000) GS:ffff911c97800000(0000) knlGS:0000000000000000
      [  338.652047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  338.658299] CR2: 0000000000000004 CR3: 00000001ec20a000 CR4: 00000000000006f0
      [  338.666382] Call Trace:
      [  338.669051]  <IRQ>
      [  338.671254]  esp_input_done+0x12/0x20 [esp4]
      [  338.675922]  chcr_handle_resp+0x3b5/0x790 [chcr]
      [  338.680949]  cpl_fw6_pld_handler+0x37/0x60 [chcr]
      [  338.686080]  chcr_uld_rx_handler+0x22/0x50 [chcr]
      [  338.691233]  uldrx_handler+0x8c/0xc0 [cxgb4]
      [  338.695923]  process_responses+0x2f0/0x5d0 [cxgb4]
      [  338.701177]  ? bitmap_find_next_zero_area_off+0x3a/0x90
      [  338.706882]  ? matrix_alloc_area.constprop.7+0x60/0x90
      [  338.712517]  ? apic_update_irq_cfg+0x82/0xf0
      [  338.717177]  napi_rx_handler+0x14/0xe0 [cxgb4]
      [  338.722015]  net_rx_action+0x2aa/0x3e0
      [  338.726136]  __do_softirq+0xcb/0x280
      [  338.730054]  irq_exit+0xde/0xf0
      [  338.733504]  do_IRQ+0x54/0xd0
      [  338.736745]  common_interrupt+0xf/0xf
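The double completion can be illustrated with a toy model. This sketch is not the upstream patch: `toy_request`, `toy_complete` and both `verify_done_*` functions are hypothetical, and it assumes the usual crypto API convention that a return of -EINPROGRESS means the request will be completed later by its own callback.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical toy request: counts how often its completion ran. */
struct toy_request {
	int completions;
	int last_err;
};

void toy_complete(struct toy_request *req, int err)
{
	req->completions++;
	req->last_err = err;
}

/* Unconditional variant: completes even when the tail step went async,
 * so the later asynchronous completion becomes a second call. */
void verify_done_unconditional(struct toy_request *req, int tail_err)
{
	toy_complete(req, tail_err);
}

/* Guarded variant: skip completion while the request is still in flight. */
void verify_done_guarded(struct toy_request *req, int tail_err)
{
	if (tail_err != -EINPROGRESS)
		toy_complete(req, tail_err);
}
```

When the tail step reports -EINPROGRESS and the asynchronous callback fires afterwards, the unconditional variant ends up with two completions on one request and the guarded variant with one.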
      
      Fixes: 104880a6 ("crypto: authencesn - Convert to new AEAD...")
      Signed-off-by: Harsh Jain <harsh@chelsio.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: caam - fix zero-length buffer DMA mapping · 9107b2f4
      Aymen Sghaier committed
      commit 04e6d25c5bb244c1a37eb9fe0b604cc11a04e8c5 upstream.
      
      Recent changes - probably DMA API related (generic and/or arm64-specific) -
      exposed a case where the driver maps a zero-length buffer:
      ahash_init()->ahash_update()->ahash_final() with a zero-length string to
      hash.
      
      kernel BUG at kernel/dma/swiotlb.c:475!
      Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 2 PID: 1823 Comm: cryptomgr_test Not tainted 4.20.0-rc1-00108-g00c9fe37a7f2 #1
      Hardware name: LS1046A RDB Board (DT)
      pstate: 80000005 (Nzcv daif -PAN -UAO)
      pc : swiotlb_tbl_map_single+0x170/0x2b8
      lr : swiotlb_map_page+0x134/0x1f8
      sp : ffff00000f79b8f0
      x29: ffff00000f79b8f0 x28: 0000000000000000
      x27: ffff0000093d0000 x26: 0000000000000000
      x25: 00000000001f3ffe x24: 0000000000200000
      x23: 0000000000000000 x22: 00000009f2c538c0
      x21: ffff800970aeb410 x20: 0000000000000001
      x19: ffff800970aeb410 x18: 0000000000000007
      x17: 000000000000000e x16: 0000000000000001
      x15: 0000000000000019 x14: c32cb8218a167fe8
      x13: ffffffff00000000 x12: ffff80097fdae348
      x11: 0000800976bca000 x10: 0000000000000010
      x9 : 0000000000000000 x8 : ffff0000091fd6c8
      x7 : 0000000000000000 x6 : 00000009f2c538bf
      x5 : 0000000000000000 x4 : 0000000000000001
      x3 : 0000000000000000 x2 : 00000009f2c538c0
      x1 : 00000000f9fff000 x0 : 0000000000000000
      Process cryptomgr_test (pid: 1823, stack limit = 0x(____ptrval____))
      Call trace:
       swiotlb_tbl_map_single+0x170/0x2b8
       swiotlb_map_page+0x134/0x1f8
       ahash_final_no_ctx+0xc4/0x6cc
       ahash_final+0x10/0x18
       crypto_ahash_op+0x30/0x84
       crypto_ahash_final+0x14/0x1c
       __test_hash+0x574/0xe0c
       test_hash+0x28/0x80
       __alg_test_hash+0x84/0xd0
       alg_test_hash+0x78/0x144
       alg_test.part.30+0x12c/0x2b4
       alg_test+0x3c/0x68
       cryptomgr_test+0x44/0x4c
       kthread+0xfc/0x128
       ret_from_fork+0x10/0x18
      Code: d34bfc18 2a1a03f7 1a9f8694 35fff89a (d4210000)
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Aymen Sghaier <aymen.sghaier@nxp.com>
      Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: sm3 - fix undefined shift by >= width of value · 68afc7c3
      Eric Biggers committed
      commit d45a90cb5d061fa7d411b974b950fe0b8bc5f265 upstream.
      
      sm3_compress() calls rol32() with shift >= 32, which causes undefined
      behavior.  This is easily detected by enabling CONFIG_UBSAN.
      
      Explicitly AND with 31 to make the behavior well defined.
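Why the mask matters can be shown with a small user-space rotate helper. This is a sketch under assumptions: `rol32_masked` is a hypothetical name, not the kernel's rol32(), and it applies the mask inside the helper, whereas the commit text only says the shift is ANDed with 31.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by 'shift' bits. Masking keeps both shift counts in 0..31,
 * so neither shift can reach the 32-bit width, which would be undefined
 * behaviour in C. */
uint32_t rol32_masked(uint32_t word, unsigned int shift)
{
	return (word << (shift & 31)) | (word >> ((-shift) & 31));
}
```

With the mask, a rotation by 32 is the identity and a rotation by 33 behaves like a rotation by 1.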
      
      Fixes: 4f0fc160 ("crypto: sm3 - add OSCCA SM3 secure hash")
      Cc: <stable@vger.kernel.org> # v4.15+
      Cc: Gilad Ben-Yossef <gilad@benyossef.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • r8169: load Realtek PHY driver module before r8169 · 6e09bef3
      Heiner Kallweit committed
      [ Upstream commit 11287b693d03830010356339e4ceddf47dee34fa ]
      
      This soft dependency works around an issue where the genphy driver is
      sometimes used instead of the dedicated PHY driver. The root cause of
      the issue isn't clear yet. People reported that unloading and reloading
      the r8169 module helps, as does configuring this soft dependency in
      the modprobe config files. What seems to matter is that the realtek
      module is loaded before r8169.
      
      Once this has been applied, the preliminary fix 38af4b90 ("net: phy:
      add workaround for issue where PHY driver doesn't bind to the device")
      will be removed.
      
      Fixes: f1e911d5 ("r8169: add basic phylib support")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip: on queued skb use skb_header_pointer instead of pskb_may_pull · eb02c17f
      Willem de Bruijn committed
      [ Upstream commit 4a06fa67c4da20148803525151845276cdb995c1 ]
      
      Commit 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call
      pskb_may_pull") avoided a read beyond the end of the skb linear
      segment by calling pskb_may_pull.
      
      That function can trigger a BUG_ON in pskb_expand_head if the skb is
      shared, which it is when peeking. It can also return ENOMEM.
      
      Avoid both by switching to safer skb_header_pointer.
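The difference between pulling data into the linear area and reading through a header pointer can be sketched in user space. Everything here is hypothetical (`toy_pkt`, `toy_header_pointer`); it only mirrors the idea behind skb_header_pointer(): return a direct pointer when the bytes are already linear, otherwise copy into a caller-supplied buffer, and never modify the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical two-part packet: a linear head plus one trailing fragment. */
struct toy_pkt {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/* Return a pointer to 'len' bytes at offset 'off'. If the range lies in
 * the linear head, point straight into it; otherwise copy the bytes into
 * 'scratch'. The packet is read-only here, so a shared packet is safe and
 * no allocation can fail. */
const void *toy_header_pointer(const struct toy_pkt *p, size_t off,
			       size_t len, void *scratch)
{
	size_t total = p->head_len + p->frag_len;
	unsigned char *dst = scratch;
	size_t i;

	if (off > total || len > total - off)
		return NULL;                  /* range runs past the packet */
	if (off + len <= p->head_len)
		return p->head + off;         /* fully linear: no copy */
	for (i = 0; i < len; i++) {
		size_t pos = off + i;

		dst[i] = pos < p->head_len ? p->head[pos]
					   : p->frag[pos - p->head_len];
	}
	return scratch;
}
```

The caller supplies the scratch buffer (typically a small struct on the stack), which is why this style of access needs neither reallocation nor exclusive ownership of the packet.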
      
      Fixes: 2efd4fca ("ip: in cmsg IP(V6)_ORIGDSTADDR call pskb_may_pull")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bonding: update nest level on unlink · d2898aae
      Willem de Bruijn committed
      [ Upstream commit 001e465f09a18857443489a57e74314a3368c805 ]
      
      A network device stack with multiple layers of bonding devices can
      trigger a false positive lockdep warning. Adding lockdep nest levels
      fixes this. Update the level on both enslave and unlink, to avoid the
      following series of events ..
      
          ip netns add test
          ip netns exec test bash
          ip link set dev lo addr 00:11:22:33:44:55
          ip link set dev lo down
      
          ip link add dev bond1 type bond
          ip link add dev bond2 type bond
      
          ip link set dev lo master bond1
          ip link set dev bond1 master bond2
      
          ip link set dev bond1 nomaster
          ip link set dev bond2 master bond1
      
      .. from still generating a splat:
      
          [  193.652127] ======================================================
          [  193.658231] WARNING: possible circular locking dependency detected
          [  193.664350] 4.20.0 #8 Not tainted
          [  193.668310] ------------------------------------------------------
          [  193.674417] ip/15577 is trying to acquire lock:
          [  193.678897] 00000000a40e3b69 (&(&bond->stats_lock)->rlock#3/3){+.+.}, at: bond_get_stats+0x58/0x290
          [  193.687851]
          	       but task is already holding lock:
          [  193.693625] 00000000807b9d9f (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0x58/0x290
      
          [..]
      
          [  193.851092]        lock_acquire+0xa7/0x190
          [  193.855138]        _raw_spin_lock_nested+0x2d/0x40
          [  193.859878]        bond_get_stats+0x58/0x290
          [  193.864093]        dev_get_stats+0x5a/0xc0
          [  193.868140]        bond_get_stats+0x105/0x290
          [  193.872444]        dev_get_stats+0x5a/0xc0
          [  193.876493]        rtnl_fill_stats+0x40/0x130
          [  193.880797]        rtnl_fill_ifinfo+0x6c5/0xdc0
          [  193.885271]        rtmsg_ifinfo_build_skb+0x86/0xe0
          [  193.890091]        rtnetlink_event+0x5b/0xa0
          [  193.894320]        raw_notifier_call_chain+0x43/0x60
          [  193.899225]        netdev_change_features+0x50/0xa0
          [  193.904044]        bond_compute_features.isra.46+0x1ab/0x270
          [  193.909640]        bond_enslave+0x141d/0x15b0
          [  193.913946]        do_set_master+0x89/0xa0
          [  193.918016]        do_setlink+0x37c/0xda0
          [  193.921980]        __rtnl_newlink+0x499/0x890
          [  193.926281]        rtnl_newlink+0x48/0x70
          [  193.930238]        rtnetlink_rcv_msg+0x171/0x4b0
          [  193.934801]        netlink_rcv_skb+0xd1/0x110
          [  193.939103]        rtnetlink_rcv+0x15/0x20
          [  193.943151]        netlink_unicast+0x3b5/0x520
          [  193.947544]        netlink_sendmsg+0x2fd/0x3f0
          [  193.951942]        sock_sendmsg+0x38/0x50
          [  193.955899]        ___sys_sendmsg+0x2ba/0x2d0
          [  193.960205]        __x64_sys_sendmsg+0xad/0x100
          [  193.964687]        do_syscall_64+0x5a/0x460
          [  193.968823]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 7e2556e4 ("bonding: avoid lockdep confusion in bond_get_stats()")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • r8169: don't try to read counters if chip is in a PCI power-save state · d976151a
      Heiner Kallweit committed
      [ Upstream commit 10262b0b53666cbc506989b17a3ead1e9c3b43b4 ]
      
      Avoid log spam caused by trying to read counters from the chip whilst
      it is in a PCI power-save state.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=107421
      
      Fixes: 1ef7286e ("r8169: Dereference MMIO address immediately before use")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • smc: move unhash as early as possible in smc_release() · 8dc262df
      Cong Wang committed
      [ Upstream commit 26d92e951fe0a44ee4aec157cabb65a818cc8151 ]
      
      In smc_release() we release smc->clcsock before unhashing the smc
      sock, but a parallel smc_diag_dump() may still be reading
      smc->clcsock, so this could cause a use-after-free, as
      reported by syzbot.
      
      Reported-and-tested-by: syzbot+fbd1e5476e4c94c7b34e@syzkaller.appspotmail.com
      Fixes: 51f1de79 ("net/smc: replace sock_put worker by socket refcounting")
      Cc: Ursula Braun <ubraun@linux.ibm.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reported-by: syzbot+0bf2e01269f1274b4b03@syzkaller.appspotmail.com
      Reported-by: syzbot+e3132895630f957306bc@syzkaller.appspotmail.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • lan743x: Remove phy_read from link status change function · f352903d
      Bryan Whitehead committed
      [ Upstream commit a0071840d2040ea1b27e5a008182b09b88defc15 ]
      
      It has been noticed that some PHYs do not have the registers
      required by the previous implementation.
      
      To fix this, instead of using phy_read, the required information
      is extracted from the phy_device structure.
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tun: publish tfile after it's fully initialized · 08be4b72
      Stanislav Fomichev committed
      [ Upstream commit 0b7959b6257322f7693b08a459c505d4938646f2 ]
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000d1
      Call Trace:
       ? napi_gro_frags+0xa7/0x2c0
       tun_get_user+0xb50/0xf20
       tun_chr_write_iter+0x53/0x70
       new_sync_write+0xff/0x160
       vfs_write+0x191/0x1e0
       __x64_sys_write+0x5e/0xd0
       do_syscall_64+0x47/0xf0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      I think there is a subtle race between sending a packet via tap and
      attaching it:
      
      CPU0:                    CPU1:
      tun_chr_ioctl(TUNSETIFF)
        tun_set_iff
          tun_attach
            rcu_assign_pointer(tfile->tun, tun);
                               tun_fops->write_iter()
                                 tun_chr_write_iter
                                   tun_napi_alloc_frags
                                     napi_get_frags
                                       napi->skb = napi_alloc_skb
            tun_napi_init
              netif_napi_add
                napi->skb = NULL
                                    napi->skb is NULL here
                                    napi_gro_frags
                                      napi_frags_skb
      				  skb = napi->skb
      				  skb_reset_mac_header(skb)
      				  panic()
      
      Move rcu_assign_pointer(tfile->tun) and rcu_assign_pointer(tun->tfiles) to
      be the last thing we do in tun_attach(); this should guarantee that when we
      call tun_get() we always get an initialized object.
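The "initialize first, publish last" ordering can be sketched in user space with C11 atomics. This is an analogy only: `toy_queue`, `toy_attach` and `toy_get` are hypothetical, and a release store paired with an acquire load stands in for the kernel's rcu_assign_pointer() and its RCU-side readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue object that readers look up. */
struct toy_queue {
	int napi_ready;               /* set once per-queue setup is done */
	int index;
};

/* Shared slot that readers check; NULL means "not attached yet". */
static _Atomic(struct toy_queue *) published;

/* Writer: finish every initialisation step, then publish the pointer as
 * the very last action, with release ordering. */
void toy_attach(struct toy_queue *q, int index)
{
	q->index = index;
	q->napi_ready = 1;
	atomic_store_explicit(&published, q, memory_order_release);
}

/* Reader: acquire-load the pointer. A non-NULL result was published after
 * initialisation finished, so its fields are safe to use. */
struct toy_queue *toy_get(void)
{
	return atomic_load_explicit(&published, memory_order_acquire);
}
```

If the store to `published` came before the field writes, a concurrent reader could observe a half-initialised object, which is the shape of the race in the diagram above.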
      
      v2 changes:
      * remove extra napi_mutex locks/unlocks for napi operations

      Reported-by: syzbot <syzkaller@googlegroups.com>
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: change txhash on SYN-data timeout · d7fe54c1
      Yuchung Cheng committed
      [ Upstream commit c5715b8fabfca0ef85903f8bad2189940ed41cc8 ]
      
      Previously upon SYN timeouts the sender recomputes the txhash to
      try a different path. However this does not apply on the initial
      timeout of SYN-data (active Fast Open). Therefore an active IPv6
      Fast Open connection may incur a one-second RTO penalty to take on
      a new path after the second SYN retransmission uses a new flow label.
      
      This patch removes this undesirable behavior so Fast Open changes
      the flow label just like the regular connections. This also helps
      avoid falsely disabling Fast Open on the sender which triggers
      after two consecutive SYN timeouts on Fast Open.

      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packet: Do not leak dev refcounts on error exit · 3dc241b8
      Jason Gunthorpe committed
      [ Upstream commit d972f3dce8d161e2142da0ab1ef25df00e2f21a9 ]
      
      'dev' is non-NULL when the addr_len check triggers, so the code must go to a
      label that does the dev_put; otherwise dev will have a leaked refcount.
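The error-exit pattern can be sketched with a toy reference count. The names (`toy_dev`, `toy_dev_get`, `toy_dev_put`, `toy_send`) are hypothetical and the validation is simplified; the point is only that every exit taken after the reference is acquired passes through the label that drops it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device. */
struct toy_dev {
	int refcnt;
	unsigned int addr_len;
};

static struct toy_dev *toy_dev_get(struct toy_dev *d)
{
	d->refcnt++;
	return d;
}

static void toy_dev_put(struct toy_dev *d)
{
	d->refcnt--;
}

/* Send path: once the reference is held, the address-length failure jumps
 * to out_put instead of returning directly, so the count stays balanced. */
int toy_send(struct toy_dev *candidate, unsigned int user_addr_len)
{
	struct toy_dev *dev = toy_dev_get(candidate);
	int err = -1;

	if (user_addr_len < dev->addr_len)
		goto out_put;             /* a bare return would leak dev */
	err = 0;                          /* transmit would happen here */
out_put:
	toy_dev_put(dev);
	return err;
}
```

A leaked count of this kind is what keeps a module pinned: the device never reaches zero references, so it can never be released.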
      
      This bug makes the ib_ipoib module impossible to unload when using
      systemd-network, as it triggers this check on InfiniBand links.
      
      Fixes: 99137b7888f4 ("packet: validate address length")
      Reported-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: bridge: fix a bug on using a neighbour cache entry without checking its state · 54cbcff8
      JianJhen Chen committed
      [ Upstream commit 4c84edc11b76590859b1e45dd676074c59602dc4 ]
      
      When handling DNAT'ed packets on a bridge device, the neighbour cache entry
      from lookup was used without checking its state. It means that a cache entry
      in the NUD_STALE state will be used directly instead of entering the NUD_DELAY
      state to confirm the reachability of the neighbor.
      
      This problem becomes worse after commit 2724680b ("neigh: Keep neighbour
      cache entries if number of them is small enough."), since all neighbour cache
      entries in the NUD_STALE state will be kept in the neighbour table as long as
      the number of cache entries does not exceed the value specified in gc_thresh1.
      
      This commit validates the state of a neighbour cache entry before using
      the entry.
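The state check can be sketched as a small decision function. This is an illustration under assumptions: the `TOY_NUD_*` values are assumed to match the NUD_* flags in linux/neighbour.h, the "connected" mask is assumed to be PERMANENT, NOARP and REACHABLE, and `toy_choose_path` is a hypothetical name, not the bridge code.

```c
#include <assert.h>

/* Assumed to mirror the kernel's neighbour state flags. */
#define TOY_NUD_REACHABLE 0x02
#define TOY_NUD_STALE     0x04
#define TOY_NUD_DELAY     0x08
#define TOY_NUD_NOARP     0x40
#define TOY_NUD_PERMANENT 0x80
#define TOY_NUD_CONNECTED \
	(TOY_NUD_PERMANENT | TOY_NUD_NOARP | TOY_NUD_REACHABLE)

enum toy_path {
	TOY_FAST_CACHED_HEADER,   /* reuse the cached link-layer header */
	TOY_SLOW_REVALIDATE       /* go through the normal output path */
};

/* Use the cached header only when the entry is in a connected state;
 * a stale entry takes the slow path so reachability gets confirmed. */
enum toy_path toy_choose_path(unsigned char nud_state, int has_cached_header)
{
	if ((nud_state & TOY_NUD_CONNECTED) && has_cached_header)
		return TOY_FAST_CACHED_HEADER;
	return TOY_SLOW_REVALIDATE;
}
```

Without the state test, a stale entry with a cached header would be used directly and would never move on to the delay state.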

      Signed-off-by: JianJhen Chen <kchen@synology.com>
      Reviewed-by: JinLin Chen <jlchen@synology.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix kernel-infoleak in ipv6_local_error() · c0e1392e
      Eric Dumazet committed
      [ Upstream commit 7d033c9f6a7fd3821af75620a0257db87c2b552a ]
      
      This patch makes sure the flow label in the IPv6 header
      forged in ipv6_local_error() is initialized.
      
      BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
      CPU: 1 PID: 24675 Comm: syz-executor1 Not tainted 4.20.0-rc7+ #4
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x173/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613
       kmsan_internal_check_memory+0x455/0xb00 mm/kmsan/kmsan.c:675
       kmsan_copy_to_user+0xab/0xc0 mm/kmsan/kmsan_hooks.c:601
       _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
       copy_to_user include/linux/uaccess.h:177 [inline]
       move_addr_to_user+0x2e9/0x4f0 net/socket.c:227
       ___sys_recvmsg+0x5d7/0x1140 net/socket.c:2284
       __sys_recvmsg net/socket.c:2327 [inline]
       __do_sys_recvmsg net/socket.c:2337 [inline]
       __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
       __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      RIP: 0033:0x457ec9
      Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f8750c06c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457ec9
      RDX: 0000000000002000 RSI: 0000000020000400 RDI: 0000000000000005
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f8750c076d4
      R13: 00000000004c4a60 R14: 00000000004d8140 R15: 00000000ffffffff
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:219 [inline]
       kmsan_internal_chain_origin+0x134/0x230 mm/kmsan/kmsan.c:439
       __msan_chain_origin+0x70/0xe0 mm/kmsan/kmsan_instr.c:200
       ipv6_recv_error+0x1e3f/0x1eb0 net/ipv6/datagram.c:475
       udpv6_recvmsg+0x398/0x2ab0 net/ipv6/udp.c:335
       inet_recvmsg+0x4fb/0x600 net/ipv4/af_inet.c:830
       sock_recvmsg_nosec net/socket.c:794 [inline]
       sock_recvmsg+0x1d1/0x230 net/socket.c:801
       ___sys_recvmsg+0x4d5/0x1140 net/socket.c:2278
       __sys_recvmsg net/socket.c:2327 [inline]
       __do_sys_recvmsg net/socket.c:2337 [inline]
       __se_sys_recvmsg+0x2fa/0x450 net/socket.c:2334
       __x64_sys_recvmsg+0x4a/0x70 net/socket.c:2334
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:204 [inline]
       kmsan_internal_poison_shadow+0x92/0x150 mm/kmsan/kmsan.c:158
       kmsan_kmalloc+0xa6/0x130 mm/kmsan/kmsan_hooks.c:176
       kmsan_slab_alloc+0xe/0x10 mm/kmsan/kmsan_hooks.c:185
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2759 [inline]
       __kmalloc_node_track_caller+0xe18/0x1030 mm/slub.c:4383
       __kmalloc_reserve net/core/skbuff.c:137 [inline]
       __alloc_skb+0x309/0xa20 net/core/skbuff.c:205
       alloc_skb include/linux/skbuff.h:998 [inline]
       ipv6_local_error+0x1a7/0x9e0 net/ipv6/datagram.c:334
       __ip6_append_data+0x129f/0x4fd0 net/ipv6/ip6_output.c:1311
       ip6_make_skb+0x6cc/0xcf0 net/ipv6/ip6_output.c:1775
       udpv6_sendmsg+0x3f8e/0x45d0 net/ipv6/udp.c:1384
       inet_sendmsg+0x54a/0x720 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg net/socket.c:631 [inline]
       __sys_sendto+0x8c4/0xac0 net/socket.c:1788
       __do_sys_sendto net/socket.c:1800 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:1796
       __x64_sys_sendto+0x6e/0x90 net/socket.c:1796
       do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291
       entry_SYSCALL_64_after_hwframe+0x63/0xe7
      
      Bytes 4-7 of 28 are uninitialized
      Memory access of size 28 starts at ffff8881937bfce0
      Data copied to user address 0000000020000000

      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: Don't trap host pointer auth use to EL2 · d29d3891
      Mark Rutland committed
      [ Upstream commit b3669b1e1c09890d61109a1a8ece2c5b66804714 ]
      
      To allow EL0 (and/or EL1) to use pointer authentication functionality,
      we must ensure that pointer authentication instructions and accesses to
      pointer authentication keys are not trapped to EL2.
      
      This patch ensures that HCR_EL2 is configured appropriately when the
      kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK},
      ensuring that EL1 can access keys and permit EL0 use of instructions.
      For VHE kernels host EL0 (TGE && E2H) is unaffected by these settings,
      and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't
      bother setting them.
      
      This does not enable support for KVM guests, since KVM manages HCR_EL2
      itself when running VMs.
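The host flag selection described above can be sketched as plain bit arithmetic. This is a sketch under assumptions: the bit positions (TGE 27, RW 31, E2H 34, APK 40, API 41) are taken from the Arm architecture's HCR_EL2 layout as I understand it, the composition of the VHE value is assumed, and all `TOY_*` names and `toy_host_hcr` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed HCR_EL2 bit positions. */
#define TOY_HCR_TGE (UINT64_C(1) << 27)
#define TOY_HCR_RW  (UINT64_C(1) << 31)
#define TOY_HCR_E2H (UINT64_C(1) << 34)
#define TOY_HCR_APK (UINT64_C(1) << 40)   /* do not trap key accesses */
#define TOY_HCR_API (UINT64_C(1) << 41)   /* do not trap instructions */

/* Non-VHE host: set API and APK so EL1/EL0 pointer auth is not trapped. */
#define TOY_HOST_NVHE_FLAGS (TOY_HCR_RW | TOY_HCR_API | TOY_HCR_APK)
/* VHE host: with TGE and E2H set, API/APK do not affect the host. */
#define TOY_HOST_VHE_FLAGS  (TOY_HCR_RW | TOY_HCR_TGE | TOY_HCR_E2H)

uint64_t toy_host_hcr(int has_vhe)
{
	return has_vhe ? TOY_HOST_VHE_FLAGS : TOY_HOST_NVHE_FLAGS;
}
```

Keeping both values as named constants, rather than open-coding them at each use, makes it easy to see which configuration carries the pointer authentication bits.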

      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Acked-by: Christoffer Dall <christoffer.dall@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64/kvm: consistently handle host HCR_EL2 flags · a31edd1c
      Mark Rutland committed
      [ Upstream commit 4eaed6aa2c628101246bcabc91b203bfac1193f8 ]
      
      In KVM we define the configuration of HCR_EL2 for a VHE host in
      HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the
      non-VHE host flags, and open-code HCR_RW. Further, in head.S we
      open-code the flags for VHE and non-VHE configurations.
      
      In future, we're going to want to configure more flags for the host, so
      let's add a HCR_HOST_NVHE_FLAGS definition, and consistently use both
      HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S.
      
      We now use mov_q to generate the HCR_EL2 value, as we use when
      configuring other registers in head.S.

      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: kvmarm@lists.cs.columbia.edu
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • V
      scsi: target: iscsi: cxgbit: fix csk leak · a200574d
      Varun Prakash committed
      [ Upstream commit ed076c55b359cc9982ca8b065bcc01675f7365f6 ]
      
      In case of ARP failure, call cxgbit_put_csk() to free the csk.

      Signed-off-by: Varun Prakash <varun@chelsio.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a200574d
    • V
      scsi: target: iscsi: cxgbit: fix csk leak · 25c0f7a2
      Varun Prakash committed
      [ Upstream commit 801df68d617e3cb831f531c99fa6003620e6b343 ]
      
      A csk leak can happen if a new TCP connection gets established after
      cxgbit_accept_np() returns. To fix this leak, free the remaining csk in
      cxgbit_free_np().

      Signed-off-by: Varun Prakash <varun@chelsio.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      25c0f7a2
    • S
      Revert "scsi: target: iscsi: cxgbit: fix csk leak" · ec98b3f3
      Sasha Levin committed
      This reverts commit c9cef2c7.
      
      A wrong commit message was used for the stable commit because of a human
      error (and duplicate commit subject lines).
      
      This patch reverts this error, and the following patches add the two
      upstream commits.

      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ec98b3f3
    • L
      mmc: sdhci-msm: Disable CDR function on TX · e276420e
      Loic Poulain committed
      commit a89e7bcb18081c611eb6cf50edd440fa4983a71a upstream.
      
      The Clock Data Recovery (CDR) circuit automatically adjusts the RX
      sampling-point/phase for high frequency cards (SDR104, HS200...).
      CDR is automatically enabled during DLL configuration.
      However, according to the APQ8016 reference manual, this function
      must be disabled during TX and tuning phase in order to prevent any
      interference during tuning challenges and unexpected phase alteration
      during TX transfers.
      
      This patch enables/disables CDR according to the current transfer mode.
      
      This fixes sporadic write transfer issues observed with some SDR104 and
      HS200 cards.
      
      Inspired by sdhci-msm downstream patch:
      https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/432516/

      Reported-by: Leonid Segal <leonid.s@variscite.com>
      Reported-by: Manabu Igusa <migusa@arrowjapan.com>
      Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Acked-by: Georgi Djakov <georgi.djakov@linaro.org>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      [georgi: backport to v4.19+]
      Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e276420e
    • F
      netfilter: nf_conncount: fix argument order to find_next_bit · 6567515e
      Florian Westphal committed
      commit a007232066f6839d6f256bab21e825d968f1a163 upstream.
      
      Size and 'next bit' were swapped. This bug could cause the worker to
      reschedule itself even if the system was idle.
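
      The effect of swapping the two arguments can be sketched in userspace.
      The block below is only an illustration, not the kernel code: it assumes
      a simplified stand-in, sketch_find_next_bit(), that keeps the kernel's
      parameter order (bitmap, size in bits, starting offset) and returns the
      size when no set bit is found. SKETCH_SLOTS is a hypothetical stand-in
      for CONNCOUNT_SLOTS with an assumed value.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_SLOTS 256UL /* stand-in for CONNCOUNT_SLOTS; value assumed */

/*
 * Simplified userspace stand-in with the kernel's parameter order:
 * (bitmap, size in bits, first bit to examine). Returns size if no
 * set bit is found at or after offset.
 */
unsigned long sketch_find_next_bit(const unsigned long *addr,
                                   unsigned long size,
                                   unsigned long offset)
{
    for (unsigned long i = offset; i < size; i++) {
        if (addr[i / SKETCH_BITS_PER_LONG] &
            (1UL << (i % SKETCH_BITS_PER_LONG)))
            return i;
    }
    return size;
}

/* Usage: an idle system is modelled as an all-zero bitmap. */
void sketch_find_next_bit_demo(void)
{
    unsigned long pending[SKETCH_SLOTS / SKETCH_BITS_PER_LONG] = { 0 };
    unsigned long next = 7;

    /* Correct order: nothing pending, so the result equals the size. */
    assert(sketch_find_next_bit(pending, SKETCH_SLOTS, next) == SKETCH_SLOTS);

    /* Swapped order: the "size" is 7, so 7 comes back and looks like work. */
    assert(sketch_find_next_bit(pending, next, SKETCH_SLOTS) == next);
}
```

      With the swapped call, the result is below the slot count even though no
      bit is set, which is consistent with the worker rescheduling itself on
      an idle system.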
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6567515e
    • P
      netfilter: nf_conncount: speculative garbage collection on empty lists · b01b9241
      Pablo Neira Ayuso committed
      commit c80f10bc973af2ace6b1414724eeff61eaa71837 upstream.
      
      Instead of removing an empty list node that might be reintroduced soon
      thereafter, tentatively place the empty list node on the list passed to
      tree_nodes_free(), then re-check if the list is empty again before erasing
      it from the tree.
      
      [ Florian: rebase on top of pending nf_conncount fixes ]
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b01b9241
    • P
      netfilter: nf_conncount: move all list iterations under spinlock · aea1d195
      Pablo Neira Ayuso committed
      commit 2f971a8f425545da52ca0e6bee81f5b1ea0ccc5f upstream.
      
      Two CPUs may race to remove a connection from the list; the existing
      conn->dead will result in a use-after-free. Use the per-list spinlock to
      protect list iterations.
      
      As all accesses to the list now happen while holding the per-list lock,
      we no longer need to delay free operations with rcu.
      
      Joint work with Florian.
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aea1d195
    • F
      netfilter: nf_conncount: merge lookup and add functions · bdc6c893
      Florian Westphal committed
      commit df4a902509766897f7371fdfa4c3bf8bc321b55d upstream.
      
      'lookup' is always followed by 'add'.
      Merge both and make the list-walk part of nf_conncount_add().
      
      This also avoids one unneeded unlock/re-lock pair.
      
      Extra care needs to be taken in count_tree, as we only hold rcu
      read lock, i.e. we can only insert to an existing tree node after
      acquiring its lock and making sure it has a nonzero count.
      
      As a zero count should be rare, just fall back to insert_tree()
      (which acquires tree lock).
      
      This issue and its solution were pointed out by Shawn Bohrer
      during patch review.

      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bdc6c893
    • F
      netfilter: nf_conncount: restart search when nodes have been erased · 13c63942
      Florian Westphal committed
      commit e8cfb372b38a1b8979aa7f7631fb5e7b11c3793c upstream.
      
      Shawn Bohrer reported the following crash:
       |RIP: 0010:rb_erase+0xae/0x360
       [..]
       Call Trace:
        nf_conncount_destroy+0x59/0xc0 [nf_conncount]
        cleanup_match+0x45/0x70 [ip_tables]
        ...
      
      Shawn tracked this down to a bogus 'parent' pointer:
      The problem is that when we insert a new node, there is a chance that
      the 'parent' that we found was also passed to tree_nodes_free() (because
      that node was empty) for erase+free.
      
      Instead of trying to be clever and detect when this happens, restart
      the search if we have evicted one or more nodes.  To prevent frequent
      restarts, do not perform gc on the second round.
      
      Also, unconditionally schedule the gc worker.
      The condition

        gc_count > ARRAY_SIZE(gc_nodes)

      cannot be true unless the tree grows very large, as the height of the tree
      will be low even with hundreds of nodes present.
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Reported-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      13c63942
    • F
      netfilter: nf_conncount: split gc in two phases · d6b3ff02
      Florian Westphal committed
      commit f7fcc98dfc2d136722007fec0debbed761679b94 upstream.
      
      The lockless workqueue garbage collector can race with the packet path
      garbage collector to delete list nodes, as it calls tree_nodes_free()
      with the addresses of nodes that might have been free'd already from
      another CPU.
      
      To fix this, split gc into two phases.
      
      One phase to perform gc on the connections: From a locking perspective,
      this is the same as count_tree(): we hold rcu lock, but we do not
      change the tree, we only change the nodes' contents.
      
      The second phase acquires the tree lock and reaps empty nodes.
      This avoids a race condition of the garbage collection vs. packet path:
      If a node has been free'd already, the second phase won't find it anymore.

      This second phase is, from a locking perspective, the same as insert_tree().

      The former only modifies nodes (list content, count), the latter modifies
      the tree itself (rb_erase or rb_insert).
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d6b3ff02
    • F
      netfilter: nf_conncount: don't skip eviction when age is negative · ef68fdb5
      Florian Westphal committed
      commit 4cd273bb91b3001f623f516ec726c49754571b1a upstream.
      
      age is a signed integer, so the result can be negative when the timestamps
      have a large delta.  In this case we want to discard the entry.

      Instead of using age >= 2 || age < 0, just make it unsigned.
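
      A minimal userspace sketch of why the unsigned type is enough, assuming
      32-bit timestamps. The helper names are hypothetical and this is not the
      kernel's code; the signed variant also assumes the usual two's-complement
      conversion when the delta does not fit in int32_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed age: a large timestamp delta converts to a negative value
 * (implementation-defined conversion, two's complement assumed), so
 * the entry is never considered old enough to evict.
 */
bool sketch_expired_signed(uint32_t now, uint32_t stamp)
{
    int32_t age = (int32_t)(now - stamp);
    return age >= 2;
}

/* Unsigned age: the same delta stays a large positive value. */
bool sketch_expired_unsigned(uint32_t now, uint32_t stamp)
{
    uint32_t age = now - stamp; /* wraps modulo 2^32 */
    return age >= 2;
}

/* Usage: compare both variants on a small and a very large delta. */
void sketch_age_demo(void)
{
    /* Small delta: both variants agree. */
    assert(sketch_expired_signed(6, 4));
    assert(sketch_expired_unsigned(6, 4));

    /* Large delta: only the unsigned variant evicts the entry. */
    assert(!sketch_expired_signed(0x90000000u, 0));
    assert(sketch_expired_unsigned(0x90000000u, 0));
}
```

      The single unsigned comparison covers both halves of
      age >= 2 || age < 0, because every value that would be negative as a
      signed integer is at least 2 when read as unsigned.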
      
      Fixes: b36e4523 ("netfilter: nf_conncount: fix garbage collection confirm race")
      Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef68fdb5
    • S
      netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS · c5cbe95a
      Shawn Bohrer committed
      commit c78e7818f16f687389174c4569243abbec8dc68f upstream.
      
      Most of the time these were the same value anyway, but when
      CONFIG_LOCKDEP was enabled we would use a smaller number of locks to
      reduce overhead.  Unfortunately having two values is confusing and not
      worth the complexity.
      
      This fixes a bug where tree_gc_worker() would only GC up to
      CONNCOUNT_LOCK_SLOTS trees which meant when CONFIG_LOCKDEP was enabled
      not all trees would be GCed by tree_gc_worker().
      
      Fixes: 5c789e13 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Shawn Bohrer <sbohrer@cloudflare.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5cbe95a
    • O
      can: gw: ensure DLC boundaries after CAN frame modification · 8db82a6f
      Oliver Hartkopp committed
      commit 0aaa81377c5a01f686bcdb8c7a6929a7bf330c68 upstream.
      
      Muyu Yu provided a POC where user root with CAP_NET_ADMIN can create a CAN
      frame modification rule that makes the data length code a higher value than
      the available CAN frame data size. In combination with a configured checksum
      calculation where the result is stored relative to the end of the data
      (e.g. cgw_csum_xor_rel) the tail of the skb (e.g. frag_list pointer in
      skb_shared_info) can be rewritten which finally can cause a system crash.

      Michal Kubecek suggested dropping frames that have a DLC exceeding the
      available space after the modification process and provided a patch that can
      handle CAN FD frames too. Within this patch we also limit the length for the
      checksum calculations to the maximum of Classic CAN data length (8).
      
      CAN frames that are dropped by these additional checks are counted with the
      CGW_DELETED counter which indicates misconfigurations in can-gw rules.
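
      The problem and the added check can be sketched as follows. This is a
      simplified userspace illustration under stated assumptions, not the
      can-gw implementation: struct sketch_can_frame, sketch_rel_index() and
      sketch_must_drop() are hypothetical names, and only Classic CAN with an
      8-byte payload is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CAN_MAX_DLEN 8 /* Classic CAN payload size in bytes */

/* Hypothetical, reduced frame layout for illustration only. */
struct sketch_can_frame {
    uint8_t dlc;
    uint8_t data[SKETCH_CAN_MAX_DLEN];
};

/* Resolve a checksum index; negative values count from the end of the data. */
int sketch_rel_index(int idx, int dlc)
{
    return idx < 0 ? dlc + idx : idx;
}

/* Drop check applied after the modification rules have run. */
bool sketch_must_drop(const struct sketch_can_frame *cf)
{
    return cf->dlc > SKETCH_CAN_MAX_DLEN;
}

/* Usage: a rule that raised the DLC past the payload size. */
void sketch_can_gw_demo(void)
{
    struct sketch_can_frame cf = { .dlc = 15 };

    /* "Last byte" resolves to index 14, outside data[0..7]. */
    assert(sketch_rel_index(-1, cf.dlc) == 14);
    assert(sketch_must_drop(&cf));

    /* With a valid DLC the same relative index stays in bounds. */
    cf.dlc = 8;
    assert(sketch_rel_index(-1, cf.dlc) == 7);
    assert(!sketch_must_drop(&cf));
}
```

      Dropping the frame before any relative checksum index is resolved keeps
      the write inside the payload array.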
      
      This fixes CVE-2019-3701.

      Reported-by: Muyu Yu <ieatmuttonchuan@gmail.com>
      Reported-by: Marcus Meissner <meissner@suse.de>
      Suggested-by: Michal Kubecek <mkubecek@suse.cz>
      Tested-by: Muyu Yu <ieatmuttonchuan@gmail.com>
      Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org> # >= v3.2
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8db82a6f
    • D
      tty: Don't hold ldisc lock in tty_reopen() if ldisc present · a13520e0
      Dmitry Safonov committed
      commit d3736d82e8169768218ee0ef68718875918091a0 upstream.
      
      Try to get a reference to the ldisc during tty_reopen().
      If the ldisc is present, we don't need to do tty_ldisc_reinit() and lock
      the write side of the line discipline semaphore.
      Effectively, it optimizes the fast path for tty_reopen(), but more
      importantly it won't interrupt ongoing IO on the tty as no ldisc change
      is needed.
      Fixes a user-visible issue where tty_reopen() interrupted the login
      process for a user with a long password, observed and reported by Lukas.
      
      Fixes: c96cf923a98d ("tty: Don't block on IO when ldisc change is pending")
      Fixes: 83d817f41070 ("tty: Hold tty_ldisc_lock() during tty_reopen()")
      Cc: Jiri Slaby <jslaby@suse.com>
      Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
      Tested-by: Lukas F. Hartmann <lukas@mntmn.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a13520e0
    • D
      tty: Simplify tty->count math in tty_reopen() · a42c9786
      Dmitry Safonov committed
      commit cf62a1a13749db0d32b5cdd800ea91a4087319de upstream.
      
      As noted by Jiri, tty_ldisc_reinit() shouldn't rely on the tty counter.
      Simplify the math by increasing the counter after reinit succeeds.
      
      Cc: Jiri Slaby <jslaby@suse.com>
      Link: lkml.kernel.org/r/<20180829022353.23568-2-dima@arista.com>
      Suggested-by: Jiri Slaby <jslaby@suse.com>
      Reviewed-by: Jiri Slaby <jslaby@suse.cz>
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a42c9786
    • D
      tty: Hold tty_ldisc_lock() during tty_reopen() · e6a4caa0
      Dmitry Safonov committed
      commit 83d817f41070c48bc3eb7ec18e43000a548fca5c upstream.
      
      tty_ldisc_reinit() races with neither tty_ldisc_hangup()
      nor set_ldisc() nor tty_ldisc_release(), as they use the tty lock.
      But it races with anyone who expects the line discipline to be the same
      after holding the read semaphore in tty_ldisc_ref().
      
      We've seen the following crash on v4.9.108 stable:
      
      BUG: unable to handle kernel paging request at 0000000000002260
      IP: [..] n_tty_receive_buf_common+0x5f/0x86d
      Workqueue: events_unbound flush_to_ldisc
      Call Trace:
       [..] n_tty_receive_buf2
       [..] tty_ldisc_receive_buf
       [..] flush_to_ldisc
       [..] process_one_work
       [..] worker_thread
       [..] kthread
       [..] ret_from_fork
      
      tty_ldisc_reinit() should be called with ldisc_sem held for writing,
      which will protect any reader against line discipline changes.
      
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: stable@vger.kernel.org # b027e229 ("tty: fix data race between tty_init_dev and flush of buf")
      Reviewed-by: Jiri Slaby <jslaby@suse.cz>
      Reported-by: syzbot+3aa9784721dfb90e984d@syzkaller.appspotmail.com
      Tested-by: Mark Rutland <mark.rutland@arm.com>
      Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Tested-by: Tycho Andersen <tycho@tycho.ws>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6a4caa0
    • D
      tty/ldsem: Wake up readers after timed out down_write() · 028c13f7
      Dmitry Safonov committed
      commit 231f8fd0cca078bd4396dd7e380db813ac5736e2 upstream.
      
      ldsem_down_read() will sleep if there is a pending writer in the queue.
      If the writer times out, readers in the queue should be woken up,
      otherwise they may miss a chance to acquire the semaphore until the last
      active reader does ldsem_up_read().

      There were a couple of reports where there was one active reader and
      other readers soft locked up:
        Showing all locks held in the system:
        2 locks held by khungtaskd/17:
         #0:  (rcu_read_lock){......}, at: watchdog+0x124/0x6d1
         #1:  (tasklist_lock){.+.+..}, at: debug_show_all_locks+0x72/0x2d3
        2 locks held by askfirst/123:
         #0:  (&tty->ldisc_sem){.+.+.+}, at: ldsem_down_read+0x46/0x58
         #1:  (&ldata->atomic_read_lock){+.+...}, at: n_tty_read+0x115/0xbe4
      
      Prevent readers from waiting for active readers to release the ldisc
      semaphore.
      
      Link: lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com
      Link: lkml.kernel.org/r/20180907045041.GF1110@shao2-debian
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      028c13f7
  2. 17 Jan, 2019 4 commits
    • G
      Linux 4.19.16 · 9c5931b6
      Greg Kroah-Hartman committed
      9c5931b6
    • F
      Btrfs: use nofs context when initializing security xattrs to avoid deadlock · 7a1b9b76
      Filipe Manana committed
      commit 827aa18e7b903c5ff3b3cd8fec328a99b1dbd411 upstream.
      
      When initializing the security xattrs, we are holding a transaction handle,
      therefore we need to use a GFP_NOFS context in order to avoid a deadlock
      with reclaim in case it's triggered.
      
      Fixes: 39a27ec1 ("btrfs: use GFP_KERNEL for xattr and acl allocations")
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7a1b9b76
    • F
      Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation · 79aa5c0d
      Filipe Manana committed
      commit 9a6f209e36500efac51528132a3e3083586eda5f upstream.
      
      If the quota enable and snapshot creation ioctls are called concurrently
      we can get into a deadlock where the task enabling quotas will deadlock
      on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
      twice, or the task creating a snapshot tries to commit the transaction
      while the task enabling quota waits for the former task to commit the
      transaction while holding the mutex. The following time diagrams show how
      both cases happen.
      
      First scenario:
      
                 CPU 0                                    CPU 1
      
       btrfs_ioctl()
        btrfs_ioctl_quota_ctl()
         btrfs_quota_enable()
          mutex_lock(fs_info->qgroup_ioctl_lock)
          btrfs_start_transaction()
      
                                                   btrfs_ioctl()
                                                    btrfs_ioctl_snap_create_v2
                                                     create_snapshot()
                                                      --> adds snapshot to the
                                                          list pending_snapshots
                                                          of the current
                                                          transaction
      
          btrfs_commit_transaction()
           create_pending_snapshots()
             create_pending_snapshot()
              qgroup_account_snapshot()
               btrfs_qgroup_inherit()
                 mutex_lock(fs_info->qgroup_ioctl_lock)
                  --> deadlock, mutex already locked
                      by this task at
                      btrfs_quota_enable()
      
      Second scenario:
      
                 CPU 0                                    CPU 1
      
       btrfs_ioctl()
        btrfs_ioctl_quota_ctl()
         btrfs_quota_enable()
          mutex_lock(fs_info->qgroup_ioctl_lock)
          btrfs_start_transaction()
      
                                                   btrfs_ioctl()
                                                    btrfs_ioctl_snap_create_v2
                                                     create_snapshot()
                                                      --> adds snapshot to the
                                                          list pending_snapshots
                                                          of the current
                                                          transaction
      
                                                      btrfs_commit_transaction()
                                                       --> waits for task at
                                                           CPU 0 to release
                                                           its transaction
                                                           handle
      
          btrfs_commit_transaction()
           --> sees another task started
               the transaction commit first
           --> releases its transaction
               handle
           --> waits for the transaction
               commit to be completed by
               the task at CPU 1
      
                                                       create_pending_snapshot()
                                                        qgroup_account_snapshot()
                                                         btrfs_qgroup_inherit()
                                                          mutex_lock(fs_info->qgroup_ioctl_lock)
                                                           --> deadlock, task at CPU 0
                                                               has the mutex locked but
                                                               it is waiting for us to
                                                               finish the transaction
                                                               commit
      
      So fix this by setting the quota enabled flag in fs_info after committing
      the transaction at btrfs_quota_enable(). This ends up serializing quota
      enable and snapshot creation as if the snapshot creation happened just
      before the quota enable request. The quota rescan task, scheduled after
      committing the transaction in btrfs_quota_enable(), will do the accounting.
      
      Fixes: 6426c7ad ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      79aa5c0d
    • F
      Btrfs: fix access to available allocation bits when starting balance · 829431a2
      Filipe Manana committed
      commit 5a8067c0d17feb7579db0476191417b441a8996e upstream.
      
      The available allocation bits members from struct btrfs_fs_info are
      protected by a sequence lock, and when starting balance we access them
      incorrectly in two different ways:
      
      1) In the read sequence lock loop at btrfs_balance() we use the values we
         read from fs_info->avail_*_alloc_bits and we can immediately do actions
         that have side effects and can not be undone (printing a message and
         jumping to a label). This is wrong because a retry might be needed, so
         our actions must not have side effects and must be repeatable as long
         as read_seqretry() returns a non-zero value. In other words, we were
         essentially ignoring the sequence lock;
      
      2) Right below the read sequence lock loop, we were reading the values
         from avail_metadata_alloc_bits and avail_data_alloc_bits without any
         protection from concurrent writers, that is, reading them outside of
         the read sequence lock critical section.
      
      So fix this by making sure we only read the available allocation bits
      while in a read sequence lock critical section and that what we do in the
      critical section is repeatable (has nothing that can not be undone) so
      that any eventual retry that is needed is handled properly.
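
      The read pattern described above can be sketched in userspace with C11
      atomics. This is a simplified single-writer model under stated
      assumptions, not the kernel seqlock and not the btrfs code: all
      sketch_* names are hypothetical, and the protected fields are made
      atomic so that the example stays free of data races.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: a sequence counter guarding two bit masks. */
struct sketch_seqlock {
    atomic_uint seq;            /* odd while a writer is active */
    atomic_ulong data_bits;
    atomic_ulong metadata_bits;
};

struct sketch_bits {
    unsigned long data;
    unsigned long metadata;
};

static unsigned sketch_read_begin(struct sketch_seqlock *l)
{
    unsigned start;

    do {
        start = atomic_load(&l->seq);
    } while (start & 1u);       /* wait until no writer is active */
    return start;
}

static bool sketch_read_retry(struct sketch_seqlock *l, unsigned start)
{
    return atomic_load(&l->seq) != start;
}

/* Single writer assumed; the counter is odd during the update. */
void sketch_write_bits(struct sketch_seqlock *l,
                       unsigned long data, unsigned long metadata)
{
    atomic_fetch_add(&l->seq, 1u);
    atomic_store(&l->data_bits, data);
    atomic_store(&l->metadata_bits, metadata);
    atomic_fetch_add(&l->seq, 1u);
}

/*
 * The loop body only copies values, so repeating it is harmless.
 * Anything with side effects (printing, jumping to an error label)
 * belongs after the loop and works on the returned snapshot.
 */
struct sketch_bits sketch_read_bits(struct sketch_seqlock *l)
{
    struct sketch_bits snap;
    unsigned start;

    do {
        start = sketch_read_begin(l);
        snap.data = atomic_load(&l->data_bits);
        snap.metadata = atomic_load(&l->metadata_bits);
    } while (sketch_read_retry(l, start));
    return snap;
}

/* Usage: one update followed by one consistent read. */
void sketch_seqlock_demo(void)
{
    static struct sketch_seqlock lock; /* static storage, zero-initialised */
    struct sketch_bits snap;

    sketch_write_bits(&lock, 0x1ul, 0x10ul);
    snap = sketch_read_bits(&lock);
    assert(snap.data == 0x1ul && snap.metadata == 0x10ul);
}
```

      Callers then act only on the snapshot, never on the shared fields
      directly, which addresses both of the access problems listed in the
      commit message.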
      
      Fixes: de98ced9 ("Btrfs: use seqlock to protect fs_info->avail_{data, metadata, system}_alloc_bits")
      Fixes: 14506127 ("btrfs: fix a bogus warning when converting only data or metadata")
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      829431a2