1. 03 4月, 2019 21 次提交
    • E
      tcp: do not use ipv6 header for ipv4 flow · 7115df61
      Eric Dumazet 提交于
      [ Upstream commit 89e4130939a20304f4059ab72179da81f5347528 ]
      
      When a dual stack tcp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or tcp_v6_iif() helper.
      
      Fixes: 1397ed35 ("ipv6: add flowinfo for tcp6 pkt_options for all cases")
      Fixes: df3687ff ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7115df61
    • X
      sctp: use memdup_user instead of vmemdup_user · cab576f1
      Xin Long 提交于
      [ Upstream commit ef82bcfa671b9a635bab5fa669005663d8b177c5 ]
      
      In sctp_setsockopt_bindx()/__sctp_setsockopt_connectx(), it allocates
      memory with addrs_size which is passed from userspace. We used flag
      GFP_USER to put some more restrictions on it in Commit cacc0621
      ("sctp: use GFP_USER for user-controlled kmalloc").
      
      However, since Commit c981f254 ("sctp: use vmemdup_user() rather
      than badly open-coding memdup_user()"), vmemdup_user() has been used,
      which doesn't check GFP_USER flag when goes to vmalloc_*(). So when
      addrs_size is a huge value, it could exhaust memory and even trigger
      oom killer.
      
      This patch is to use memdup_user() instead, in which GFP_USER would
      work to limit the memory allocation with a huge addrs_size.
      
      Note we can't fix it by limiting 'addrs_size', as there's no demand
      for it from RFC.
      
      Reported-by: syzbot+ec1b7575afef85a0e5ca@syzkaller.appspotmail.com
      Fixes: c981f254 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()")
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cab576f1
    • X
      sctp: get sctphdr by offset in sctp_compute_cksum · 97265479
      Xin Long 提交于
      [ Upstream commit 273160ffc6b993c7c91627f5a84799c66dfe4dee ]
      
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be right value for sctphdr. It would cause to fail to check the
      checksum for sctp packets.
      
      So fix it by using offset, which is always right in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: NLi Shuang <shuali@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      97265479
    • H
      rhashtable: Still do rehash when we get EEXIST · cf86f7a9
      Herbert Xu 提交于
      [ Upstream commit 408f13ef358aa5ad56dc6230c2c7deb92cf462b1 ]
      
      As it stands if a shrink is delayed because of an outstanding
      rehash, we will go into a rescheduling loop without ever doing
      the rehash.
      
      This patch fixes this by still carrying out the rehash and then
      rescheduling so that we can shrink after the completion of the
      rehash should it still be necessary.
      
      The return value of EEXIST captures this case and other cases
      (e.g., another thread expanded/rehashed the table at the same
      time) where we should still proceed with the rehash.
      
      Fixes: da20420f ("rhashtable: Add nested tables")
      Reported-by: NJosh Elsasser <jelsasser@appneta.com>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: NJosh Elsasser <jelsasser@appneta.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf86f7a9
    • M
      packets: Always register packet sk in the same order · 69cea7cf
      Maxime Chevallier 提交于
      [ Upstream commit a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ]
      
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: NMaxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: NWillem de Bruijn <willemb@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      69cea7cf
    • Y
      net-sysfs: call dev_hold if kobject_init_and_add success · d9d215be
      YueHaibing 提交于
      [ Upstream commit a3e23f719f5c4a38ffb3d30c8d7632a4ed8ccd9e ]
      
      In netdev_queue_add_kobject and rx_queue_add_kobject,
      if sysfs_create_group failed, kobject_put will call
      netdev_queue_release to decrease dev refcont, however
      dev_hold has not be called. So we will see this while
      unregistering dev:
      
      unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Fixes: d0d66837 ("net: don't decrement kobj reference count on init failure")
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9d215be
    • A
      net: stmmac: fix memory corruption with large MTUs · 8dcf078d
      Aaro Koskinen 提交于
      [ Upstream commit 223a960c01227e4dbcb6f9fa06b47d73bda21274 ]
      
      When using 16K DMA buffers and ring mode, the DES3 refill is not working
      correctly as the function is using a bogus pointer for checking the
      private data. As a result stale pointers will remain in the RX descriptor
      ring, so DMA will now likely overwrite/corrupt some already freed memory.
      
      As simple reproducer, just receive some UDP traffic:
      
      	# ifconfig eth0 down; ifconfig eth0 mtu 9000; ifconfig eth0 up
      	# iperf3 -c 192.168.253.40 -u -b 0 -R
      
      If you didn't crash by now check the RX descriptors to find non-contiguous
      RX buffers:
      
      	cat /sys/kernel/debug/stmmaceth/eth0/descriptors_status
      	[...]
      	1 [0x2be5020]: 0xa3220321 0x9ffc1ffc 0x72d70082 0x130e207e
      					     ^^^^^^^^^^^^^^^^^^^^^
      	2 [0x2be5040]: 0xa3220321 0x9ffc1ffc 0x72998082 0x1311a07e
      					     ^^^^^^^^^^^^^^^^^^^^^
      
      A simple ping test will now report bad data:
      
      	# ping -s 8200 192.168.253.40
      	PING 192.168.253.40 (192.168.253.40) 8200(8228) bytes of data.
      	8208 bytes from 192.168.253.40: icmp_seq=1 ttl=64 time=1.00 ms
      	wrong data byte #8144 should be 0xd0 but was 0x88
      
      Fix the wrong pointer. Also we must refill DES3 only if the DMA buffer
      size is 16K.
      
      Fixes: 54139cf3 ("net: stmmac: adding multiple buffers for rx")
      Signed-off-by: NAaro Koskinen <aaro.koskinen@nokia.com>
      Acked-by: NJose Abreu <joabreu@synopsys.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dcf078d
    • E
      net: rose: fix a possible stack overflow · 7eeb12ed
      Eric Dumazet 提交于
      [ Upstream commit e5dcc0c3223c45c94100f05f28d8ef814db3d82c ]
      
      rose_write_internal() uses a temp buffer of 100 bytes, but a manual
      inspection showed that given arbitrary input, rose_create_facilities()
      can fill up to 110 bytes.
      
      Lets use a tailroom of 256 bytes for peace of mind, and remove
      the bounce buffer : we can simply allocate a big enough skb
      and adjust its length as needed.
      
      syzbot report :
      
      BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
      Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854
      
      CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       check_memory_region_inline mm/kasan/generic.c:185 [inline]
       check_memory_region+0x123/0x190 mm/kasan/generic.c:191
       memcpy+0x38/0x50 mm/kasan/common.c:131
       memcpy include/linux/string.h:352 [inline]
       rose_create_facilities net/rose/rose_subr.c:521 [inline]
       rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
       rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826
       __sys_connect+0x266/0x330 net/socket.c:1685
       __do_sys_connect net/socket.c:1696 [inline]
       __se_sys_connect net/socket.c:1693 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1693
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458079
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079
      RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4
      R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff
      
      The buggy address belongs to the page:
      page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x1fffc0000000000()
      raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000
      raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03
      >ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3
                                                                   ^
       ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7eeb12ed
    • J
      net: phy: meson-gxl: fix interrupt support · a6f0168e
      Jerome Brunet 提交于
      [ Upstream commit daa5c4d0167a308306525fd5ab9a5e18e21f4f74 ]
      
      If an interrupt is already pending when the interrupt is enabled on the
      GXL phy, no IRQ will ever be triggered.
      
      The fix is simply to make sure pending IRQs are cleared before setting
      up the irq mask.
      
      Fixes: cf127ff2 ("net: phy: meson-gxl: add interrupt support")
      Signed-off-by: NJerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a6f0168e
    • C
      net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec · 85ef72d8
      Christoph Paasch 提交于
      [ Upstream commit 398f0132c14754fcd03c1c4f8e7176d001ce8ea1 ]
      
      Since commit fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
      one can now allocate packet ring buffers >= UINT_MAX. However, syzkaller
      found that that triggers a warning:
      
      [   21.100000] WARNING: CPU: 2 PID: 2075 at mm/page_alloc.c:4584 __alloc_pages_nod0
      [   21.101490] Modules linked in:
      [   21.101921] CPU: 2 PID: 2075 Comm: syz-executor.0 Not tainted 5.0.0 #146
      [   21.102784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   21.103887] RIP: 0010:__alloc_pages_nodemask+0x2a0/0x630
      [   21.104640] Code: fe ff ff 65 48 8b 04 25 c0 de 01 00 48 05 90 0f 00 00 41 bd 01 00 00 00 48 89 44 24 48 e9 9c fe 3
      [   21.107121] RSP: 0018:ffff88805e1cf920 EFLAGS: 00010246
      [   21.107819] RAX: 0000000000000000 RBX: ffffffff85a488a0 RCX: 0000000000000000
      [   21.108753] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
      [   21.109699] RBP: 1ffff1100bc39f28 R08: ffffed100bcefb67 R09: ffffed100bcefb67
      [   21.110646] R10: 0000000000000001 R11: ffffed100bcefb66 R12: 000000000000000d
      [   21.111623] R13: 0000000000000000 R14: ffff88805e77d888 R15: 000000000000000d
      [   21.112552] FS:  00007f7c7de05700(0000) GS:ffff88806d100000(0000) knlGS:0000000000000000
      [   21.113612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   21.114405] CR2: 000000000065c000 CR3: 000000005e58e006 CR4: 00000000001606e0
      [   21.115367] Call Trace:
      [   21.115705]  ? __alloc_pages_slowpath+0x21c0/0x21c0
      [   21.116362]  alloc_pages_current+0xac/0x1e0
      [   21.116923]  kmalloc_order+0x18/0x70
      [   21.117393]  kmalloc_order_trace+0x18/0x110
      [   21.117949]  packet_set_ring+0x9d5/0x1770
      [   21.118524]  ? packet_rcv_spkt+0x440/0x440
      [   21.119094]  ? lock_downgrade+0x620/0x620
      [   21.119646]  ? __might_fault+0x177/0x1b0
      [   21.120177]  packet_setsockopt+0x981/0x2940
      [   21.120753]  ? __fget+0x2fb/0x4b0
      [   21.121209]  ? packet_release+0xab0/0xab0
      [   21.121740]  ? sock_has_perm+0x1cd/0x260
      [   21.122297]  ? selinux_secmark_relabel_packet+0xd0/0xd0
      [   21.123013]  ? __fget+0x324/0x4b0
      [   21.123451]  ? selinux_netlbl_socket_setsockopt+0x101/0x320
      [   21.124186]  ? selinux_netlbl_sock_rcv_skb+0x3a0/0x3a0
      [   21.124908]  ? __lock_acquire+0x529/0x3200
      [   21.125453]  ? selinux_socket_setsockopt+0x5d/0x70
      [   21.126075]  ? __sys_setsockopt+0x131/0x210
      [   21.126533]  ? packet_release+0xab0/0xab0
      [   21.127004]  __sys_setsockopt+0x131/0x210
      [   21.127449]  ? kernel_accept+0x2f0/0x2f0
      [   21.127911]  ? ret_from_fork+0x8/0x50
      [   21.128313]  ? do_raw_spin_lock+0x11b/0x280
      [   21.128800]  __x64_sys_setsockopt+0xba/0x150
      [   21.129271]  ? lockdep_hardirqs_on+0x37f/0x560
      [   21.129769]  do_syscall_64+0x9f/0x450
      [   21.130182]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      We should allocate with __GFP_NOWARN to handle this.
      
      Cc: Kal Conley <kal.conley@dectris.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Fixes: fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
      Signed-off-by: NChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85ef72d8
    • P
      net: datagram: fix unbounded loop in __skb_try_recv_datagram() · 88c64f9c
      Paolo Abeni 提交于
      [ Upstream commit 0b91bce1ebfc797ff3de60c8f4a1e6219a8a3187 ]
      
      Christoph reported a stall while peeking datagram with an offset when
      busy polling is enabled. __skb_try_recv_datagram() uses as the loop
      termination condition 'queue empty'. When peeking, the socket
      queue can be not empty, even when no additional packets are received.
      
      Address the issue explicitly checking for receive queue changes,
      as currently done by __skb_wait_for_more_packets().
      
      Fixes: 2b5cd0df ("net: Change return type of sk_busy_loop from bool to void")
      Reported-and-tested-by: NChristoph Paasch <cpaasch@apple.com>
      Signed-off-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      88c64f9c
    • D
      net: aquantia: fix rx checksum offload for UDP/TCP over IPv6 · e4ff39e1
      Dmitry Bogdanov 提交于
      [ Upstream commit a7faaa0c5dc7d091cc9f72b870d7edcdd6f43f12 ]
      
      TCP/UDP checksum validity was propagated to skb
      only if IP checksum is valid.
      But for IPv6 there is no validity as there is no checksum in IPv6.
      This patch propagates TCP/UDP checksum validity regardless of IP checksum.
      
      Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
      Signed-off-by: NIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: NNikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: NDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4ff39e1
    • B
      mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S · c4084262
      Bjorn Helgaas 提交于
      [ Upstream commit fae846e2b7124d4b076ef17791c73addf3b26350 ]
      
      The device ID alone does not uniquely identify a device.  Test both the
      vendor and device ID to make sure we don't mistakenly think some other
      vendor's 0xB410 device is a Digium HFC4S.  Also, instead of the bare hex
      ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
      ID table.
      
      No functional change intended.
      Signed-off-by: NBjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4084262
    • F
      mac8390: Fix mmio access size probe · e0f8c06f
      Finn Thain 提交于
      [ Upstream commit bb9e5c5bcd76f4474eac3baf643d7a39f7bac7bb ]
      
      The bug that Stan reported is as follows. After a restart, a 16-bit NIC
      may be incorrectly identified as a 32-bit NIC and stop working.
      
      mac8390 slot.E: Memory length resource not found, probing
      mac8390 slot.E: Farallon EtherMac II-C (type farallon)
      mac8390 slot.E: MAC 00:00:c5:30:c2:99, IRQ 61, 32 KB shared memory at 0xfeed0000, 32-bit access.
      
      The bug never arises after a cold start and only intermittently after a
      warm start. (I didn't investigate why the bug is intermittent.)
      
      It turns out that memcpy_toio() is deprecated and memcmp_withio() also
      has issues. Replacing these calls with mmio accessors fixes the problem.
      Reported-and-tested-by: NStan Johnson <userm57@yahoo.com>
      Fixes: 2964db0f ("m68k: Mac DP8390 update")
      Signed-off-by: NFinn Thain <fthain@telegraphics.com.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0f8c06f
    • X
      ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL · be092113
      Xin Long 提交于
      [ Upstream commit 1c87e79a002f6a159396138cd3f3ab554a2a8887 ]
      
      Jianlin reported a crash:
      
        [  381.484332] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
        [  381.619802] RIP: 0010:fib6_rule_lookup+0xa3/0x160
        [  382.009615] Call Trace:
        [  382.020762]  <IRQ>
        [  382.030174]  ip6_route_redirect.isra.52+0xc9/0xf0
        [  382.050984]  ip6_redirect+0xb6/0xf0
        [  382.066731]  icmpv6_notify+0xca/0x190
        [  382.083185]  ndisc_redirect_rcv+0x10f/0x160
        [  382.102569]  ndisc_rcv+0xfb/0x100
        [  382.117725]  icmpv6_rcv+0x3f2/0x520
        [  382.133637]  ip6_input_finish+0xbf/0x460
        [  382.151634]  ip6_input+0x3b/0xb0
        [  382.166097]  ipv6_rcv+0x378/0x4e0
      
      It was caused by the lookup function __ip6_route_redirect() returns NULL in
      fib6_rule_lookup() when ip6_create_rt_rcu() returns NULL.
      
      So we fix it by simply making ip6_create_rt_rcu() return ip6_null_entry
      instead of NULL.
      
      v1->v2:
        - move down 'fallback:' to make it more readable.
      
      Fixes: e873e4b9 ("ipv6: use fib6_info_hold_safe() when necessary")
      Reported-by: NJianlin Shi <jishi@redhat.com>
      Suggested-by: NPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Reviewed-by: NDavid Ahern <dsahern@gmail.com>
      Acked-by: NWei Wang <weiwan@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be092113
    • M
      gtp: change NET_UDP_TUNNEL dependency to select · 53adaacb
      Matteo Croce 提交于
      [ Upstream commit c22da36688d6298f2e546dcc43fdc1ad35036467 ]
      
      Similarly to commit a7603ac1fc8c ("geneve: change NET_UDP_TUNNEL
      dependency to select"), GTP has a dependency on NET_UDP_TUNNEL which
      makes impossible to compile it if no other protocol depending on
      NET_UDP_TUNNEL is selected.
      
      Fix this by changing the depends to a select, and drop NET_IP_TUNNEL from
      the select list, as it already depends on NET_UDP_TUNNEL.
      Signed-off-by: NMatteo Croce <mcroce@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      53adaacb
    • Y
      genetlink: Fix a memory leak on error path · 9b8ef421
      YueHaibing 提交于
      [ Upstream commit ceabee6c59943bdd5e1da1a6a20dc7ee5f8113a2 ]
      
      In genl_register_family(), when idr_alloc() fails,
      we forget to free the memory we possibly allocate for
      family->attrbuf.
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Fixes: 2ae0f17d ("genetlink: use idr to track families")
      Signed-off-by: NYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: NKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b8ef421
    • E
      dccp: do not use ipv6 header for ipv4 flow · 321461f2
      Eric Dumazet 提交于
      [ Upstream commit e0aa67709f89d08c8d8e5bdd9e0b649df61d0090 ]
      
      When a dual stack dccp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or
      inet6_iif() helper.
      
      Fixes: 3df80d93 ("[DCCP]: Introduce DCCPv6")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      321461f2
    • C
      ipmi_si: Fix crash when using hard-coded device · 6bba17f6
      Corey Minyard 提交于
      Backport from 41b766d661bf94a364960862cfc248a78313dbd3
      
      When excuting a command like:
        modprobe ipmi_si ports=0xffc0e3 type=bt
      The system would get an oops.
      
      The trouble here is that ipmi_si_hardcode_find_bmc() is called before
      ipmi_si_platform_init(), but initialization of the hard-coded device
      creates an IPMI platform device, which won't be initialized yet.
      
      The real trouble is that hard-coded devices aren't created with
      any device, and the fixup is done later.  So do it right, create the
      hard-coded devices as normal platform devices.
      
      This required adding some new resource types to the IPMI platform
      code for passing information required by the hard-coded device
      and adding some code to remove the hard-coded platform devices
      on module removal.
      
      To enforce the "hard-coded devices passed by the user take priority
      over firmware devices" rule, some special code was added to check
      and see if a hard-coded device already exists.
      
      The backport required some minor fixups and adding the device
      id table that had been added in another change and was used
      in this one.
      Reported-by: NYang Yingliang <yangyingliang@huawei.com>
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: NCorey Minyard <cminyard@mvista.com>
      Tested-by: NYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bba17f6
    • M
      Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer · 15d6538a
      Marcel Holtmann 提交于
      commit 7c9cbd0b5e38a1672fcd137894ace3b042dfbf69 upstream.
      
      The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len
      as length value. The opt->len however is in control over the remote user
      and can be used by an attacker to gain access beyond the bounds of the
      actual packet.
      
      To prevent any potential leak of heap memory, it is enough to check that
      the resulting len calculation after calling l2cap_get_conf_opt is not
      below zero. A well formed packet will always return >= 0 here and will
      end with the length value being zero after the last option has been
      parsed. In case of malformed packets messing with the opt->len field the
      length value will become negative. If that is the case, then just abort
      and ignore the option.
      
      In case an attacker uses a too short opt->len value, then garbage will
      be parsed, but that is protected by the unknown option handling and also
      the option parameter size checks.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      15d6538a
    • M
      Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt · 2318c0e4
      Marcel Holtmann 提交于
      commit af3d5d1c87664a4f150fcf3534c6567cb19909b0 upstream.
      
      When doing option parsing for standard type values of 1, 2 or 4 octets,
      the value is converted directly into a variable instead of a pointer. To
      avoid being tricked into being a pointer, check that for these option
      types that sizes actually match. In L2CAP every option is fixed size and
      thus it is prudent anyway to ensure that the remote side sends us the
      right option size along with option paramters.
      
      If the option size is not matching the option type, then that option is
      silently ignored. It is a protocol violation and instead of trying to
      give the remote attacker any further hints just pretend that option is
      not present and proceed with the default values. Implementation
      following the specification and its qualification procedures will always
      use the correct size and thus not being impacted here.
      
      To keep the code readable and consistent accross all options, a few
      cosmetic changes were also required.
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Reviewed-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2318c0e4
  2. 27 3月, 2019 19 次提交
    • G
      Linux 4.19.32 · 3a2156c8
      Greg Kroah-Hartman 提交于
      3a2156c8
    • B
      power: supply: charger-manager: Fix incorrect return value · 33bd347f
      Baolin Wang 提交于
      commit f25a646fbe2051527ad9721853e892d13a99199e upstream.
      
      Fix incorrect return value.
      Signed-off-by: NBaolin Wang <baolin.wang@linaro.org>
      Signed-off-by: NSebastian Reichel <sebastian.reichel@collabora.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      33bd347f
    • H
      ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec · 19184190
      Hui Wang 提交于
      commit b5a236c175b0d984552a5f7c9d35141024c2b261 upstream.
      
      Recently we found the audio jack detection stop working after suspend
      on many machines with Realtek codec. Sometimes the audio selection
      dialogue didn't show up after users plugged headhphone/headset into
      the headset jack, sometimes after uses plugged headphone/headset, then
      click the sound icon on the upper-right corner of gnome-desktop, it
      also showed the speaker rather than the headphone.
      
      The root cause is that before suspend, the codec already call the
      runtime_suspend since this codec is not used by any apps, then in
      resume, it will not call runtime_resume for this codec. But for some
      realtek codec (so far, alc236, alc255 and alc891) with the specific
      BIOS, if it doesn't run runtime_resume after suspend, all codec
      functions including jack detection stop working anymore.
      
      This problem existed for a long time, but it was not exposed, that is
      because when problem happens, if users play sound or open
      sound-setting to check audio device, this will trigger calling to
      runtime_resume (via snd_hda_power_up), then the codec starts working
      again before users notice this problem.
      
      Since we don't know how many codec and BIOS combinations have this
      problem, to fix it, let the driver call runtime_resume for all codecs
      in pm_resume, maybe for some codecs, this is not needed, but it is
      harmless. After a codec is runtime resumed, if it is not used by any
      apps, it will be runtime suspended soon and furthermore we don't run
      suspend frequently, this change will not add much power consumption.
      
      Fixes: cc72da7d ("ALSA: hda - Use standard runtime PM for codec power-save control")
      Signed-off-by: NHui Wang <hui.wang@canonical.com>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      19184190
    • T
      ALSA: hda - Record the current power state before suspend/resume calls · 156ba57f
      Takashi Iwai 提交于
      commit 98081ca62cbac31fb0f7efaf90b2e7384ce22257 upstream.
      
      Currently we deal with single codec and suspend codec callbacks for
      all S3, S4 and runtime PM handling.  But it turned out that we want
      distinguish the call patterns sometimes, e.g. for applying some init
      sequence only at probing and restoring from hibernate.
      
      This patch slightly modifies the common PM callbacks for HD-audio
      codec and stores the currently processed PM event in power_state of
      the codec's device.power field, which is currently unused.  The codec
      callback can take a look at this event value and judges which purpose
      it's being called.
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      156ba57f
    • W
      locking/lockdep: Add debug_locks check in __lock_downgrade() · 0e0f7b30
      Waiman Long 提交于
      commit 71492580 upstream.
      
      Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
      warning right after a previous lockdep warning. It is likely that the
      previous warning turned off lock debugging causing the lockdep to have
      inconsistency states leading to the lock downgrade warning.
      
      Fix that by add a check for debug_locks at the beginning of
      __lock_downgrade().
      Debugged-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: NTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
      Signed-off-by: NWaiman Long <longman@redhat.com>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e0f7b30
    • J
      x86/unwind: Add hardcoded ORC entry for NULL · 206a76a6
      Jann Horn 提交于
      commit ac5ceccce5501e43d217c596e4ee859f2a3fef79 upstream.
      
      When the ORC unwinder is invoked for an oops caused by IP==0,
      it currently has no idea what to do because there is no debug information
      for the stack frame of NULL.
      
      But if RIP is NULL, it is very likely that the last successfully executed
      instruction was an indirect CALL/JMP, and it is possible to unwind out in
      the same way as for the first instruction of a normal function. Hardcode
      a corresponding ORC entry.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       ? do_syscall_64+0x6d/0x160
       ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace looks like this:
      
      Call Trace:
       __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       do_syscall_64+0x6d/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      prctl_set_seccomp() still doesn't show up in the trace because for some
      reason, tail call optimization is only disabled in builds that use the
      frame pointer unwinder.
      Signed-off-by: NJann Horn <jannh@google.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-2-jannh@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      206a76a6
    • J
      x86/unwind: Handle NULL pointer calls better in frame unwinder · 367ccafb
      Jann Horn 提交于
      commit f4f34e1b82eb4219d8eaa1c7e2e17ca219a6a2b5 upstream.
      
      When the frame unwinder is invoked for an oops caused by a call to NULL, it
      currently skips the parent function because BP still points to the parent's
      stack frame; the (nonexistent) current function only has the first half of
      a stack frame, and BP doesn't point to it yet.
      
      Add a special case for IP==0 that calculates a fake BP from SP, then uses
      the real BP for the next frame.
      
      Note that this handles first_frame specially: Return information about the
      parent function as long as the saved IP is >=first_frame, even if the fake
      BP points below it.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace is:
      
      Call Trace:
       prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: NJann Horn <jannh@google.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-1-jannh@google.comSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      367ccafb
    • D
      loop: access lo_backing_file only when the loop device is Lo_bound · 3254dd30
      Dongli Zhang 提交于
      commit f7c8a4120eedf24c36090b7542b179ff7a649219 upstream.
      
      Commit 758a58d0bc67 ("loop: set GENHD_FL_NO_PART_SCAN after
      blkdev_reread_part()") separates "lo->lo_backing_file = NULL" and
      "lo->lo_state = Lo_unbound" into different critical regions protected by
      loop_ctl_mutex.
      
      However, there is below race that the NULL lo->lo_backing_file would be
      accessed when the backend of a loop is another loop device, e.g., loop0's
      backend is a file, while loop1's backend is loop0.
      
      loop0's backend is file            loop1's backend is loop0
      
      __loop_clr_fd()
        mutex_lock(&loop_ctl_mutex);
        lo->lo_backing_file = NULL; --> set to NULL
        mutex_unlock(&loop_ctl_mutex);
                                         loop_set_fd()
                                           mutex_lock_killable(&loop_ctl_mutex);
                                           loop_validate_file()
                                             f = l->lo_backing_file; --> NULL
                                               access if loop0 is not Lo_unbound
        mutex_lock(&loop_ctl_mutex);
        lo->lo_state = Lo_unbound;
        mutex_unlock(&loop_ctl_mutex);
      
      lo->lo_backing_file should be accessed only when the loop device is
      Lo_bound.
      
      In fact, the problem has been introduced already in commit 7ccd0791d985
      ("loop: Push loop_ctl_mutex down into loop_clr_fd()") after which
      loop_validate_file() could see devices in Lo_rundown state with which it
      did not count. It was harmless at that point but still.
      
      Fixes: 7ccd0791d985 ("loop: Push loop_ctl_mutex down into loop_clr_fd()")
      Reported-by: syzbot+9bdc1adc1c55e7fe765b@syzkaller.appspotmail.com
      Signed-off-by: NDongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Signed-off-by: NJens Axboe <axboe@kernel.dk>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3254dd30
    • F
      netfilter: ebtables: remove BUGPRINT messages · 35cdcdc5
      Florian Westphal 提交于
      commit d824548dae220820bdf69b2d1561b7c4b072783f upstream.
      
      They are however frequently triggered by syzkaller, so remove them.
      
      ebtables userspace should never trigger any of these, so there is little
      value in making them pr_debug (or ratelimited).
      Signed-off-by: NFlorian Westphal <fw@strlen.de>
      Signed-off-by: NPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35cdcdc5
    • C
      f2fs: fix to avoid deadlock of atomic file operations · 1fd916e8
      Chao Yu 提交于
      commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.
      
      Thread A				Thread B
      - __fput
       - f2fs_release_file
        - drop_inmem_pages
         - mutex_lock(&fi->inmem_lock)
         - __revoke_inmem_pages
          - lock_page(page)
      					- open
      					- f2fs_setattr
      					- truncate_setsize
      					 - truncate_inode_pages_range
      					  - lock_page(page)
      					  - truncate_cleanup_page
      					   - f2fs_invalidate_page
      					    - drop_inmem_page
      					    - mutex_lock(&fi->inmem_lock);
      
      We may encounter above ABBA deadlock as reported by Kyungtae Kim:
      
      I'm reporting a bug in linux-4.17.19: "INFO: task hung in
      drop_inmem_page" (no reproducer)
      
      I think this might be somehow related to the following:
      https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ
      
      =========================================
      INFO: task syz-executor7:10822 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D27024 10822   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
       __mutex_lock_common kernel/locking/mutex.c:833 [inline]
       __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
       mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
       drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
       f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
       do_invalidatepage mm/truncate.c:165 [inline]
       truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
       truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
       truncate_inode_pages mm/truncate.c:478 [inline]
       truncate_pagecache+0x6d/0x90 mm/truncate.c:801
       truncate_setsize+0x81/0xa0 mm/truncate.c:826
       f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
       notify_change+0xa62/0xe80 fs/attr.c:313
       do_truncate+0x12e/0x1e0 fs/open.c:63
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
      INFO: task syz-executor7:10858 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D28880 10858   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
       rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
       call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
       __down_write arch/x86/include/asm/rwsem.h:142 [inline]
       down_write+0x58/0xa0 kernel/locking/rwsem.c:72
       inode_lock include/linux/fs.h:713 [inline]
       do_truncate+0x120/0x1e0 fs/open.c:61
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
      INFO: task syz-executor5:10829 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor5   D28760 10829   6308 0x80000002
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       io_schedule+0x21/0x80 kernel/sched/core.c:5179
       wait_on_page_bit_common mm/filemap.c:1100 [inline]
       __lock_page+0x2b5/0x390 mm/filemap.c:1273
       lock_page include/linux/pagemap.h:483 [inline]
       __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
       drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
       f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
       __fput+0x2c7/0x780 fs/file_table.c:209
       ____fput+0x1a/0x20 fs/file_table.c:243
       task_work_run+0x151/0x1d0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x8ba/0x30a0 kernel/exit.c:865
       do_group_exit+0x13b/0x3a0 kernel/exit.c:968
       get_signal+0x6bb/0x1650 kernel/signal.c:2482
       do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
       exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
       do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
      RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
      RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700
      
      This patch tries to use trylock_page to mitigate such deadlock condition
      for fix.
      Signed-off-by: NChao Yu <yuchao0@huawei.com>
      Signed-off-by: NJaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fd916e8
    • M
      RDMA/cma: Rollback source IP address if failing to acquire device · 9dd5053c
      Myungho Jung 提交于
      commit 5fc01fb846bce8fa6d5f95e2625b8ce0f8e86810 upstream.
      
      If cma_acquire_dev_by_src_ip() returns error in addr_handler(), the
      device state changes back to RDMA_CM_ADDR_BOUND but the resolved source
      IP address is still left. After that, if rdma_destroy_id() is called
      after rdma_listen(), the device is freed without removed from
      listen_any_list in cma_cancel_operation(). Revert to the previous IP
      address if acquiring device fails.
      
      Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com
      Signed-off-by: NMyungho Jung <mhjungk@gmail.com>
      Reviewed-by: NParav Pandit <parav@mellanox.com>
      Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9dd5053c
    • C
      drm: Reorder set_property_atomic to avoid returning with an active ww_ctx · 015b828b
      Chris Wilson 提交于
      commit 227ad6d957898a88b1746e30234ece64d305f066 upstream.
      
      Delay the drm_modeset_acquire_init() until after we check for an
      allocation failure so that we can return immediately upon error without
      having to unwind.
      
      WARNING: lock held when returning to user space!
      4.20.0+ #174 Not tainted
      ------------------------------------------------
      syz-executor556/8153 is leaving the kernel with locks still held!
      1 lock held by syz-executor556/8153:
        #0: 000000005100c85c (crtc_ww_class_acquire){+.+.}, at:
      set_property_atomic+0xb3/0x330 drivers/gpu/drm/drm_mode_object.c:462
      
      Reported-by: syzbot+6ea337c427f5083ebdf2@syzkaller.appspotmail.com
      Fixes: 144a7999 ("drm: Handle properties in the core for atomic drivers")
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: David Airlie <airlied@linux.ie>
      Cc: <stable@vger.kernel.org> # v4.14+
      Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181230122842.21917-1-chris@chris-wilson.co.ukSigned-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      015b828b
    • K
      Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto() · e365b940
      Kefeng Wang 提交于
      commit 56897b217a1d0a91c9920cb418d6b3fe922f590a upstream.
      
      task A:                                task B:
      hci_uart_set_proto                     flush_to_ldisc
       - p->open(hu) -> h5_open  //alloc h5  - receive_buf
       - set_bit HCI_UART_PROTO_READY         - tty_port_default_receive_buf
       - hci_uart_register_dev                 - tty_ldisc_receive_buf
                                                - hci_uart_tty_receive
      				           - test_bit HCI_UART_PROTO_READY
      				            - h5_recv
       - clear_bit HCI_UART_PROTO_READY             while() {
       - p->open(hu) -> h5_close //free h5
      				              - h5_rx_3wire_hdr
      				               - h5_reset()  //use-after-free
                                                    }
      
      It could use ioctl to set hci uart proto, but there is
      a use-after-free issue when hci_uart_register_dev() fail in
      hci_uart_set_proto(), see stack above, fix this by setting
      HCI_UART_PROTO_READY bit only when hci_uart_register_dev()
      return success.
      
      Reported-by: syzbot+899a33dc0fa0dbaf06a6@syzkaller.appspotmail.com
      Signed-off-by: NKefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: NJeremy Cline <jcline@redhat.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e365b940
    • J
      Bluetooth: hci_ldisc: Initialize hci_dev before open() · f67202f7
      Jeremy Cline 提交于
      commit 32a7b4cbe93b0a0ef7e63d31ca69ce54736c4412 upstream.
      
      The hci_dev struct hdev is referenced in work queues and timers started
      by open() in some protocols. This creates a race between the
      initialization function and the work or timer which can result hdev
      being dereferenced while it is still null.
      
      The syzbot report contains a reliable reproducer which causes a null
      pointer dereference of hdev in hci_uart_write_work() by making the
      memory allocation for hdev fail.
      
      To fix this, ensure hdev is valid from before calling a protocol's
      open() until after calling a protocol's close().
      
      Reported-by: syzbot+257790c15bcdef6fe00c@syzkaller.appspotmail.com
      Signed-off-by: NJeremy Cline <jcline@redhat.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f67202f7
    • M
      Bluetooth: Fix decrementing reference count twice in releasing socket · 4b390513
      Myungho Jung 提交于
      commit e20a2e9c42c9e4002d9e338d74e7819e88d77162 upstream.
      
      When releasing socket, it is possible to enter hci_sock_release() and
      hci_sock_dev_event(HCI_DEV_UNREG) at the same time in different thread.
      The reference count of hdev should be decremented only once from one of
      them but if storing hdev to local variable in hci_sock_release() before
      detached from socket and setting to NULL in hci_sock_dev_event(),
      hci_dev_put(hdev) is unexpectedly called twice. This is resolved by
      referencing hdev from socket after bt_sock_unlink() in
      hci_sock_release().
      
      Reported-by: syzbot+fdc00003f4efff43bc5b@syzkaller.appspotmail.com
      Signed-off-by: NMyungho Jung <mhjungk@gmail.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4b390513
    • M
      Bluetooth: hci_uart: Check if socket buffer is ERR_PTR in h4_recv_buf() · 4e0ca4bf
      Myungho Jung 提交于
      commit 1dc2d785156cbdc80806c32e8d2c7c735d0b4721 upstream.
      
      h4_recv_buf() callers store the return value to socket buffer and
      recursively pass the buffer to h4_recv_buf() without protection. So,
      ERR_PTR returned from h4_recv_buf() can be dereferenced, if called again
      before setting the socket buffer to NULL from previous error. Check if
      skb is ERR_PTR in h4_recv_buf().
      
      Reported-by: syzbot+017a32f149406df32703@syzkaller.appspotmail.com
      Signed-off-by: NMyungho Jung <mhjungk@gmail.com>
      Signed-off-by: NMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4e0ca4bf
    • H
      media: v4l2-ctrls.c/uvc: zero v4l2_event · 6bef442e
      Hans Verkuil 提交于
      commit f45f3f753b0a3d739acda8e311b4f744d82dc52a upstream.
      
      Control events can leak kernel memory since they do not fully zero the
      event. The same code is present in both v4l2-ctrls.c and uvc_ctrl.c, so
      fix both.
      
      It appears that all other event code is properly zeroing the structure,
      it's these two places.
      Signed-off-by: NHans Verkuil <hverkuil-cisco@xs4all.nl>
      Reported-by: syzbot+4f021cf3697781dbd9fb@syzkaller.appspotmail.com
      Reviewed-by: NLaurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6bef442e
    • Z
      ext4: brelse all indirect buffer in ext4_ind_remove_space() · d12d8641
      zhangyi (F) 提交于
      commit 674a2b27234d1b7afcb0a9162e81b2e53aeef217 upstream.
      
      All indirect buffers get by ext4_find_shared() should be released no
      mater the branch should be freed or not. But now, we forget to release
      the lower depth indirect buffers when removing space from the same
      higher depth indirect block. It will lead to buffer leak and futher
      more, it may lead to quota information corruption when using old quota,
      consider the following case.
      
       - Create and mount an empty ext4 filesystem without extent and quota
         features,
       - quotacheck and enable the user & group quota,
       - Create some files and write some data to them, and then punch hole
         to some files of them, it may trigger the buffer leak problem
         mentioned above.
       - Disable quota and run quotacheck again, it will create two new
         aquota files and write the checked quota information to them, which
         probably may reuse the freed indirect block(the buffer and page
         cache was not freed) as data block.
       - Enable quota again, it will invoke
         vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
         buffers and pagecache. Unfortunately, because of the buffer of quota
         data block is still referenced, quota code cannot read the up to date
         quota info from the device and lead to quota information corruption.
      
      This problem can be reproduced by xfstests generic/231 on ext3 file
      system or ext4 file system without extent and quota features.
      
      This patch fix this problem by releasing the missing indirect buffers,
      in ext4_ind_remove_space().
      Reported-by: NHulk Robot <hulkci@huawei.com>
      Signed-off-by: Nzhangyi (F) <yi.zhang@huawei.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: NJan Kara <jack@suse.cz>
      Cc: stable@kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d12d8641
    • L
      ext4: fix data corruption caused by unaligned direct AIO · 76c9ee6b
      Lukas Czerner 提交于
      commit 372a03e01853f860560eade508794dd274e9b390 upstream.
      
      Ext4 needs to serialize unaligned direct AIO because the zeroing of
      partial blocks of two competing unaligned AIOs can result in data
      corruption.
      
      However it decides not to serialize if the potentially unaligned aio is
      past i_size with the rationale that no pending writes are possible past
      i_size. Unfortunately if the i_size is not block aligned and the second
      unaligned write lands past i_size, but still into the same block, it has
      the potential of corrupting the previous unaligned write to the same
      block.
      
      This is (very simplified) reproducer from Frank
      
          // 41472 = (10 * 4096) + 512
          // 37376 = 41472 - 4096
      
          ftruncate(fd, 41472);
          io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
          io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
      
          io_submit(io_ctx, 1, &iocbs[1]);
          io_submit(io_ctx, 1, &iocbs[2]);
      
          io_getevents(io_ctx, 2, 2, events, NULL);
      
      Without this patch the 512B range from 40960 up to the start of the
      second unaligned write (41472) is going to be zeroed overwriting the data
      written by the first write. This is a data corruption.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      
      With this patch the data corruption is avoided because we will recognize
      the unaligned_aio and wait for the unwritten extent conversion.
      
      00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
      *
      00009200  30 30 30 30 30 30 30 30  30 30 30 30 30 30 30 30
      *
      0000a200  31 31 31 31 31 31 31 31  31 31 31 31 31 31 31 31
      *
      0000b200
      Reported-by: NFrank Sorenson <fsorenso@redhat.com>
      Signed-off-by: NLukas Czerner <lczerner@redhat.com>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      Fixes: e9e3bcec ("ext4: serialize unaligned asynchronous DIO")
      Cc: stable@vger.kernel.org
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76c9ee6b