1. 03 Apr, 2019 26 commits
    • vrf: prevent adding upper devices · 3b1386be
      Sabrina Dubroca committed
      [ Upstream commit 1017e0987117c32783ba7c10fe2e7ff1456ba1dc ]
      
      VRF devices don't work with upper devices. Currently, it's possible to
      add a VRF device to a bridge or team, and to create macvlan, macsec, or
      ipvlan devices on top of a VRF (bond and vlan are prevented respectively
      by the lack of an ndo_set_mac_address op and the NETIF_F_VLAN_CHALLENGED
      feature flag).
      
      Fix this by setting the IFF_NO_RX_HANDLER flag (introduced in commit
      f5426250 ("net: introduce IFF_NO_RX_HANDLER")).
      
      Cc: David Ahern <dsahern@gmail.com>
      Fixes: 193125db ("net: Introduce VRF device driver")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Acked-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tun: properly test for IFF_UP · 8ea78da1
      Eric Dumazet committed
      [ Upstream commit 4477138fa0ae4e1b699786ef0600863ea6e6c61c ]
      
      Same reasons as the ones explained in commit 4179cb5a4c92
      ("vxlan: test dev->flags & IFF_UP before calling netif_rx()")
      
      netif_rx_ni() or napi_gro_frags() must be called under a strict contract.
      
      At device dismantle phase, core networking clears IFF_UP
      and flush_all_backlogs() is called after rcu grace period
      to make sure no incoming packet might be in a cpu backlog
      and still referencing the device.
      
      A similar protocol is used for gro layer.
      
      Most drivers call netif_rx() from their interrupt handler,
      and since the interrupts are disabled at device dismantle,
      netif_rx() does not have to check dev->flags & IFF_UP.
      
      Virtual drivers do not have this guarantee, and must
      therefore make the check themselves.
      
      Fixes: 1bd4978a ("tun: honor IFF_UP in tun_get_user()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: fix cancellation of topology subscriptions · 52a7505c
      Erik Hugne committed
      [ Upstream commit 33872d79f5d1cbedaaab79669cc38f16097a9450 ]
      
      When cancelling a subscription, we have to clear the cancel bit in the
      request before iterating over any established subscriptions with memcmp.
      Otherwise no subscription will ever be found, and it will not be
      possible to explicitly unsubscribe individual subscriptions.
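
A minimal userspace sketch of why the cancel bit has to be cleared before the memcmp comparison. The struct layout, the `SUB_CANCEL` value, and `find_for_cancel()` are assumptions made for illustration; they are not TIPC's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cancel flag carried in the request's filter field. */
#define SUB_CANCEL 0x04u

/* Hypothetical subscription request: four 32-bit fields, so no padding. */
struct sub_req {
	uint32_t type;
	uint32_t lower;
	uint32_t upper;
	uint32_t filter;
};

/*
 * Return the index of the stored subscription that matches a cancel
 * request, or -1 if none matches. Stored subscriptions never carry
 * SUB_CANCEL, so the bit is cleared on a local copy of the request
 * before the bytewise comparison.
 */
static int find_for_cancel(const struct sub_req *subs, int n, struct sub_req req)
{
	assert(subs != NULL || n == 0);

	req.filter &= ~SUB_CANCEL;
	for (int i = 0; i < n; i++) {
		if (memcmp(&subs[i], &req, sizeof(req)) == 0)
			return i;
	}
	return -1;
}
```

Without the `req.filter &= ~SUB_CANCEL` line, the request would differ from every stored entry in exactly that one bit, so the lookup could never succeed.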
      
      Fixes: 8985ecc7 ("tipc: simplify endianness handling in topology subscriber")
      Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: change to check tipc_own_id to return in tipc_net_stop · 1be6c0c7
      Xin Long committed
      [ Upstream commit 9926cb5f8b0f0aea535735185600d74db7608550 ]
      
      When running a syz script, a panic occurred:
      
      [  156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
      [  156.094315] Call Trace:
      [  156.094844]  <IRQ>
      [  156.095306]  dump_stack+0x7c/0xc0
      [  156.097346]  print_address_description+0x65/0x22e
      [  156.100445]  kasan_report.cold.3+0x37/0x7a
      [  156.102402]  tipc_disc_timeout+0x9c9/0xb20 [tipc]
      [  156.106517]  call_timer_fn+0x19a/0x610
      [  156.112749]  run_timer_softirq+0xb51/0x1090
      
      It was caused by the netns being freed without deleting the discoverer
      timer, while later on the netns would be accessed in the timer handler.
      
      The timer should have been deleted by tipc_net_stop() when cleaning up a
      netns. However, tipc has been able to enable a bearer and start d->timer
      without the local node_addr set since Commit 52dfae5c ("tipc: obtain
      node identity from interface by default"), which caused the timer not to
      be deleted in tipc_net_stop() then.
      
      So fix it in tipc_net_stop() by changing to check local node_id instead
      of local node_addr, as Jon suggested.
      
      While at it, remove the calling of tipc_nametbl_withdraw() there, since
      tipc_nametbl_stop() will take care of the nametbl's freeing afterwards.
      
      Fixes: 52dfae5c ("tipc: obtain node identity from interface by default")
      Reported-by: syzbot+a25307ad099309f1c2b9@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: allow service ranges to be connect()'ed on RDM/DGRAM · 24d1a625
      Erik Hugne committed
      [ Upstream commit ea239314fe42ace880bdd834256834679346c80e ]
      
      We move the check that prevents connecting service ranges to after
      the RDM/DGRAM check, and move address sanity control to a separate
      function that also validates the service range.
      
      Fixes: 23998835 ("tipc: improve address sanity check in tipc_connect()")
      Signed-off-by: Erik Hugne <erik.hugne@gmail.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: do not use ipv6 header for ipv4 flow · 7115df61
      Eric Dumazet committed
      [ Upstream commit 89e4130939a20304f4059ab72179da81f5347528 ]
      
      When a dual stack tcp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or tcp_v6_iif() helper.
      
      Fixes: 1397ed35 ("ipv6: add flowinfo for tcp6 pkt_options for all cases")
      Fixes: df3687ff ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: use memdup_user instead of vmemdup_user · cab576f1
      Xin Long committed
      [ Upstream commit ef82bcfa671b9a635bab5fa669005663d8b177c5 ]
      
      In sctp_setsockopt_bindx()/__sctp_setsockopt_connectx(), it allocates
      memory with addrs_size which is passed from userspace. We used flag
      GFP_USER to put some more restrictions on it in Commit cacc0621
      ("sctp: use GFP_USER for user-controlled kmalloc").
      
      However, since Commit c981f254 ("sctp: use vmemdup_user() rather
      than badly open-coding memdup_user()"), vmemdup_user() has been used,
      which doesn't check the GFP_USER flag when it goes to vmalloc_*(). So when
      addrs_size is a huge value, it could exhaust memory and even trigger
      oom killer.
      
      This patch is to use memdup_user() instead, in which GFP_USER would
      work to limit the memory allocation with a huge addrs_size.
      
      Note we can't fix it by limiting 'addrs_size', as there's no demand
      for it from RFC.
      
      Reported-by: syzbot+ec1b7575afef85a0e5ca@syzkaller.appspotmail.com
      Fixes: c981f254 ("sctp: use vmemdup_user() rather than badly open-coding memdup_user()")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: get sctphdr by offset in sctp_compute_cksum · 97265479
      Xin Long committed
      [ Upstream commit 273160ffc6b993c7c91627f5a84799c66dfe4dee ]
      
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be the right value for sctphdr. This would cause the checksum
      check to fail for sctp packets.
      
      So fix it by using offset, which is always right in all places.
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rhashtable: Still do rehash when we get EEXIST · cf86f7a9
      Herbert Xu committed
      [ Upstream commit 408f13ef358aa5ad56dc6230c2c7deb92cf462b1 ]
      
      As it stands if a shrink is delayed because of an outstanding
      rehash, we will go into a rescheduling loop without ever doing
      the rehash.
      
      This patch fixes this by still carrying out the rehash and then
      rescheduling so that we can shrink after the completion of the
      rehash should it still be necessary.
      
      The return value of EEXIST captures this case and other cases
      (e.g., another thread expanded/rehashed the table at the same
      time) where we should still proceed with the rehash.
      
      Fixes: da20420f ("rhashtable: Add nested tables")
      Reported-by: Josh Elsasser <jelsasser@appneta.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: Josh Elsasser <jelsasser@appneta.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packets: Always register packet sk in the same order · 69cea7cf
      Maxime Chevallier committed
      [ Upstream commit a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ]
      
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
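
The ordering effect described above can be sketched with a plain singly linked list. This is an illustration only: `sock_node`, `sock_list`, `add_head()`, `add_tail()`, and `list_ids()` are hypothetical names, not the kernel's socket list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for a registered socket. */
struct sock_node {
	int id;
	struct sock_node *next;
};

struct sock_list {
	struct sock_node *head;
	struct sock_node *tail;
};

/* Insert at the head: a later walk yields the reverse of insertion order. */
static void add_head(struct sock_list *l, struct sock_node *n)
{
	assert(l != NULL && n != NULL);
	n->next = l->head;
	l->head = n;
	if (l->tail == NULL)
		l->tail = n;
}

/* Insert at the tail: a later walk yields insertion order. */
static void add_tail(struct sock_list *l, struct sock_node *n)
{
	assert(l != NULL && n != NULL);
	n->next = NULL;
	if (l->tail != NULL)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
}

/* Copy the ids into out[] in list order; returns how many were written. */
static int list_ids(const struct sock_list *l, int *out, int max)
{
	int count = 0;

	for (const struct sock_node *n = l->head; n != NULL && count < max; n = n->next)
		out[count++] = n->id;
	return count;
}
```

Re-registering sockets by walking a head-inserted list visits them in reverse, which is the index flip the commit describes; tail insertion keeps the walk order equal to the original registration order.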
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net-sysfs: call dev_hold if kobject_init_and_add success · d9d215be
      YueHaibing committed
      [ Upstream commit a3e23f719f5c4a38ffb3d30c8d7632a4ed8ccd9e ]
      
      In netdev_queue_add_kobject and rx_queue_add_kobject,
      if sysfs_create_group failed, kobject_put will call
      netdev_queue_release to decrease dev refcount, however
      dev_hold has not been called. So we will see this while
      unregistering dev:
      
      unregister_netdevice: waiting for bcsh0 to become free. Usage count = -1
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: d0d66837 ("net: don't decrement kobj reference count on init failure")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: stmmac: fix memory corruption with large MTUs · 8dcf078d
      Aaro Koskinen committed
      [ Upstream commit 223a960c01227e4dbcb6f9fa06b47d73bda21274 ]
      
      When using 16K DMA buffers and ring mode, the DES3 refill is not working
      correctly as the function is using a bogus pointer for checking the
      private data. As a result stale pointers will remain in the RX descriptor
      ring, so DMA will now likely overwrite/corrupt some already freed memory.
      
      As a simple reproducer, just receive some UDP traffic:
      
      	# ifconfig eth0 down; ifconfig eth0 mtu 9000; ifconfig eth0 up
      	# iperf3 -c 192.168.253.40 -u -b 0 -R
      
      If you didn't crash by now, check the RX descriptors to find non-contiguous
      RX buffers:
      
      	cat /sys/kernel/debug/stmmaceth/eth0/descriptors_status
      	[...]
      	1 [0x2be5020]: 0xa3220321 0x9ffc1ffc 0x72d70082 0x130e207e
      					     ^^^^^^^^^^^^^^^^^^^^^
      	2 [0x2be5040]: 0xa3220321 0x9ffc1ffc 0x72998082 0x1311a07e
      					     ^^^^^^^^^^^^^^^^^^^^^
      
      A simple ping test will now report bad data:
      
      	# ping -s 8200 192.168.253.40
      	PING 192.168.253.40 (192.168.253.40) 8200(8228) bytes of data.
      	8208 bytes from 192.168.253.40: icmp_seq=1 ttl=64 time=1.00 ms
      	wrong data byte #8144 should be 0xd0 but was 0x88
      
      Fix the wrong pointer. Also we must refill DES3 only if the DMA buffer
      size is 16K.
      
      Fixes: 54139cf3 ("net: stmmac: adding multiple buffers for rx")
      Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
      Acked-by: Jose Abreu <joabreu@synopsys.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: rose: fix a possible stack overflow · 7eeb12ed
      Eric Dumazet committed
      [ Upstream commit e5dcc0c3223c45c94100f05f28d8ef814db3d82c ]
      
      rose_write_internal() uses a temp buffer of 100 bytes, but a manual
      inspection showed that given arbitrary input, rose_create_facilities()
      can fill up to 110 bytes.
      
      Let's use a tailroom of 256 bytes for peace of mind, and remove
      the bounce buffer: we can simply allocate a big enough skb
      and adjust its length as needed.
      
      syzbot report:
      
      BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline]
      BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
      Write of size 7 at addr ffff88808b1ffbef by task syz-executor.0/24854
      
      CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       check_memory_region_inline mm/kasan/generic.c:185 [inline]
       check_memory_region+0x123/0x190 mm/kasan/generic.c:191
       memcpy+0x38/0x50 mm/kasan/common.c:131
       memcpy include/linux/string.h:352 [inline]
       rose_create_facilities net/rose/rose_subr.c:521 [inline]
       rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
       rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826
       __sys_connect+0x266/0x330 net/socket.c:1685
       __do_sys_connect net/socket.c:1696 [inline]
       __se_sys_connect net/socket.c:1693 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:1693
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458079
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f47b8d9dc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458079
      RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000004
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007f47b8d9e6d4
      R13: 00000000004be4a4 R14: 00000000004ceca8 R15: 00000000ffffffff
      
      The buggy address belongs to the page:
      page:ffffea00022c7fc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x1fffc0000000000()
      raw: 01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000
      raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03
      >ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3
                                                                   ^
       ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
       ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: phy: meson-gxl: fix interrupt support · a6f0168e
      Jerome Brunet committed
      [ Upstream commit daa5c4d0167a308306525fd5ab9a5e18e21f4f74 ]
      
      If an interrupt is already pending when the interrupt is enabled on the
      GXL phy, no IRQ will ever be triggered.
      
      The fix is simply to make sure pending IRQs are cleared before setting
      up the irq mask.
      
      Fixes: cf127ff2 ("net: phy: meson-gxl: add interrupt support")
      Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec · 85ef72d8
      Christoph Paasch committed
      [ Upstream commit 398f0132c14754fcd03c1c4f8e7176d001ce8ea1 ]
      
      Since commit fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
      one can now allocate packet ring buffers >= UINT_MAX. However, syzkaller
      found that that triggers a warning:
      
      [   21.100000] WARNING: CPU: 2 PID: 2075 at mm/page_alloc.c:4584 __alloc_pages_nod0
      [   21.101490] Modules linked in:
      [   21.101921] CPU: 2 PID: 2075 Comm: syz-executor.0 Not tainted 5.0.0 #146
      [   21.102784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   21.103887] RIP: 0010:__alloc_pages_nodemask+0x2a0/0x630
      [   21.104640] Code: fe ff ff 65 48 8b 04 25 c0 de 01 00 48 05 90 0f 00 00 41 bd 01 00 00 00 48 89 44 24 48 e9 9c fe 3
      [   21.107121] RSP: 0018:ffff88805e1cf920 EFLAGS: 00010246
      [   21.107819] RAX: 0000000000000000 RBX: ffffffff85a488a0 RCX: 0000000000000000
      [   21.108753] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000000
      [   21.109699] RBP: 1ffff1100bc39f28 R08: ffffed100bcefb67 R09: ffffed100bcefb67
      [   21.110646] R10: 0000000000000001 R11: ffffed100bcefb66 R12: 000000000000000d
      [   21.111623] R13: 0000000000000000 R14: ffff88805e77d888 R15: 000000000000000d
      [   21.112552] FS:  00007f7c7de05700(0000) GS:ffff88806d100000(0000) knlGS:0000000000000000
      [   21.113612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   21.114405] CR2: 000000000065c000 CR3: 000000005e58e006 CR4: 00000000001606e0
      [   21.115367] Call Trace:
      [   21.115705]  ? __alloc_pages_slowpath+0x21c0/0x21c0
      [   21.116362]  alloc_pages_current+0xac/0x1e0
      [   21.116923]  kmalloc_order+0x18/0x70
      [   21.117393]  kmalloc_order_trace+0x18/0x110
      [   21.117949]  packet_set_ring+0x9d5/0x1770
      [   21.118524]  ? packet_rcv_spkt+0x440/0x440
      [   21.119094]  ? lock_downgrade+0x620/0x620
      [   21.119646]  ? __might_fault+0x177/0x1b0
      [   21.120177]  packet_setsockopt+0x981/0x2940
      [   21.120753]  ? __fget+0x2fb/0x4b0
      [   21.121209]  ? packet_release+0xab0/0xab0
      [   21.121740]  ? sock_has_perm+0x1cd/0x260
      [   21.122297]  ? selinux_secmark_relabel_packet+0xd0/0xd0
      [   21.123013]  ? __fget+0x324/0x4b0
      [   21.123451]  ? selinux_netlbl_socket_setsockopt+0x101/0x320
      [   21.124186]  ? selinux_netlbl_sock_rcv_skb+0x3a0/0x3a0
      [   21.124908]  ? __lock_acquire+0x529/0x3200
      [   21.125453]  ? selinux_socket_setsockopt+0x5d/0x70
      [   21.126075]  ? __sys_setsockopt+0x131/0x210
      [   21.126533]  ? packet_release+0xab0/0xab0
      [   21.127004]  __sys_setsockopt+0x131/0x210
      [   21.127449]  ? kernel_accept+0x2f0/0x2f0
      [   21.127911]  ? ret_from_fork+0x8/0x50
      [   21.128313]  ? do_raw_spin_lock+0x11b/0x280
      [   21.128800]  __x64_sys_setsockopt+0xba/0x150
      [   21.129271]  ? lockdep_hardirqs_on+0x37f/0x560
      [   21.129769]  do_syscall_64+0x9f/0x450
      [   21.130182]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      We should allocate with __GFP_NOWARN to handle this.
      
      Cc: Kal Conley <kal.conley@dectris.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Fixes: fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
      Signed-off-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: datagram: fix unbounded loop in __skb_try_recv_datagram() · 88c64f9c
      Paolo Abeni committed
      [ Upstream commit 0b91bce1ebfc797ff3de60c8f4a1e6219a8a3187 ]
      
      Christoph reported a stall while peeking datagram with an offset when
      busy polling is enabled. __skb_try_recv_datagram() uses as the loop
      termination condition 'queue empty'. When peeking, the socket
      queue can be not empty, even when no additional packets are received.
      
      Address the issue by explicitly checking for receive queue changes,
      as currently done by __skb_wait_for_more_packets().
      
      Fixes: 2b5cd0df ("net: Change return type of sk_busy_loop from bool to void")
      Reported-and-tested-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: aquantia: fix rx checksum offload for UDP/TCP over IPv6 · e4ff39e1
      Dmitry Bogdanov committed
      [ Upstream commit a7faaa0c5dc7d091cc9f72b870d7edcdd6f43f12 ]
      
      TCP/UDP checksum validity was propagated to skb
      only if IP checksum is valid.
      But for IPv6 there is no validity as there is no checksum in IPv6.
      This patch propagates TCP/UDP checksum validity regardless of IP checksum.
      
      Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
      Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: Nikita Danilov <nikita.danilov@aquantia.com>
      Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S · c4084262
      Bjorn Helgaas committed
      [ Upstream commit fae846e2b7124d4b076ef17791c73addf3b26350 ]
      
      The device ID alone does not uniquely identify a device.  Test both the
      vendor and device ID to make sure we don't mistakenly think some other
      vendor's 0xB410 device is a Digium HFC4S.  Also, instead of the bare hex
      ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
      ID table.
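
A small sketch of the combined check, assuming a hypothetical `struct pci_id` and helper `is_digium_hfc4s()`. The device ID 0xB410 comes from the text above; the vendor ID value used here is an assumption for illustration and should be checked against the kernel's PCI ID header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed vendor value for illustration; the device value is from the commit text. */
#define PCI_VENDOR_ID_DIGIUM        0xd161
#define PCI_DEVICE_ID_DIGIUM_HFC4S  0xb410

/* Hypothetical minimal view of a PCI device's identity. */
struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* A device ID is only unique within its vendor, so both fields must match. */
static bool is_digium_hfc4s(const struct pci_id *id)
{
	assert(id != NULL);
	return id->vendor == PCI_VENDOR_ID_DIGIUM &&
	       id->device == PCI_DEVICE_ID_DIGIUM_HFC4S;
}
```

Another vendor's device that happens to use 0xB410 fails the vendor comparison and is no longer misidentified.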
      
      No functional change intended.
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac8390: Fix mmio access size probe · e0f8c06f
      Finn Thain committed
      [ Upstream commit bb9e5c5bcd76f4474eac3baf643d7a39f7bac7bb ]
      
      The bug that Stan reported is as follows. After a restart, a 16-bit NIC
      may be incorrectly identified as a 32-bit NIC and stop working.
      
      mac8390 slot.E: Memory length resource not found, probing
      mac8390 slot.E: Farallon EtherMac II-C (type farallon)
      mac8390 slot.E: MAC 00:00:c5:30:c2:99, IRQ 61, 32 KB shared memory at 0xfeed0000, 32-bit access.
      
      The bug never arises after a cold start and only intermittently after a
      warm start. (I didn't investigate why the bug is intermittent.)
      
      It turns out that memcpy_toio() is deprecated and memcmp_withio() also
      has issues. Replacing these calls with mmio accessors fixes the problem.
      Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
      Fixes: 2964db0f ("m68k: Mac DP8390 update")
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL · be092113
      Xin Long committed
      [ Upstream commit 1c87e79a002f6a159396138cd3f3ab554a2a8887 ]
      
      Jianlin reported a crash:
      
        [  381.484332] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
        [  381.619802] RIP: 0010:fib6_rule_lookup+0xa3/0x160
        [  382.009615] Call Trace:
        [  382.020762]  <IRQ>
        [  382.030174]  ip6_route_redirect.isra.52+0xc9/0xf0
        [  382.050984]  ip6_redirect+0xb6/0xf0
        [  382.066731]  icmpv6_notify+0xca/0x190
        [  382.083185]  ndisc_redirect_rcv+0x10f/0x160
        [  382.102569]  ndisc_rcv+0xfb/0x100
        [  382.117725]  icmpv6_rcv+0x3f2/0x520
        [  382.133637]  ip6_input_finish+0xbf/0x460
        [  382.151634]  ip6_input+0x3b/0xb0
        [  382.166097]  ipv6_rcv+0x378/0x4e0
      
      It was caused by the lookup function __ip6_route_redirect() returning NULL in
      fib6_rule_lookup() when ip6_create_rt_rcu() returns NULL.
      
      So we fix it by simply making ip6_create_rt_rcu() return ip6_null_entry
      instead of NULL.
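
The idea of returning a shared sentinel entry instead of NULL can be sketched in userspace as follows. All names here (`struct route`, `null_entry`, `try_alloc_route()`, `create_route()`, `route_error()`) are hypothetical stand-ins, and the error value stored in the sentinel is an assumption for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical route object; only the field needed for the sketch. */
struct route {
	int error;
};

/* Shared "no route" sentinel that callers can always dereference. */
static struct route null_entry = { .error = -ENETUNREACH };

/* Hypothetical allocator that may fail; storage is supplied by the caller. */
static struct route *try_alloc_route(int should_fail, struct route *storage)
{
	if (should_fail)
		return NULL;
	storage->error = 0;
	return storage;
}

/* Never returns NULL: on allocation failure, fall back to the sentinel. */
static struct route *create_route(int should_fail, struct route *storage)
{
	struct route *rt;

	assert(storage != NULL);
	rt = try_alloc_route(should_fail, storage);
	if (rt == NULL)
		rt = &null_entry;
	return rt;
}

/* A caller that dereferences the result without a NULL check. */
static int route_error(int should_fail, struct route *storage)
{
	return create_route(should_fail, storage)->error;
}
```

Callers written like `route_error()` stay safe on the failure path because they always receive a valid object whose error field reports the failure.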
      
      v1->v2:
        - move down 'fallback:' to make it more readable.
      
      Fixes: e873e4b9 ("ipv6: use fib6_info_hold_safe() when necessary")
      Reported-by: Jianlin Shi <jishi@redhat.com>
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Acked-by: Wei Wang <weiwan@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gtp: change NET_UDP_TUNNEL dependency to select · 53adaacb
      Matteo Croce committed
      [ Upstream commit c22da36688d6298f2e546dcc43fdc1ad35036467 ]
      
      Similarly to commit a7603ac1fc8c ("geneve: change NET_UDP_TUNNEL
      dependency to select"), GTP has a dependency on NET_UDP_TUNNEL which
      makes it impossible to compile if no other protocol depending on
      NET_UDP_TUNNEL is selected.
      
      Fix this by changing the depends to a select, and drop NET_IP_TUNNEL from
      the select list, as it already depends on NET_UDP_TUNNEL.
      Signed-off-by: Matteo Croce <mcroce@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • genetlink: Fix a memory leak on error path · 9b8ef421
      YueHaibing committed
      [ Upstream commit ceabee6c59943bdd5e1da1a6a20dc7ee5f8113a2 ]
      
      In genl_register_family(), when idr_alloc() fails,
      we forget to free the memory we possibly allocate for
      family->attrbuf.
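
A userspace sketch of the error-path pattern: once a buffer has been allocated, every later failure has to unwind it. `struct family`, `register_family()`, and the `id_result` parameter (standing in for the outcome of an ID allocator) are hypothetical and do not reproduce the genetlink code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical family object with one owned buffer. */
struct family {
	void *attrbuf;
	int id;
};

/*
 * Hypothetical registration routine. id_result simulates the ID allocator:
 * a non-negative ID on success, a negative errno value on failure.
 */
static int register_family(struct family *f, size_t attrbuf_size, int id_result)
{
	int err;

	assert(f != NULL);

	f->attrbuf = malloc(attrbuf_size);
	if (f->attrbuf == NULL)
		return -ENOMEM;

	if (id_result < 0) {
		err = id_result;
		goto err_free_attrbuf;
	}

	f->id = id_result;
	return 0;

err_free_attrbuf:
	/* Undo the earlier allocation so the error path does not leak it. */
	free(f->attrbuf);
	f->attrbuf = NULL;
	return err;
}
```

The single unwind label keeps the cleanup in one place, so adding further failure points after the allocation only requires a `goto` to the same label.
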
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 2ae0f17d ("genetlink: use idr to track families")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dccp: do not use ipv6 header for ipv4 flow · 321461f2
      Eric Dumazet committed
      [ Upstream commit e0aa67709f89d08c8d8e5bdd9e0b649df61d0090 ]
      
      When a dual stack dccp listener accepts an ipv4 flow,
      it should not attempt to use an ipv6 header or
      inet6_iif() helper.
      
      Fixes: 3df80d93 ("[DCCP]: Introduce DCCPv6")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipmi_si: Fix crash when using hard-coded device · 6bba17f6
      Corey Minyard committed
      Backport from 41b766d661bf94a364960862cfc248a78313dbd3
      
      When executing a command like:
        modprobe ipmi_si ports=0xffc0e3 type=bt
      The system would get an oops.
      
      The trouble here is that ipmi_si_hardcode_find_bmc() is called before
      ipmi_si_platform_init(), but initialization of the hard-coded device
      creates an IPMI platform device, which won't be initialized yet.
      
      The real trouble is that hard-coded devices aren't created with
      any device, and the fixup is done later.  So do it right, create the
      hard-coded devices as normal platform devices.
      
      This required adding some new resource types to the IPMI platform
      code for passing information required by the hard-coded device
      and adding some code to remove the hard-coded platform devices
      on module removal.
      
      To enforce the "hard-coded devices passed by the user take priority
      over firmware devices" rule, some special code was added to check
      and see if a hard-coded device already exists.
      
      The backport required some minor fixups and adding the device
      id table that had been added in another change and was used
      in this one.
      Reported-by: Yang Yingliang <yangyingliang@huawei.com>
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Tested-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer · 15d6538a
      Marcel Holtmann committed
      commit 7c9cbd0b5e38a1672fcd137894ace3b042dfbf69 upstream.
      
      The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len
      as length value. The opt->len however is under the control of the remote
      user and can be used by an attacker to gain access beyond the bounds of
      the actual packet.
      
      To prevent any potential leak of heap memory, it is enough to check that
      the resulting len calculation after calling l2cap_get_conf_opt is not
      below zero. A well formed packet will always return >= 0 here and will
      end with the length value being zero after the last option has been
      parsed. In case of malformed packets messing with the opt->len field the
      length value will become negative. If that is the case, then just abort
      and ignore the option.
      
      In case an attacker uses a too short opt->len value, then garbage will
      be parsed, but that is protected by the unknown option handling and also
      the option parameter size checks.
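
      The length check described above can be sketched in user-space C as
      follows. This is a simplified model, not the kernel code: get_opt() is
      a hypothetical stand-in for l2cap_get_conf_opt(), OPT_HDR_SIZE mirrors
      L2CAP_CONF_OPT_SIZE, and count_valid_opts() is an invented helper that
      only counts options instead of applying them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One type octet plus one length octet, mirroring L2CAP_CONF_OPT_SIZE. */
#define OPT_HDR_SIZE 2

/*
 * Hypothetical stand-in for l2cap_get_conf_opt(): reads the option header
 * at data[*off], advances *off past the claimed payload, and returns the
 * header size plus the claimed payload size.
 */
static int get_opt(const uint8_t *data, size_t *off, uint8_t *type,
		   uint8_t *olen)
{
	*type = data[*off];
	*olen = data[*off + 1];
	*off += OPT_HDR_SIZE + *olen;
	return OPT_HDR_SIZE + *olen;
}

/* Returns how many options were accepted from a buffer of len octets. */
static int count_valid_opts(const uint8_t *data, int len)
{
	size_t off = 0;
	int accepted = 0;

	while (len >= OPT_HDR_SIZE) {
		uint8_t type, olen;

		len -= get_opt(data, &off, &type, &olen);
		if (len < 0)
			break;	/* claimed length runs past the packet */
		accepted++;
	}
	return accepted;
}
```

      In this model a well-formed buffer ends with len == 0, while an option
      whose length octet claims more data than remains drives len negative
      and is dropped before its payload is ever read.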
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      15d6538a
    • M
      Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt · 2318c0e4
      Marcel Holtmann committed
      commit af3d5d1c87664a4f150fcf3534c6567cb19909b0 upstream.
      
      When doing option parsing for standard type values of 1, 2 or 4 octets,
      the value is converted directly into a variable instead of a pointer. To
      avoid being tricked into being a pointer, check that for these option
      types the sizes actually match. In L2CAP every option is fixed size and
      thus it is prudent anyway to ensure that the remote side sends us the
      right option size along with option parameters.
      
      If the option size is not matching the option type, then that option is
      silently ignored. It is a protocol violation and instead of trying to
      give the remote attacker any further hints just pretend that option is
      not present and proceed with the default values. Implementations
      following the specification and its qualification procedures will always
      use the correct size and thus will not be impacted here.
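
      A minimal sketch of the per-option size check, assuming a simplified
      configuration structure. The names chan_cfg, apply_opt(), OPT_MTU and
      OPT_FLUSH_TO are illustrative and not taken from the kernel sources;
      only the pattern (compare the received size with the fixed size of the
      option type, otherwise keep the default) follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative option types; the values are assumptions of this sketch. */
enum { OPT_MTU = 0x01, OPT_FLUSH_TO = 0x02 };

struct chan_cfg {
	uint16_t mtu;
	uint16_t flush_to;
};

/* Applies one parsed option; options with an unexpected size are ignored. */
static void apply_opt(struct chan_cfg *cfg, uint8_t type, uint8_t olen,
		      unsigned long val)
{
	switch (type) {
	case OPT_MTU:
		if (olen != 2)
			break;	/* wrong size: keep the default */
		cfg->mtu = (uint16_t)val;
		break;
	case OPT_FLUSH_TO:
		if (olen != 2)
			break;	/* wrong size: keep the default */
		cfg->flush_to = (uint16_t)val;
		break;
	default:
		break;	/* unknown options are handled elsewhere */
	}
}
```

      An option that arrives with the wrong size leaves the configuration
      untouched, so the sender gets no signal about why it was not applied.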
      
      To keep the code readable and consistent across all options, a few
      cosmetic changes were also required.
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2318c0e4
  2. 27 Mar, 2019 14 commits
    • G
      Linux 4.19.32 · 3a2156c8
      Greg Kroah-Hartman committed
      3a2156c8
    • B
      power: supply: charger-manager: Fix incorrect return value · 33bd347f
      Baolin Wang committed
      commit f25a646fbe2051527ad9721853e892d13a99199e upstream.
      
      Fix incorrect return value.
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      33bd347f
    • H
      ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec · 19184190
      Hui Wang committed
      commit b5a236c175b0d984552a5f7c9d35141024c2b261 upstream.
      
      Recently we found that audio jack detection stops working after suspend
      on many machines with a Realtek codec. Sometimes the audio selection
      dialogue didn't show up after users plugged a headphone/headset into
      the headset jack; sometimes, after users plugged in a headphone/headset
      and then clicked the sound icon in the upper-right corner of
      gnome-desktop, it also showed the speaker rather than the headphone.
      
      The root cause is that before suspend, the codec has already called
      runtime_suspend since this codec is not used by any apps, so on
      resume, runtime_resume is not called for this codec. But for some
      Realtek codecs (so far, alc236, alc255 and alc891) with a specific
      BIOS, if runtime_resume doesn't run after suspend, all codec
      functions including jack detection stop working.
      
      This problem has existed for a long time but was not exposed, because
      when the problem happens, if users play sound or open sound-setting
      to check the audio device, this triggers a call to
      runtime_resume (via snd_hda_power_up), and the codec starts working
      again before users notice the problem.
      
      Since we don't know how many codec and BIOS combinations have this
      problem, to fix it, let the driver call runtime_resume for all codecs
      in pm_resume, maybe for some codecs, this is not needed, but it is
      harmless. After a codec is runtime resumed, if it is not used by any
      apps, it will be runtime suspended soon and furthermore we don't run
      suspend frequently, this change will not add much power consumption.
      
      Fixes: cc72da7d ("ALSA: hda - Use standard runtime PM for codec power-save control")
      Signed-off-by: Hui Wang <hui.wang@canonical.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      19184190
    • T
      ALSA: hda - Record the current power state before suspend/resume calls · 156ba57f
      Takashi Iwai committed
      commit 98081ca62cbac31fb0f7efaf90b2e7384ce22257 upstream.
      
      Currently we deal with single codec and suspend codec callbacks for
      all S3, S4 and runtime PM handling.  But it turned out that we want
      to distinguish the call patterns sometimes, e.g. for applying some init
      sequence only at probing and restoring from hibernate.
      
      This patch slightly modifies the common PM callbacks for HD-audio
      codec and stores the currently processed PM event in power_state of
      the codec's device.power field, which is currently unused.  The codec
      callback can take a look at this event value and judge for which purpose
      it's being called.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      156ba57f
    • W
      locking/lockdep: Add debug_locks check in __lock_downgrade() · 0e0f7b30
      Waiman Long committed
      commit 71492580571467fb7177aade19c18ce7486267f5 upstream.
      
      Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
      warning right after a previous lockdep warning. It is likely that the
      previous warning turned off lock debugging, leaving lockdep in an
      inconsistent state that led to the lock downgrade warning.
      
      Fix that by adding a check for debug_locks at the beginning of
      __lock_downgrade().
      Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e0f7b30
    • J
      x86/unwind: Add hardcoded ORC entry for NULL · 206a76a6
      Jann Horn committed
      commit ac5ceccce5501e43d217c596e4ee859f2a3fef79 upstream.
      
      When the ORC unwinder is invoked for an oops caused by IP==0,
      it currently has no idea what to do because there is no debug information
      for the stack frame of NULL.
      
      But if RIP is NULL, it is very likely that the last successfully executed
      instruction was an indirect CALL/JMP, and it is possible to unwind out in
      the same way as for the first instruction of a normal function. Hardcode
      a corresponding ORC entry.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       ? do_syscall_64+0x6d/0x160
       ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace looks like this:
      
      Call Trace:
       __x64_sys_prctl+0x402/0x680
       ? __ia32_sys_prctl+0x6e0/0x6e0
       ? __do_page_fault+0x457/0x620
       do_syscall_64+0x6d/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      prctl_set_seccomp() still doesn't show up in the trace because for some
      reason, tail call optimization is only disabled in builds that use the
      frame pointer unwinder.
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-2-jannh@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      206a76a6
    • J
      x86/unwind: Handle NULL pointer calls better in frame unwinder · 367ccafb
      Jann Horn committed
      commit f4f34e1b82eb4219d8eaa1c7e2e17ca219a6a2b5 upstream.
      
      When the frame unwinder is invoked for an oops caused by a call to NULL, it
      currently skips the parent function because BP still points to the parent's
      stack frame; the (nonexistent) current function only has the first half of
      a stack frame, and BP doesn't point to it yet.
      
      Add a special case for IP==0 that calculates a fake BP from SP, then uses
      the real BP for the next frame.
      
      Note that this handles first_frame specially: Return information about the
      parent function as long as the saved IP is >=first_frame, even if the fake
      BP points below it.
      
      With an artificially-added NULL call in prctl_set_seccomp(), before this
      patch, the trace is:
      
      Call Trace:
       ? prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      After this patch, the trace is:
      
      Call Trace:
       prctl_set_seccomp+0x3a/0x50
       __x64_sys_prctl+0x457/0x6f0
       ? __ia32_sys_prctl+0x750/0x750
       do_syscall_64+0x72/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: syzbot <syzbot+ca95b2b7aef9e7cbd6ab@syzkaller.appspotmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Marek <michal.lkml@markovi.net>
      Cc: linux-kbuild@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190301031201.7416-1-jannh@google.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      367ccafb
    • D
      loop: access lo_backing_file only when the loop device is Lo_bound · 3254dd30
      Dongli Zhang committed
      commit f7c8a4120eedf24c36090b7542b179ff7a649219 upstream.
      
      Commit 758a58d0bc67 ("loop: set GENHD_FL_NO_PART_SCAN after
      blkdev_reread_part()") separates "lo->lo_backing_file = NULL" and
      "lo->lo_state = Lo_unbound" into different critical regions protected by
      loop_ctl_mutex.
      
      However, the following race allows the NULL lo->lo_backing_file to be
      accessed when the backend of a loop is another loop device, e.g., loop0's
      backend is a file, while loop1's backend is loop0.
      
      loop0's backend is file            loop1's backend is loop0
      
      __loop_clr_fd()
        mutex_lock(&loop_ctl_mutex);
        lo->lo_backing_file = NULL; --> set to NULL
        mutex_unlock(&loop_ctl_mutex);
                                         loop_set_fd()
                                           mutex_lock_killable(&loop_ctl_mutex);
                                           loop_validate_file()
                                             f = l->lo_backing_file; --> NULL
                                               access if loop0 is not Lo_unbound
        mutex_lock(&loop_ctl_mutex);
        lo->lo_state = Lo_unbound;
        mutex_unlock(&loop_ctl_mutex);
      
      lo->lo_backing_file should be accessed only when the loop device is
      Lo_bound.
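
      The rule stated above can be sketched with a small user-space model.
      The types loop_model and file_model and the function validate_file()
      are hypothetical simplifications; the real loop_validate_file() also
      checks for recursion, which is omitted here. The sketch only shows that
      the state is tested before the backing file pointer is followed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum lo_state { LO_UNBOUND, LO_BOUND, LO_RUNDOWN };

struct file_model;

struct loop_model {
	enum lo_state state;
	struct file_model *backing_file;	/* NULL once cleared */
};

struct file_model {
	struct loop_model *loop;	/* non-NULL if the file is a loop device */
};

/* Walks the chain of loop devices behind file, refusing any not bound. */
static int validate_file(struct file_model *file)
{
	struct file_model *f = file;

	while (f->loop) {
		struct loop_model *l = f->loop;

		if (l->state != LO_BOUND)
			return -EINVAL;	/* backing_file may already be NULL */
		f = l->backing_file;
	}
	return 0;
}
```

      With the state check in place, a device caught between the two
      critical regions (backing file cleared, state not yet Lo_unbound) is
      rejected instead of having its NULL pointer dereferenced.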
      
      In fact, the problem was already introduced in commit 7ccd0791d985
      ("loop: Push loop_ctl_mutex down into loop_clr_fd()"), after which
      loop_validate_file() could see devices in the Lo_rundown state, which it
      did not account for. It was harmless at that point but still.
      
      Fixes: 7ccd0791d985 ("loop: Push loop_ctl_mutex down into loop_clr_fd()")
      Reported-by: syzbot+9bdc1adc1c55e7fe765b@syzkaller.appspotmail.com
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3254dd30
    • F
      netfilter: ebtables: remove BUGPRINT messages · 35cdcdc5
      Florian Westphal committed
      commit d824548dae220820bdf69b2d1561b7c4b072783f upstream.
      
      They are however frequently triggered by syzkaller, so remove them.
      
      ebtables userspace should never trigger any of these, so there is little
      value in making them pr_debug (or ratelimited).
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      35cdcdc5
    • C
      f2fs: fix to avoid deadlock of atomic file operations · 1fd916e8
      Chao Yu committed
      commit 48432984d718c95cf13e26d487c2d1b697c3c01f upstream.
      
      Thread A				Thread B
      - __fput
       - f2fs_release_file
        - drop_inmem_pages
         - mutex_lock(&fi->inmem_lock)
         - __revoke_inmem_pages
          - lock_page(page)
      					- open
      					- f2fs_setattr
      					- truncate_setsize
      					 - truncate_inode_pages_range
      					  - lock_page(page)
      					  - truncate_cleanup_page
      					   - f2fs_invalidate_page
      					    - drop_inmem_page
      					    - mutex_lock(&fi->inmem_lock);
      
      We may encounter the above ABBA deadlock, as reported by Kyungtae Kim:
      
      I'm reporting a bug in linux-4.17.19: "INFO: task hung in
      drop_inmem_page" (no reproducer)
      
      I think this might be somehow related to the following:
      https://groups.google.com/forum/#!searchin/syzkaller-bugs/INFO$3A$20task$20hung$20in$20%7Csort:date/syzkaller-bugs/c6soBTrdaIo/AjAzPeIzCgAJ
      
      =========================================
      INFO: task syz-executor7:10822 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D27024 10822   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       schedule_preempt_disabled+0x18/0x30 kernel/sched/core.c:3617
       __mutex_lock_common kernel/locking/mutex.c:833 [inline]
       __mutex_lock+0x5bd/0x1410 kernel/locking/mutex.c:893
       mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:908
       drop_inmem_page+0xcb/0x810 fs/f2fs/segment.c:327
       f2fs_invalidate_page+0x337/0x5e0 fs/f2fs/data.c:2401
       do_invalidatepage mm/truncate.c:165 [inline]
       truncate_cleanup_page+0x261/0x330 mm/truncate.c:187
       truncate_inode_pages_range+0x552/0x1610 mm/truncate.c:367
       truncate_inode_pages mm/truncate.c:478 [inline]
       truncate_pagecache+0x6d/0x90 mm/truncate.c:801
       truncate_setsize+0x81/0xa0 mm/truncate.c:826
       f2fs_setattr+0x44f/0x1270 fs/f2fs/file.c:781
       notify_change+0xa62/0xe80 fs/attr.c:313
       do_truncate+0x12e/0x1e0 fs/open.c:63
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e459c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e45a6cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e45a700
      INFO: task syz-executor7:10858 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D28880 10858   6346 0x00000004
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       __rwsem_down_write_failed_common kernel/locking/rwsem-xadd.c:565 [inline]
       rwsem_down_write_failed+0x5e6/0xc90 kernel/locking/rwsem-xadd.c:594
       call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
       __down_write arch/x86/include/asm/rwsem.h:142 [inline]
       down_write+0x58/0xa0 kernel/locking/rwsem.c:72
       inode_lock include/linux/fs.h:713 [inline]
       do_truncate+0x120/0x1e0 fs/open.c:61
       do_last fs/namei.c:2955 [inline]
       path_openat+0x2042/0x29f0 fs/namei.c:3505
       do_filp_open+0x1bd/0x2c0 fs/namei.c:3540
       do_sys_open+0x35e/0x4e0 fs/open.c:1101
       __do_sys_open fs/open.c:1119 [inline]
       __se_sys_open fs/open.c:1114 [inline]
       __x64_sys_open+0x89/0xc0 fs/open.c:1114
       do_syscall_64+0xc4/0x4e0 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f734e3b4c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
      RAX: ffffffffffffffda RBX: 00007f734e3b56cc RCX: 00000000004497b9
      RDX: 0000000000000104 RSI: 00000000000a8280 RDI: 0000000020000080
      RBP: 000000000071c238 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000007230 R14: 00000000006f02d0 R15: 00007f734e3b5700
      INFO: task syz-executor5:10829 blocked for more than 120 seconds.
            Not tainted 4.17.19 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor5   D28760 10829   6308 0x80000002
      Call Trace:
       context_switch kernel/sched/core.c:2867 [inline]
       __schedule+0x721/0x1e60 kernel/sched/core.c:3515
       schedule+0x88/0x1c0 kernel/sched/core.c:3559
       io_schedule+0x21/0x80 kernel/sched/core.c:5179
       wait_on_page_bit_common mm/filemap.c:1100 [inline]
       __lock_page+0x2b5/0x390 mm/filemap.c:1273
       lock_page include/linux/pagemap.h:483 [inline]
       __revoke_inmem_pages+0xb35/0x11c0 fs/f2fs/segment.c:231
       drop_inmem_pages+0xa3/0x3e0 fs/f2fs/segment.c:306
       f2fs_release_file+0x2c7/0x330 fs/f2fs/file.c:1556
       __fput+0x2c7/0x780 fs/file_table.c:209
       ____fput+0x1a/0x20 fs/file_table.c:243
       task_work_run+0x151/0x1d0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x8ba/0x30a0 kernel/exit.c:865
       do_group_exit+0x13b/0x3a0 kernel/exit.c:968
       get_signal+0x6bb/0x1650 kernel/signal.c:2482
       do_signal+0x84/0x1b70 arch/x86/kernel/signal.c:810
       exit_to_usermode_loop+0x155/0x190 arch/x86/entry/common.c:162
       prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
       do_syscall_64+0x445/0x4e0 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x4497b9
      RSP: 002b:00007f1c68e74ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
      RAX: fffffffffffffe00 RBX: 000000000071bf80 RCX: 00000000004497b9
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000071bf80
      RBP: 000000000071bf80 R08: 0000000000000000 R09: 000000000071bf58
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 0000000000000000 R14: 00007f1c68e759c0 R15: 00007f1c68e75700
      
      This patch uses trylock_page to mitigate this deadlock condition.
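
      The trylock idea can be sketched in user-space C. This is a model
      under stated assumptions, not the f2fs code: page_model, page_trylock()
      and revoke_pass() are hypothetical names, and a C11 atomic_flag stands
      in for the page lock. The pass is assumed to run with the list mutex
      held, so it never blocks on a page lock while holding that mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
	atomic_flag locked;	/* stands in for the page lock */
	bool in_list;		/* still queued as an in-memory page */
};

static bool page_trylock(struct page_model *p)
{
	return !atomic_flag_test_and_set(&p->locked);
}

static void page_unlock(struct page_model *p)
{
	atomic_flag_clear(&p->locked);
}

/*
 * One pass over the pages, assumed to run with the list mutex held.
 * Pages whose lock is busy are skipped instead of waited for; the
 * return value is the number of pages left for a later pass.
 */
static size_t revoke_pass(struct page_model *pages, size_t n)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].in_list)
			continue;
		if (!page_trylock(&pages[i])) {
			left++;		/* owner may be waiting for our mutex */
			continue;
		}
		pages[i].in_list = false;
		page_unlock(&pages[i]);
	}
	return left;
}
```

      A caller would drop the mutex and retry while revoke_pass() reports
      pages left over, which gives the other thread a chance to take the
      mutex and release its page lock.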
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fd916e8
    • M
      RDMA/cma: Rollback source IP address if failing to acquire device · 9dd5053c
      Myungho Jung committed
      commit 5fc01fb846bce8fa6d5f95e2625b8ce0f8e86810 upstream.
      
      If cma_acquire_dev_by_src_ip() returns an error in addr_handler(), the
      device state changes back to RDMA_CM_ADDR_BOUND but the resolved source
      IP address is still left. After that, if rdma_destroy_id() is called
      after rdma_listen(), the device is freed without being removed from
      listen_any_list in cma_cancel_operation(). Revert to the previous IP
      address if acquiring the device fails.
      
      Reported-by: syzbot+f3ce716af730c8f96637@syzkaller.appspotmail.com
      Signed-off-by: Myungho Jung <mhjungk@gmail.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9dd5053c
    • C
      drm: Reorder set_property_atomic to avoid returning with an active ww_ctx · 015b828b
      Chris Wilson committed
      commit 227ad6d957898a88b1746e30234ece64d305f066 upstream.
      
      Delay the drm_modeset_acquire_init() until after we check for an
      allocation failure so that we can return immediately upon error without
      having to unwind.
      
      WARNING: lock held when returning to user space!
      4.20.0+ #174 Not tainted
      ------------------------------------------------
      syz-executor556/8153 is leaving the kernel with locks still held!
      1 lock held by syz-executor556/8153:
        #0: 000000005100c85c (crtc_ww_class_acquire){+.+.}, at:
      set_property_atomic+0xb3/0x330 drivers/gpu/drm/drm_mode_object.c:462
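
      The reordering can be illustrated with a user-space sketch. All names
      here (active_ctx, ctx_init(), ctx_fini(), set_property(), fail_alloc)
      are hypothetical; the counter merely models an acquire context that
      must not be left active when the function returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

static int active_ctx;	/* counts acquire contexts not yet finished */

static void ctx_init(void) { active_ctx++; }
static void ctx_fini(void) { active_ctx--; }

/* fail_alloc simulates the state allocation returning NULL. */
static int set_property(bool fail_alloc)
{
	void *state = fail_alloc ? NULL : malloc(16);

	if (!state)
		return -ENOMEM;	/* no context is active yet, nothing to unwind */

	ctx_init();
	/* ... apply the property under the context ... */
	ctx_fini();
	free(state);
	return 0;
}
```

      Because the context is initialised only after the allocation has
      succeeded, the early error return cannot leave it active.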
      
      Reported-by: syzbot+6ea337c427f5083ebdf2@syzkaller.appspotmail.com
      Fixes: 144a7999 ("drm: Handle properties in the core for atomic drivers")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: David Airlie <airlied@linux.ie>
      Cc: <stable@vger.kernel.org> # v4.14+
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20181230122842.21917-1-chris@chris-wilson.co.uk
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      015b828b
    • K
      Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto() · e365b940
      Kefeng Wang committed
      commit 56897b217a1d0a91c9920cb418d6b3fe922f590a upstream.
      
      task A:                                task B:
      hci_uart_set_proto                     flush_to_ldisc
       - p->open(hu) -> h5_open  //alloc h5  - receive_buf
       - set_bit HCI_UART_PROTO_READY         - tty_port_default_receive_buf
       - hci_uart_register_dev                 - tty_ldisc_receive_buf
                                                - hci_uart_tty_receive
      				           - test_bit HCI_UART_PROTO_READY
      				            - h5_recv
       - clear_bit HCI_UART_PROTO_READY             while() {
       - p->close(hu) -> h5_close //free h5
      				              - h5_rx_3wire_hdr
      				               - h5_reset()  //use-after-free
                                                    }
      
      The HCI UART protocol can be set via ioctl, but there is
      a use-after-free issue when hci_uart_register_dev() fails in
      hci_uart_set_proto() (see the stack above). Fix this by setting
      the HCI_UART_PROTO_READY bit only when hci_uart_register_dev()
      returns success.
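
      A single-threaded model of the ordering, assuming simplified types.
      uart_model, set_proto() and receive() are hypothetical names and
      register_fails simulates a failing hci_uart_register_dev(); the sketch
      shows the ordering only and does not reproduce the race itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

struct uart_model {
	bool proto_ready;	/* models the HCI_UART_PROTO_READY bit */
	void *proto_priv;	/* models the per-protocol data, e.g. h5 */
};

static int set_proto(struct uart_model *hu, bool register_fails)
{
	hu->proto_priv = malloc(8);	/* models p->open() */
	if (!hu->proto_priv)
		return -ENOMEM;

	if (register_fails) {		/* registration failed */
		free(hu->proto_priv);	/* models p->close() */
		hu->proto_priv = NULL;
		return -ENODEV;
	}

	hu->proto_ready = true;	/* receive path may use proto_priv from here */
	return 0;
}

/* Models the receive path: data is processed only when the bit is set. */
static bool receive(const struct uart_model *hu)
{
	if (!hu->proto_ready)
		return false;	/* data is dropped, proto_priv is not touched */
	return hu->proto_priv != NULL;
}
```

      Since the ready flag is never set on the failure path, the receive
      side has no window in which it could use the freed protocol data.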
      
      Reported-by: syzbot+899a33dc0fa0dbaf06a6@syzkaller.appspotmail.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: Jeremy Cline <jcline@redhat.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e365b940
    • J
      Bluetooth: hci_ldisc: Initialize hci_dev before open() · f67202f7
      Jeremy Cline committed
      commit 32a7b4cbe93b0a0ef7e63d31ca69ce54736c4412 upstream.
      
      The hci_dev struct hdev is referenced in work queues and timers started
      by open() in some protocols. This creates a race between the
      initialization function and the work or timer which can result in hdev
      being dereferenced while it is still null.
      
      The syzbot report contains a reliable reproducer which causes a null
      pointer dereference of hdev in hci_uart_write_work() by making the
      memory allocation for hdev fail.
      
      To fix this, ensure hdev is valid from before calling a protocol's
      open() until after calling a protocol's close().
      
      Reported-by: syzbot+257790c15bcdef6fe00c@syzkaller.appspotmail.com
      Signed-off-by: Jeremy Cline <jcline@redhat.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f67202f7