1. 13 2月, 2019 40 次提交
    • C
      ALSA: compress: Fix stop handling on compressed capture streams · 93e7d510
      Charles Keepax 提交于
      commit 4f2ab5e1d13d6aa77c55f4914659784efd776eb4 upstream.
      
      It is normal user behaviour to start, stop, then start a stream
      again without closing it. Currently this works for compressed
      playback streams but not capture ones.
      
      The states on a compressed capture stream go directly from OPEN to
      PREPARED, unlike a playback stream which moves to SETUP and waits
      for a write of data before moving to PREPARED. Currently however,
      when a stop is sent the state is set to SETUP for both types of
      streams. This leaves a capture stream in the situation where a new
      start can't be sent as that requires the state to be PREPARED and
      a new set_params can't be sent as that requires the state to be
      OPEN. The only option being to close the stream, and then reopen.
      
      Correct this issues by allowing snd_compr_drain_notify to set the
      state depending on the stream direction, as we already do in
      set_params.
      
      Fixes: 49bb6402 ("ALSA: compress_core: Add support for capture streams")
      Signed-off-by: NCharles Keepax <ckeepax@opensource.cirrus.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: NTakashi Iwai <tiwai@suse.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93e7d510
    • B
      xfs: eof trim writeback mapping as soon as it is cached · 1f78052b
      Brian Foster 提交于
      commit aa6ee4ab69293969867ab09b57546d226ace3d7a upstream.
      
      The cached writeback mapping is EOF trimmed to try and avoid races
      between post-eof block management and writeback that result in
      sending cached data to a stale location. The cached mapping is
      currently trimmed on the validation check, which leaves a race
      window between the time the mapping is cached and when it is trimmed
      against the current inode size.
      
      For example, if a new mapping is cached by delalloc conversion on a
      blocksize == page size fs, we could cycle various locks, perform
      memory allocations, etc.  in the writeback codepath before the
      associated mapping is eventually trimmed to i_size. This leaves
      enough time for a post-eof truncate and file append before the
      cached mapping is trimmed. The former event essentially invalidates
      a range of the cached mapping and the latter bumps the inode size
      such the trim on the next writepage event won't trim all of the
      invalid blocks. fstest generic/464 reproduces this scenario
      occasionally and causes a lost writeback and stale delalloc blocks
      warning on inode inactivation.
      
      To work around this problem, trim the cached writeback mapping as
      soon as it is cached in addition to on subsequent validation checks.
      This is a minor tweak to tighten the race window as much as possible
      until a proper invalidation mechanism is available.
      
      Fixes: 40214d12 ("xfs: trim writepage mapping to within eof")
      Cc: <stable@vger.kernel.org> # v4.14+
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NAllison Henderson <allison.henderson@oracle.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f78052b
    • R
      net/mlx5e: FPGA, fix Innova IPsec TX offload data path performance · 1f1eb00c
      Raed Salem 提交于
      [ Upstream commit 82eaa1fa0448da1852d7b80832e67e80a08dcc27 ]
      
      At Innova IPsec TX offload data path a special software parser metadata
      is used to pass some packet attributes to the hardware, this metadata
      is passed using the Ethernet control segment of a WQE (a HW descriptor)
      header.
      
      The cited commit might nullify this header, hence the metadata is lost,
      this caused a significant performance drop during hw offloading
      operation.
      
      Fix by restoring the metadata at the Ethernet control segment in case
      it was nullified.
      
      Fixes: 37fdffb2 ("net/mlx5: WQ, fixes for fragmented WQ buffers API")
      Signed-off-by: NRaed Salem <raeds@mellanox.com>
      Reviewed-by: NTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f1eb00c
    • T
      virtio_net: Account for tx bytes and packets on sending xdp_frames · 9143e1dc
      Toshiaki Makita 提交于
      [ Upstream commit 546f28974d771b124fb0bf7b551b343888cf0419 ]
      
      Previously virtnet_xdp_xmit() did not account for device tx counters,
      which caused confusions.
      To be consistent with SKBs, account them on freeing xdp_frames.
      Reported-by: NDavid Ahern <dsahern@gmail.com>
      Signed-off-by: NToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: NMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9143e1dc
    • D
      skge: potential memory corruption in skge_get_regs() · c8ba1f79
      Dan Carpenter 提交于
      [ Upstream commit 294c149a209c6196c2de85f512b52ef50f519949 ]
      
      The "p" buffer is 0x4000 bytes long.  B3_RI_WTO_R1 is 0x190.  The value
      of "regs->len" is in the 1-0x4000 range.  The bug here is that
      "regs->len - B3_RI_WTO_R1" can be a negative value which would lead to
      memory corruption and an abrupt crash.
      
      Fixes: c3f8be96 ("[PATCH] skge: expand ethtool debug register dump")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8ba1f79
    • G
      sctp: walk the list of asoc safely · 7c236130
      Greg Kroah-Hartman 提交于
      [ Upstream commit ba59fb0273076637f0add4311faa990a5eec27c0 ]
      
      In sctp_sendmesg(), when walking the list of endpoint associations, the
      association can be dropped from the list, making the list corrupt.
      Properly handle this by using list_for_each_entry_safe()
      
      Fixes: 49102805 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
      Reported-by: NSecunia Research <vuln@secunia.com>
      Tested-by: NSecunia Research <vuln@secunia.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: NMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c236130
    • X
      sctp: check and update stream->out_curr when allocating stream_out · 7cd4e833
      Xin Long 提交于
      [ Upstream commit cfe4bd7a257f6d6f81d3458d8c9d9ec4957539e6 ]
      
      Now when using stream reconfig to add out streams, stream->out
      will get re-allocated, and all old streams' information will
      be copied to the new ones and the old ones will be freed.
      
      So without stream->out_curr updated, next time when trying to
      send from stream->out_curr stream, a panic would be caused.
      
      This patch is to check and update stream->out_curr when
      allocating stream_out.
      
      v1->v2:
        - define fa_index() to get elem index from stream->out_curr.
      v2->v3:
        - repost with no change.
      
      Fixes: 5bbbbe32 ("sctp: introduce stream scheduler foundations")
      Reported-by: NYing Xu <yinxu@redhat.com>
      Reported-by: syzbot+e33a3a138267ca119c7d@syzkaller.appspotmail.com
      Signed-off-by: NXin Long <lucien.xin@gmail.com>
      Acked-by: NNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7cd4e833
    • E
      rxrpc: bad unlock balance in rxrpc_recvmsg · 21b8697e
      Eric Dumazet 提交于
      [ Upstream commit 6dce3c20ac429e7a651d728e375853370c796e8d ]
      
      When either "goto wait_interrupted;" or "goto wait_error;"
      paths are taken, socket lock has already been released.
      
      This patch fixes following syzbot splat :
      
      WARNING: bad unlock balance detected!
      5.0.0-rc4+ #59 Not tainted
      -------------------------------------
      syz-executor223/8256 is trying to release lock (sk_lock-AF_RXRPC) at:
      [<ffffffff86651353>] rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
      but there are no more locks to release!
      
      other info that might help us debug this:
      1 lock held by syz-executor223/8256:
       #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline]
       #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2798
      
      stack backtrace:
      CPU: 1 PID: 8256 Comm: syz-executor223 Not tainted 5.0.0-rc4+ #59
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_unlock_imbalance_bug kernel/locking/lockdep.c:3391 [inline]
       print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3368
       __lock_release kernel/locking/lockdep.c:3601 [inline]
       lock_release+0x67e/0xa00 kernel/locking/lockdep.c:3860
       sock_release_ownership include/net/sock.h:1471 [inline]
       release_sock+0x183/0x1c0 net/core/sock.c:2808
       rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
       sock_recvmsg_nosec net/socket.c:794 [inline]
       sock_recvmsg net/socket.c:801 [inline]
       sock_recvmsg+0xd0/0x110 net/socket.c:797
       __sys_recvfrom+0x1ff/0x350 net/socket.c:1845
       __do_sys_recvfrom net/socket.c:1863 [inline]
       __se_sys_recvfrom net/socket.c:1859 [inline]
       __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:1859
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x446379
      Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fe5da89fd98 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
      RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446379
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
      R13: 0000000000000000 R14: 0000000000000000 R15: 20c49ba5e353f7cf
      
      Fixes: 248f219c ("rxrpc: Rewrite the data and ack handling code")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Cc: David Howells <dhowells@redhat.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21b8697e
    • R
      Revert "net: phy: marvell: avoid pause mode on SGMII-to-Copper for 88e151x" · 7c1a5ce6
      Russell King 提交于
      [ Upstream commit c14f07c6211cc01d52ed92cce1fade5071b8d197 ]
      
      This reverts commit 6623c0fb.
      
      The original diagnosis was incorrect: it appears that the NIC had
      PHY polling mode enabled, which meant that it overwrote the PHYs
      advertisement register during negotiation.
      Signed-off-by: NRussell King <rmk+kernel@armlinux.org.uk>
      Tested-by: NYonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c1a5ce6
    • E
      rds: fix refcount bug in rds_sock_addref · f2f054c4
      Eric Dumazet 提交于
      [ Upstream commit 6fa19f5637a6c22bc0999596bcc83bdcac8a4fa6 ]
      
      syzbot was able to catch a bug in rds [1]
      
      The issue here is that the socket might be found in a hash table
      but that its refcount has already be set to 0 by another cpu.
      
      We need to use refcount_inc_not_zero() to be safe here.
      
      [1]
      
      refcount_t: increment on 0; use-after-free.
      WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked lib/refcount.c:153 [inline]
      WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked+0x61/0x70 lib/refcount.c:151
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 1 PID: 23129 Comm: syz-executor3 Not tainted 5.0.0-rc4+ #53
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
       panic+0x2cb/0x65c kernel/panic.c:214
       __warn.cold+0x20/0x48 kernel/panic.c:571
       report_bug+0x263/0x2b0 lib/bug.c:186
       fixup_bug arch/x86/kernel/traps.c:178 [inline]
       fixup_bug arch/x86/kernel/traps.c:173 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
       invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
      RIP: 0010:refcount_inc_checked lib/refcount.c:153 [inline]
      RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:151
      Code: 1d 51 63 c8 06 31 ff 89 de e8 eb 1b f2 fd 84 db 75 dd e8 a2 1a f2 fd 48 c7 c7 60 9f 81 88 c6 05 31 63 c8 06 01 e8 af 65 bb fd <0f> 0b eb c1 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49
      RSP: 0018:ffff8880a0cbf1e8 EFLAGS: 00010282
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90006113000
      RDX: 000000000001047d RSI: ffffffff81685776 RDI: 0000000000000005
      RBP: ffff8880a0cbf1f8 R08: ffff888097c9e100 R09: ffffed1015ce5021
      R10: ffffed1015ce5020 R11: ffff8880ae728107 R12: ffff8880723c20c0
      R13: ffff8880723c24b0 R14: dffffc0000000000 R15: ffffed1014197e64
       sock_hold include/net/sock.h:647 [inline]
       rds_sock_addref+0x19/0x20 net/rds/af_rds.c:675
       rds_find_bound+0x97c/0x1080 net/rds/bind.c:82
       rds_recv_incoming+0x3be/0x1430 net/rds/recv.c:362
       rds_loop_xmit+0xf3/0x2a0 net/rds/loop.c:96
       rds_send_xmit+0x1355/0x2a10 net/rds/send.c:355
       rds_sendmsg+0x323c/0x44e0 net/rds/send.c:1368
       sock_sendmsg_nosec net/socket.c:621 [inline]
       sock_sendmsg+0xdd/0x130 net/socket.c:631
       __sys_sendto+0x387/0x5f0 net/socket.c:1788
       __do_sys_sendto net/socket.c:1800 [inline]
       __se_sys_sendto net/socket.c:1796 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1796
       do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458089
      Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fc266df8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000458089
      RDX: 0000000000000000 RSI: 00000000204b3fff RDI: 0000000000000005
      RBP: 000000000073bf00 R08: 00000000202b4000 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc266df96d4
      R13: 00000000004c56e4 R14: 00000000004d94a8 R15: 00000000ffffffff
      
      Fixes: cc4dfb7f ("rds: fix two RCU related problems")
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: rds-devel@oss.oracle.com
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: NSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2f054c4
    • F
      net: systemport: Fix WoL with password after deep sleep · 2c24ed77
      Florian Fainelli 提交于
      [ Upstream commit 8dfb8d2cceb76b74ad5b58cc65c75994329b4d5e ]
      
      Broadcom STB chips support a deep sleep mode where all register
      contents are lost. Because we were stashing the MagicPacket password
      into some of these registers a suspend into that deep sleep then a
      resumption would not lead to being able to wake-up from MagicPacket with
      password again.
      
      Fix this by keeping a software copy of the password and program it
      during suspend.
      
      Fixes: 83e82f4c ("net: systemport: add Wake-on-LAN support")
      Signed-off-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2c24ed77
    • C
      net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames · b72ea6ec
      Cong Wang 提交于
      [ Upstream commit e8c8b53ccaff568fef4c13a6ccaf08bf241aa01a ]
      
      When an ethernet frame is padded to meet the minimum ethernet frame
      size, the padding octets are not covered by the hardware checksum.
      Fortunately the padding octets are usually zero's, which don't affect
      checksum. However, we have a switch which pads non-zero octets, this
      causes kernel hardware checksum fault repeatedly.
      
      Prior to:
      commit '88078d98 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")'
      skb checksum was forced to be CHECKSUM_NONE when padding is detected.
      After it, we need to keep skb->csum updated, like what we do for RXFCS.
      However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP
      headers, it is not worthy the effort as the packets are so small that
      CHECKSUM_COMPLETE can't save anything.
      
      Fixes: 88078d98 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"),
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
      Signed-off-by: NCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: NSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b72ea6ec
    • R
      net: dsa: slave: Don't propagate flag changes on down slave interfaces · 8104a5e7
      Rundong Ge 提交于
      [ Upstream commit 17ab4f61b8cd6f9c38e9d0b935d86d73b5d0d2b5 ]
      
      The unbalance of master's promiscuity or allmulti will happen after ifdown
      and ifup a slave interface which is in a bridge.
      
      When we ifdown a slave interface , both the 'dsa_slave_close' and
      'dsa_slave_change_rx_flags' will clear the master's flags. The flags
      of master will be decrease twice.
      In the other hand, if we ifup the slave interface again, since the
      slave's flags were cleared the 'dsa_slave_open' won't set the master's
      flag, only 'dsa_slave_change_rx_flags' that triggered by 'br_add_if'
      will set the master's flags. The flags of master is increase once.
      
      Only propagating flag changes when a slave interface is up makes
      sure this does not happen. The 'vlan_dev_change_rx_flags' had the
      same problem and was fixed, and changes here follows that fix.
      
      Fixes: 91da11f8 ("net: Distributed Switch Architecture protocol support")
      Signed-off-by: NRundong Ge <rdong.ge@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8104a5e7
    • A
      net: dsa: mv88e6xxx: Fix counting of ATU violations · 27a2fa00
      Andrew Lunn 提交于
      [ Upstream commit 75c05a74e745ae7d663b04d75777af80ada2233c ]
      
      The ATU port vector contains a bit per port of the switch. The code
      wrongly used it as a port number, and incremented a port counter. This
      resulted in the wrong interfaces counter being incremented, and
      potentially going off the end of the array of ports.
      
      Fix this by using the source port ID for the violation, which really
      is a port number.
      Reported-by: NChris Healy <Chris.Healy@zii.aero>
      Tested-by: NChris Healy <Chris.Healy@zii.aero>
      Fixes: 65f60e45 ("net: dsa: mv88e6xxx: Keep ATU/VTU violation statistics")
      Signed-off-by: NAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      27a2fa00
    • D
      net: dsa: Fix NULL checking in dsa_slave_set_eee() · c8dfab5c
      Dan Carpenter 提交于
      [ Upstream commit 00670cb8a73b10b10d3c40f045c15411715e4465 ]
      
      This function can't succeed if dp->pl is NULL.  It will Oops inside the
      call to return phylink_ethtool_get_eee(dp->pl, e);
      
      Fixes: 1be52e97 ("dsa: slave: eee: Allow ports to use phylink")
      Signed-off-by: NDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: NVivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8dfab5c
    • M
      net: dsa: Fix lockdep false positive splat · 98cedccb
      Marc Zyngier 提交于
      [ Upstream commit c8101f7729daee251f4f6505f9d135ec08e1342f ]
      
      Creating a macvtap on a DSA-backed interface results in the following
      splat when lockdep is enabled:
      
      [   19.638080] IPv6: ADDRCONF(NETDEV_CHANGE): lan0: link becomes ready
      [   23.041198] device lan0 entered promiscuous mode
      [   23.043445] device eth0 entered promiscuous mode
      [   23.049255]
      [   23.049557] ============================================
      [   23.055021] WARNING: possible recursive locking detected
      [   23.060490] 5.0.0-rc3-00013-g56c857a1b8d3 #118 Not tainted
      [   23.066132] --------------------------------------------
      [   23.071598] ip/2861 is trying to acquire lock:
      [   23.076171] 00000000f61990cb (_xmit_ETHER){+...}, at: dev_set_rx_mode+0x1c/0x38
      [   23.083693]
      [   23.083693] but task is already holding lock:
      [   23.089696] 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
      [   23.096774]
      [   23.096774] other info that might help us debug this:
      [   23.103494]  Possible unsafe locking scenario:
      [   23.103494]
      [   23.109584]        CPU0
      [   23.112093]        ----
      [   23.114601]   lock(_xmit_ETHER);
      [   23.117917]   lock(_xmit_ETHER);
      [   23.121233]
      [   23.121233]  *** DEADLOCK ***
      [   23.121233]
      [   23.127325]  May be due to missing lock nesting notation
      [   23.127325]
      [   23.134315] 2 locks held by ip/2861:
      [   23.137987]  #0: 000000003b766c72 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x338/0x4e0
      [   23.146231]  #1: 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
      [   23.153757]
      [   23.153757] stack backtrace:
      [   23.158243] CPU: 0 PID: 2861 Comm: ip Not tainted 5.0.0-rc3-00013-g56c857a1b8d3 #118
      [   23.166212] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
      [   23.172843] Call trace:
      [   23.175358]  dump_backtrace+0x0/0x188
      [   23.179116]  show_stack+0x14/0x20
      [   23.182524]  dump_stack+0xb4/0xec
      [   23.185928]  __lock_acquire+0x123c/0x1860
      [   23.190048]  lock_acquire+0xc8/0x248
      [   23.193724]  _raw_spin_lock_bh+0x40/0x58
      [   23.197755]  dev_set_rx_mode+0x1c/0x38
      [   23.201607]  dev_set_promiscuity+0x3c/0x50
      [   23.205820]  dsa_slave_change_rx_flags+0x5c/0x70
      [   23.210567]  __dev_set_promiscuity+0x148/0x1e0
      [   23.215136]  __dev_set_rx_mode+0x74/0x98
      [   23.219167]  dev_uc_add+0x54/0x70
      [   23.222575]  macvlan_open+0x170/0x1d0
      [   23.226336]  __dev_open+0xe0/0x160
      [   23.229830]  __dev_change_flags+0x16c/0x1b8
      [   23.234132]  dev_change_flags+0x20/0x60
      [   23.238074]  do_setlink+0x2d0/0xc50
      [   23.241658]  __rtnl_newlink+0x5f8/0x6e8
      [   23.245601]  rtnl_newlink+0x50/0x78
      [   23.249184]  rtnetlink_rcv_msg+0x360/0x4e0
      [   23.253397]  netlink_rcv_skb+0xe8/0x130
      [   23.257338]  rtnetlink_rcv+0x14/0x20
      [   23.261012]  netlink_unicast+0x190/0x210
      [   23.265043]  netlink_sendmsg+0x288/0x350
      [   23.269075]  sock_sendmsg+0x18/0x30
      [   23.272659]  ___sys_sendmsg+0x29c/0x2c8
      [   23.276602]  __sys_sendmsg+0x60/0xb8
      [   23.280276]  __arm64_sys_sendmsg+0x1c/0x28
      [   23.284488]  el0_svc_common+0xd8/0x138
      [   23.288340]  el0_svc_handler+0x24/0x80
      [   23.292192]  el0_svc+0x8/0xc
      
      This looks fairly harmless (no actual deadlock occurs), and is
      fixed in a similar way to c6894dec ("bridge: fix lockdep
      addr_list_lock false positive splat") by putting the addr_list_lock
      in its own lockdep class.
      Signed-off-by: NMarc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: NFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98cedccb
    • S
      net: dp83640: expire old TX-skb · 8e1428c9
      Sebastian Andrzej Siewior 提交于
      [ Upstream commit 53bc8d2af08654659abfadfd3e98eb9922ff787c ]
      
      During sendmsg() a cloned skb is saved via dp83640_txtstamp() in
      ->tx_queue. After the NIC sends this packet, the PHY will reply with a
      timestamp for that TX packet. If the cable is pulled at the right time I
      don't see that packet. It might gets flushed as part of queue shutdown
      on NIC's side.
      Once the link is up again then after the next sendmsg() we enqueue
      another skb in dp83640_txtstamp() and have two on the list. Then the PHY
      will send a reply and decode_txts() attaches it to the first skb on the
      list.
      No crash occurs since refcounting works but we are one packet behind.
      linuxptp/ptp4l usually closes the socket and opens a new one (in such a
      timeout case) so those "stale" replies never get there. However it does
      not resume normal operation anymore.
      
      Purge old skbs in decode_txts().
      
      Fixes: cb646e2b ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: NKurt Kanzenbach <kurt@linutronix.de>
      Acked-by: NRichard Cochran <richardcochran@gmail.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8e1428c9
    • B
      lib/test_rhashtable: Make test_insert_dup() allocate its hash table dynamically · 81733c64
      Bart Van Assche 提交于
      [ Upstream commit fc42a689c4c097859e5bd37b5ea11b60dc426df6 ]
      
      The test_insert_dup() function from lib/test_rhashtable.c passes a
      pointer to a stack object to rhltable_init(). Allocate the hash table
      dynamically to avoid that the following is reported with object
      debugging enabled:
      
      ODEBUG: object (ptrval) is on stack (ptrval), but NOT annotated.
      WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:368 __debug_object_init+0x312/0x480
      Modules linked in:
      EIP: __debug_object_init+0x312/0x480
      Call Trace:
       ? debug_object_init+0x1a/0x20
       ? __init_work+0x16/0x30
       ? rhashtable_init+0x1e1/0x460
       ? sched_clock_cpu+0x57/0xe0
       ? rhltable_init+0xb/0x20
       ? test_insert_dup+0x32/0x20f
       ? trace_hardirqs_on+0x38/0xf0
       ? ida_dump+0x10/0x10
       ? jhash+0x130/0x130
       ? my_hashfn+0x30/0x30
       ? test_rht_init+0x6aa/0xab4
       ? ida_dump+0x10/0x10
       ? test_rhltable+0xc5c/0xc5c
       ? do_one_initcall+0x67/0x28e
       ? trace_hardirqs_off+0x22/0xe0
       ? restore_all_kernel+0xf/0x70
       ? trace_hardirqs_on_thunk+0xc/0x10
       ? restore_all_kernel+0xf/0x70
       ? kernel_init_freeable+0x142/0x213
       ? rest_init+0x230/0x230
       ? kernel_init+0x10/0x110
       ? schedule_tail_wrapper+0x9/0xc
       ? ret_from_fork+0x19/0x24
      
      Cc: Thomas Graf <tgraf@suug.ch>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: netdev@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: NBart Van Assche <bvanassche@acm.org>
      Acked-by: NHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81733c64
    • G
      enic: fix checksum validation for IPv6 · cedc42f5
      Govindarajulu Varadarajan 提交于
      [ Upstream commit 7596175e99b3d4bce28022193efd954c201a782a ]
      
      In case of IPv6 pkts, ipv4_csum_ok is 0. Because of this, driver does
      not set skb->ip_summed. So IPv6 rx checksum is not offloaded.
      Signed-off-by: NGovindarajulu Varadarajan <gvaradar@cisco.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cedc42f5
    • E
      dccp: fool proof ccid_hc_[rt]x_parse_options() · 15ed55e3
      Eric Dumazet 提交于
      [ Upstream commit 9b1f19d810e92d6cdc68455fbc22d9f961a58ce1 ]
      
      Similarly to commit 276bdb82 ("dccp: check ccid before dereferencing")
      it is wise to test for a NULL ccid.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.0.0-rc3+ #37
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
      RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
      Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
      kobject: 'loop5' (0000000080f78fc1): kobject_uevent_env
      RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
      RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
      RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
      R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f0defa33518 CR3: 000000008db5e000 CR4: 00000000001406e0
      kobject: 'loop5' (0000000080f78fc1): fill_kobj_path: path = '/devices/virtual/block/loop5'
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       dccp_rcv_state_process+0x2b6/0x1af6 net/dccp/input.c:654
       dccp_v4_do_rcv+0x100/0x190 net/dccp/ipv4.c:688
       sk_backlog_rcv include/net/sock.h:936 [inline]
       __sk_receive_skb+0x3a9/0xea0 net/core/sock.c:473
       dccp_v4_rcv+0x10cb/0x1f80 net/dccp/ipv4.c:880
       ip_protocol_deliver_rcu+0xb6/0xa20 net/ipv4/ip_input.c:208
       ip_local_deliver_finish+0x23b/0x390 net/ipv4/ip_input.c:234
       NF_HOOK include/linux/netfilter.h:289 [inline]
       NF_HOOK include/linux/netfilter.h:283 [inline]
       ip_local_deliver+0x1f0/0x740 net/ipv4/ip_input.c:255
       dst_input include/net/dst.h:450 [inline]
       ip_rcv_finish+0x1f4/0x2f0 net/ipv4/ip_input.c:414
       NF_HOOK include/linux/netfilter.h:289 [inline]
       NF_HOOK include/linux/netfilter.h:283 [inline]
       ip_rcv+0xed/0x620 net/ipv4/ip_input.c:524
       __netif_receive_skb_one_core+0x160/0x210 net/core/dev.c:4973
       __netif_receive_skb+0x2c/0x1c0 net/core/dev.c:5083
       process_backlog+0x206/0x750 net/core/dev.c:5923
       napi_poll net/core/dev.c:6346 [inline]
       net_rx_action+0x76d/0x1930 net/core/dev.c:6412
       __do_softirq+0x30b/0xb11 kernel/softirq.c:292
       run_ksoftirqd kernel/softirq.c:654 [inline]
       run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
       smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
       kthread+0x357/0x430 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      Modules linked in:
      ---[ end trace 58a0ba03bea2c376 ]---
      RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
      RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
      Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
      RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
      RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
      RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
      R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
      R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f0defa33518 CR3: 0000000009871000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Signed-off-by: NEric Dumazet <edumazet@google.com>
      Reported-by: Nsyzbot <syzkaller@googlegroups.com>
      Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      15ed55e3
    • E
      thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set · 56ea9164
      Eduardo Valentin 提交于
      commit 03334ba8b425b2ad275c8f390cf83c7b081c3095 upstream.
      
      Avoid warnings like this:
      thermal_hwmon.h:29:1: warning: ‘thermal_remove_hwmon_sysfs’ defined but not used [-Wunused-function]
       thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)
      
      Fixes: 0dd88793 ("thermal: hwmon: move hwmon support to single file")
      Reviewed-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: NEduardo Valentin <edubezval@gmail.com>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      56ea9164
    • E
      xfs: fix inverted return from xfs_btree_sblock_verify_crc · 0c802cba
      Eric Sandeen 提交于
      commit 7d048df4e9b05ba89b74d062df59498aa81f3785 upstream.
      
      xfs_btree_sblock_verify_crc is a bool so should not be returning
      a failaddr_t; worse, if xfs_log_check_lsn fails it returns
      __this_address which looks like a boolean true (i.e. success)
      to the caller.
      
      (interestingly xfs_btree_lblock_verify_crc doesn't have the issue)
      Signed-off-by: NEric Sandeen <sandeen@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      0c802cba
    • D
      xfs: fix PAGE_MASK usage in xfs_free_file_space · c6c20af6
      Darrick J. Wong 提交于
      commit a579121f94aba4e8bad1a121a0fad050d6925296 upstream.
      
      In commit e53c4b59, I *tried* to teach xfs to force writeback when we
      fzero/fpunch right up to EOF so that if EOF is in the middle of a page,
      the post-EOF part of the page gets zeroed before we return to userspace.
      Unfortunately, I missed the part where PAGE_MASK is ~(PAGE_SIZE - 1),
      which means that we totally fail to zero if we're fpunching and EOF is
      within the first page.  Worse yet, the same PAGE_MASK thinko plagues the
      filemap_write_and_wait_range call, so we'd initiate writeback of the
      entire file, which (mostly) masked the thinko.
      
      Drop the tricky PAGE_MASK and replace it with correct usage of PAGE_SIZE
      and the proper rounding macros.
      
      Fixes: e53c4b59 ("xfs: ensure post-EOF zeroing happens after zeroing part of a file")
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c6c20af6
    • Y
      fs/xfs: fix f_ffree value for statfs when project quota is set · 757332c6
      Ye Yin 提交于
      commit de7243057e7cefa923fa5f467c0f1ec24eef41d2 upsream.
      
      When project is set, we should use inode limit minus the used count
      Signed-off-by: NYe Yin <dbyin@tencent.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      757332c6
    • D
      xfs: delalloc -> unwritten COW fork allocation can go wrong · 886f0de1
      Dave Chinner 提交于
      commit 9230a0b65b47fe6856c4468ec0175c4987e5bede upstream.
      
      Long saga. There have been days spent following this through dead end
      after dead end in multi-GB event traces. This morning, after writing
      a trace-cmd wrapper that enabled me to be more selective about XFS
      trace points, I discovered that I could get just enough essential
      tracepoints enabled that there was a 50:50 chance the fsx config
      would fail at ~115k ops. If it didn't fail at op 115547, I stopped
      fsx at op 115548 anyway.
      
      That gave me two traces - one where the problem manifested, and one
      where it didn't. After refining the traces to have the necessary
      information, I found that in the failing case there was a real
      extent in the COW fork compared to an unwritten extent in the
      working case.
      
      Walking back through the two traces to the point where the CWO fork
      extents actually diverged, I found that the bad case had an extra
      unwritten extent in it. This is likely because the bug it led me to
      had triggered multiple times in those 115k ops, leaving stray
      COW extents around. What I saw was a COW delalloc conversion to an
      unwritten extent (as they should always be through
      xfs_iomap_write_allocate()) resulted in a /written extent/:
      
      xfs_writepage:        dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0
      xfs_iext_remove:      dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real
      xfs_bmap_pre_update:  dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real
      xfs_bmap_post_update: dev 259:0 ino 0x83 state RC|LF|RF|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex
      
      Basically, Cow fork before:
      
      	0 1            32          52
      	+H+DDDDDDDDDDDD+UUUUUUUUUUU+
      	   PREV		RIGHT
      
      COW delalloc conversion allocates:
      
      	  1	       32
      	  +uuuuuuuuuuuu+
      	  NEW
      
      And the result according to the xfs_bmap_post_update trace was:
      
      	0 1            32          52
      	+H+wwwwwwwwwwwwwwwwwwwwwwww+
      	   PREV
      
      Which is clearly wrong - it should be a merged unwritten extent,
      not an unwritten extent.
      
      That lead me to look at the LEFT_FILLING|RIGHT_FILLING|RIGHT_CONTIG
      case in xfs_bmap_add_extent_delay_real(), and sure enough, there's
      the bug.
      
      It takes the old delalloc extent (PREV) and adds the length of the
      RIGHT extent to it, takes the start block from NEW, removes the
      RIGHT extent and then updates PREV with the new extent.
      
      What it fails to do is update PREV.br_state. For delalloc, this is
      always XFS_EXT_NORM, while in this case we are converting the
      delayed allocation to unwritten, so it needs to be updated to
      XFS_EXT_UNWRITTEN. This LF|RF|RC case does not do this, and so
      the resultant extent is always written.
      
      And that's the bug I've been chasing for a week - a bmap btree bug,
      not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced
      with the recent in core extent tree scalability enhancements.
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      886f0de1
    • D
      xfs: fix transient reference count error in xfs_buf_resubmit_failed_buffers · 5a7455e9
      Dave Chinner 提交于
      commit d43aaf1685aa471f0593685c9f54d53e3af3cf3f upstream.
      
      When retrying a failed inode or dquot buffer,
      xfs_buf_resubmit_failed_buffers() clears all the failed flags from
      the inde/dquot log items. In doing so, it also drops all the
      reference counts on the buffer that the failed log items hold. This
      means it can drop all the active references on the buffer and hence
      free the buffer before it queues it for write again.
      
      Putting the buffer on the delwri queue takes a reference to the
      buffer (so that it hangs around until it has been written and
      completed), but this goes bang if the buffer has already been freed.
      
      Hence we need to add the buffer to the delwri queue before we remove
      the failed flags from the log items attached to the buffer to ensure
      it always remains referenced during the resubmit process.
      Reported-by: NJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      5a7455e9
    • B
      xfs: fix shared extent data corruption due to missing cow reservation · c3a66bf4
      Brian Foster 提交于
      commit 59e4293149106fb92530f8e56fa3992d8548c5e6 upstream.
      
      Page writeback indirectly handles shared extents via the existence
      of overlapping COW fork blocks. If COW fork blocks exist, writeback
      always performs the associated copy-on-write regardless if the
      underlying blocks are actually shared. If the blocks are shared,
      then overlapping COW fork blocks must always exist.
      
      fstests shared/010 reproduces a case where a buffered write occurs
      over a shared block without performing the requisite COW fork
      reservation.  This ultimately causes writeback to the shared extent
      and data corruption that is detected across md5 checks of the
      filesystem across a mount cycle.
      
      The problem occurs when a buffered write lands over a shared extent
      that crosses an extent size hint boundary and that also happens to
      have a partial COW reservation that doesn't cover the start and end
      blocks of the data fork extent.
      
      For example, a buffered write occurs across the file offset (in FSB
      units) range of [29, 57]. A shared extent exists at blocks [29, 35]
      and COW reservation already exists at blocks [32, 34]. After
      accommodating a COW extent size hint of 32 blocks and the existing
      reservation at offset 32, xfs_reflink_reserve_cow() allocates 32
      blocks of reservation at offset 0 and returns with COW reservation
      across the range of [0, 34]. The associated data fork extent is
      still [29, 35], however, which isn't fully covered by the COW
      reservation.
      
      This leads to a buffered write at file offset 35 over a shared
      extent without associated COW reservation. Writeback eventually
      kicks in, performs an overwrite of the underlying shared block and
      causes the associated data corruption.
      
      Update xfs_reflink_reserve_cow() to accommodate the fact that a
      delalloc allocation request may not fully cover the extent in the
      data fork. Trim the data fork extent appropriately, just as is done
      for shared extent boundaries and/or existing COW reservations that
      happen to overlap the start of the data fork extent. This prevents
      shared/010 failures due to data corruption on reflink enabled
      filesystems.
      Signed-off-by: NBrian Foster <bfoster@redhat.com>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      c3a66bf4
    • D
      xfs: fix overflow in xfs_attr3_leaf_verify · a96f3a55
      Dave Chinner 提交于
      commit 837514f7a4ca4aca06aec5caa5ff56d33ef06976 upstream.
      
      generic/070 on 64k block size filesystems is failing with a verifier
      corruption on writeback or an attribute leaf block:
      
      [   94.973083] XFS (pmem0): Metadata corruption detected at xfs_attr3_leaf_verify+0x246/0x260, xfs_attr3_leaf block 0x811480
      [   94.975623] XFS (pmem0): Unmount and run xfs_repair
      [   94.976720] XFS (pmem0): First 128 bytes of corrupted metadata buffer:
      [   94.978270] 000000004b2e7b45: 00 00 00 00 00 00 00 00 3b ee 00 00 00 00 00 00  ........;.......
      [   94.980268] 000000006b1db90b: 00 00 00 00 00 81 14 80 00 00 00 00 00 00 00 00  ................
      [   94.982251] 00000000433f2407: 22 7b 5c 82 2d 5c 47 4c bb 31 1c 37 fa a9 ce d6  "{\.-\GL.1.7....
      [   94.984157] 0000000010dc7dfb: 00 00 00 00 00 81 04 8a 00 0a 18 e8 dd 94 01 00  ................
      [   94.986215] 00000000d5a19229: 00 a0 dc f4 fe 98 01 68 f0 d8 07 e0 00 00 00 00  .......h........
      [   94.988171] 00000000521df36c: 0c 2d 32 e2 fe 20 01 00 0c 2d 58 65 fe 0c 01 00  .-2.. ...-Xe....
      [   94.990162] 000000008477ae06: 0c 2d 5b 66 fe 8c 01 00 0c 2d 71 35 fe 7c 01 00  .-[f.....-q5.|..
      [   94.992139] 00000000a4a6bca6: 0c 2d 72 37 fc d4 01 00 0c 2d d8 b8 f0 90 01 00  .-r7.....-......
      [   94.994789] XFS (pmem0): xfs_do_force_shutdown(0x8) called from line 1453 of file fs/xfs/xfs_buf.c. Return address = ffffffff815365f3
      
      This is failing this check:
      
                      end = ichdr.freemap[i].base + ichdr.freemap[i].size;
                      if (end < ichdr.freemap[i].base)
      >>>>>                   return __this_address;
                      if (end > mp->m_attr_geo->blksize)
                              return __this_address;
      
      And from the buffer output above, the freemap array is:
      
      	freemap[0].base = 0x00a0
      	freemap[0].size = 0xdcf4	end = 0xdd94
      	freemap[1].base = 0xfe98
      	freemap[1].size = 0x0168	end = 0x10000
      	freemap[2].base = 0xf0d8
      	freemap[2].size = 0x07e0	end = 0xf8b8
      
      These all look valid - the block size is 0x10000 and so from the
      last check in the above verifier fragment we know that the end
      of freemap[1] is valid. The problem is that end is declared as:
      
      	uint16_t	end;
      
      And (uint16_t)0x10000 = 0. So we have a verifier bug here, not a
      corruption. Fix the verifier to use uint32_t types for the check and
      hence avoid the overflow.
      
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=201577Signed-off-by: NDave Chinner <dchinner@redhat.com>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      a96f3a55
    • C
      xfs: Fix error code in 'xfs_ioc_getbmap()' · b6095cbd
      Christophe JAILLET 提交于
      commit 132bf6723749f7219c399831eeb286dbbb985429 upstream.
      
      In this function, once 'buf' has been allocated, we unconditionally
      return 0.
      However, 'error' is set to some error codes in several error handling
      paths.
      Before commit 232b5194 ("xfs: simplify the xfs_getbmap interface")
      this was not an issue because all error paths were returning directly,
      but now that some cleanup at the end may be needed, we must propagate the
      error code.
      
      Fixes: 232b5194 ("xfs: simplify the xfs_getbmap interface")
      Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr>
      Reviewed-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      b6095cbd
    • C
      xfs: cancel COW blocks before swapext · a585ac0e
      Christoph Hellwig 提交于
      commit 96987eea537d6ccd98704a71958f9ba02da80843 upstream.
      
      We need to make sure we have no outstanding COW blocks before we swap
      extents, as there is nothing preventing us from having preallocated COW
      delalloc on either inode that swapext is called on.  That case can
      easily be reproduced by running generic/324 in always_cow mode:
      
      [  620.760572] XFS: Assertion failed: tip->i_delayed_blks == 0, file: fs/xfs/xfs_bmap_util.c, line: 1669
      [  620.761608] ------------[ cut here ]------------
      [  620.762171] kernel BUG at fs/xfs/xfs_message.c:102!
      [  620.762732] invalid opcode: 0000 [#1] SMP PTI
      [  620.763272] CPU: 0 PID: 24153 Comm: xfs_fsr Tainted: G        W         4.19.0-rc1+ #4182
      [  620.764203] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
      [  620.765202] RIP: 0010:assfail+0x20/0x28
      [  620.765646] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
      [  620.767758] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
      [  620.768359] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
      [  620.769174] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
      [  620.769982] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
      [  620.770788] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
      [  620.771638] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
      [  620.772504] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
      [  620.773475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  620.774168] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
      [  620.774978] Call Trace:
      [  620.775274]  xfs_swap_extent_forks+0x2a0/0x2e0
      [  620.775792]  xfs_swap_extents+0x38b/0xab0
      [  620.776256]  xfs_ioc_swapext+0x121/0x140
      [  620.776709]  xfs_file_ioctl+0x328/0xc90
      [  620.777154]  ? rcu_read_lock_sched_held+0x50/0x60
      [  620.777694]  ? xfs_iunlock+0x233/0x260
      [  620.778127]  ? xfs_setattr_nonsize+0x3be/0x6a0
      [  620.778647]  do_vfs_ioctl+0x9d/0x680
      [  620.779071]  ? ksys_fchown+0x47/0x80
      [  620.779552]  ksys_ioctl+0x35/0x70
      [  620.780040]  __x64_sys_ioctl+0x11/0x20
      [  620.780530]  do_syscall_64+0x4b/0x190
      [  620.780927]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [  620.781467] RIP: 0033:0x7fdc364d0f07
      [  620.781900] Code: b3 66 90 48 8b 05 81 5f 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 28
      [  620.784044] RSP: 002b:00007ffe2a766038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [  620.784896] RAX: ffffffffffffffda RBX: 0000000000000025 RCX: 00007fdc364d0f07
      [  620.785667] RDX: 0000560296ca2fc0 RSI: 00000000c0c0586d RDI: 0000000000000005
      [  620.786398] RBP: 0000000000000025 R08: 0000000000001200 R09: 0000000000000000
      [  620.787283] R10: 0000000000000432 R11: 0000000000000246 R12: 0000000000000005
      [  620.788051] R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000006
      [  620.788927] Modules linked in:
      [  620.789340] ---[ end trace 9503b7417ffdbdb0 ]---
      [  620.790065] RIP: 0010:assfail+0x20/0x28
      [  620.790642] Code: 31 ff e8 83 fc ff ff 0f 0b c3 48 89 f1 41 89 d0 48 c7 c6 48 ca 8d 82 48 89 fa 38
      [  620.793038] RSP: 0018:ffffc9000898bc10 EFLAGS: 00010202
      [  620.793609] RAX: 0000000000000000 RBX: ffff88012f14ba40 RCX: 0000000000000000
      [  620.794317] RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffff828560d9
      [  620.795025] RBP: ffff88012f14b300 R08: 0000000000000000 R09: 0000000000000000
      [  620.795778] R10: 000000000000000a R11: f000000000000000 R12: ffffc9000898bc98
      [  620.796675] R13: ffffc9000898bc9c R14: ffff880130b5e2b8 R15: ffff88012a1fa2a8
      [  620.797782] FS:  00007fdc36e0fbc0(0000) GS:ffff88013ba00000(0000) knlGS:0000000000000000
      [  620.798908] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  620.799594] CR2: 00007fdc3604d000 CR3: 0000000132afc000 CR4: 00000000000006f0
      [  620.800424] Kernel panic - not syncing: Fatal exception
      [  620.801191] Kernel Offset: disabled
      [  620.801597] ---[ end Kernel panic - not syncing: Fatal exception ]---
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Reviewed-by: NDave Chinner <dchinner@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      a585ac0e
    • C
      xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat · 62c7c0a8
      Carlos Maiolino 提交于
      commit 41657e5507b13e963be906d5d874f4f02374fd5c upstream.
      
      The addition of FIBT, RMAP and REFCOUNT changed the offsets into
      __xfssats structure.
      
      This caused xqmstat_proc_show() to display garbage data via
      /proc/fs/xfs/xqmstat, once it relies on the offsets marked via macros.
      
      Fix it.
      
      Fixes: 00f4e4f9 xfs: add rmap btree stats infrastructure
      Fixes: aafc3c24 xfs: support the XFS_BTNUM_FINOBT free inode btree type
      Fixes: 46eeb521 xfs: introduce refcount btree definitions
      Signed-off-by: NCarlos Maiolino <cmaiolino@redhat.com>
      Reviewed-by: NEric Sandeen <sandeen@redhat.com>
      Signed-off-by: NDave Chinner <david@fromorbit.com>
      Signed-off-by: NLuis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      62c7c0a8
    • D
      scripts/gdb: fix lx-version string output · aacb2ab1
      Du Changbin 提交于
      [ Upstream commit b058809bfc8faeb7b7cae047666e23375a060059 ]
      
      A bug is present in GDB which causes early string termination when
      parsing variables.  This has been reported [0], but we should ensure
      that we can support at least basic printing of the core kernel strings.
      
      For current gdb version (has been tested with 7.3 and 8.1), 'lx-version'
      only prints one character.
      
        (gdb) lx-version
        L(gdb)
      
      This can be fixed by casting 'linux_banner' as (char *).
      
        (gdb) lx-version
        Linux version 4.19.0-rc1+ (changbin@acer) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #21 SMP Sat Sep 1 21:43:30 CST 2018
      
      [0] https://sourceware.org/bugzilla/show_bug.cgi?id=20077
      
      [kbingham@kernel.org: add detail to commit message]
      Link: http://lkml.kernel.org/r/20181111162035.8356-1-kieran.bingham@ideasonboard.com
      Fixes: 2d061d99 ("scripts/gdb: add version command")
      Signed-off-by: NDu Changbin <changbin.du@gmail.com>
      Signed-off-by: NKieran Bingham <kbingham@kernel.org>
      Acked-by: NJan Kiszka <jan.kiszka@siemens.com>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Daniel Thompson <daniel.thompson@linaro.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      aacb2ab1
    • A
      kernel/kcov.c: mark write_comp_data() as notrace · 58e57bcb
      Anders Roxell 提交于
      [ Upstream commit 634724431607f6f46c495dfef801a1c8b44a96d9 ]
      
      Since __sanitizer_cov_trace_const_cmp4 is marked as notrace, the
      function called from __sanitizer_cov_trace_const_cmp4 shouldn't be
      traceable either.  ftrace_graph_caller() gets called every time func
      write_comp_data() gets called if it isn't marked 'notrace'.  This is the
      backtrace from gdb:
      
       #0  ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
       #1  0xffffff8010201920 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
       #2  0xffffff8010439714 in write_comp_data (type=5, arg1=0, arg2=0, ip=18446743524224276596) at ../kernel/kcov.c:116
       #3  0xffffff8010439894 in __sanitizer_cov_trace_const_cmp4 (arg1=<optimized out>, arg2=<optimized out>) at ../kernel/kcov.c:188
       #4  0xffffff8010201874 in prepare_ftrace_return (self_addr=18446743524226602768, parent=0xffffff801014b918, frame_pointer=18446743524223531344) at ./include/generated/atomic-instrumented.h:27
       #5  0xffffff801020194c in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182
      
      Rework so that write_comp_data() that are called from
      __sanitizer_cov_trace_*_cmp*() are marked as 'notrace'.
      
      Commit 903e8ff86753 ("kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace")
      missed to mark write_comp_data() as 'notrace'. When that patch was
      created gcc-7 was used. In lib/Kconfig.debug
      config KCOV_ENABLE_COMPARISONS
      	depends on $(cc-option,-fsanitize-coverage=trace-cmp)
      
      That code path isn't hit with gcc-7. However, it were that with gcc-8.
      
      Link: http://lkml.kernel.org/r/20181206143011.23719-1-anders.roxell@linaro.orgSigned-off-by: NAnders Roxell <anders.roxell@linaro.org>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Co-developed-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      58e57bcb
    • O
      exec: load_script: don't blindly truncate shebang string · ab5f7407
      Oleg Nesterov 提交于
      [ Upstream commit 8099b047ecc431518b9bb6bdbba3549bbecdc343 ]
      
      load_script() simply truncates bprm->buf and this is very wrong if the
      length of shebang string exceeds BINPRM_BUF_SIZE-2.  This can silently
      truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
      happens to be the valid executable path.
      
      Change load_script() to return ENOEXEC if it can't find '\n' or zero in
      bprm->buf.  Note that '\0' can come from either
      prepare_binprm()->memset() or from kernel_read(), we do not care.
      
      Link: http://lkml.kernel.org/r/20181112160931.GA28463@redhat.comSigned-off-by: NOleg Nesterov <oleg@redhat.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Ben Woodard <woodard@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      ab5f7407
    • D
      fs/epoll: drop ovflist branch prediction · 9cb8f808
      Davidlohr Bueso 提交于
      [ Upstream commit 76699a67f3041ff4c7af6d6ee9be2bfbf1ffb671 ]
      
      The ep->ovflist is a secondary ready-list to temporarily store events
      that might occur when doing sproc without holding the ep->wq.lock.  This
      accounts for every time we check for ready events and also send events
      back to userspace; both callbacks, particularly the latter because of
      copy_to_user, can account for a non-trivial time.
      
      As such, the unlikely() check to see if the pointer is being used, seems
      both misleading and sub-optimal.  In fact, we go to an awful lot of
      trouble to sync both lists, and populating the ovflist is far from an
      uncommon scenario.
      
      For example, profiling a concurrent epoll_wait(2) benchmark, with
      CONFIG_PROFILE_ANNOTATED_BRANCHES shows that for a two threads a 33%
      incorrect rate was seen; and when incrementally increasing the number of
      epoll instances (which is used, for example for multiple queuing load
      balancing models), up to a 90% incorrect rate was seen.
      
      Similarly, by deleting the prediction, 3% throughput boost was seen
      across incremental threads.
      
      Link: http://lkml.kernel.org/r/20181108051006.18751-4-dave@stgolabs.netSigned-off-by: NDavidlohr Bueso <dbueso@suse.de>
      Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Jason Baron <jbaron@akamai.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      9cb8f808
    • L
      kernel/hung_task.c: force console verbose before panic · f0d32c54
      Liu, Chuansheng 提交于
      [ Upstream commit 168e06f7937d96c7222037d8a05565e8a6eb00fe ]
      
      Based on commit 401c636a ("kernel/hung_task.c: show all hung tasks
      before panic"), we could get the call stack of hung task.
      
      However, if the console loglevel is not high, we still can not see the
      useful panic information in practice, and in most cases users don't set
      console loglevel to high level.
      
      This patch is to force console verbose before system panic, so that the
      real useful information can be seen in the console, instead of being
      like the following, which doesn't have hung task information.
      
        INFO: task init:1 blocked for more than 120 seconds.
              Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
        "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
        Kernel panic - not syncing: hung_task: blocked tasks
        CPU: 2 PID: 479 Comm: khungtaskd Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
        Call Trace:
         dump_stack+0x4f/0x65
         panic+0xde/0x231
         watchdog+0x290/0x410
         kthread+0x12c/0x150
         ret_from_fork+0x35/0x40
        reboot: panic mode set: p,w
        Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      
      Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A6015B675@SHSMSX101.ccr.corp.intel.comSigned-off-by: NChuansheng Liu <chuansheng.liu@intel.com>
      Reviewed-by: NPetr Mladek <pmladek@suse.com>
      Reviewed-by: NSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      f0d32c54
    • C
      proc/sysctl: fix return error for proc_doulongvec_minmax() · 9beb84c0
      Cheng Lin 提交于
      [ Upstream commit 09be178400829dddc1189b50a7888495dd26aa84 ]
      
      If the number of input parameters is less than the total parameters, an
      EINVAL error will be returned.
      
      For example, we use proc_doulongvec_minmax to pass up to two parameters
      with kern_table:
      
      {
      	.procname       = "monitor_signals",
      	.data           = &monitor_sigs,
      	.maxlen         = 2*sizeof(unsigned long),
      	.mode           = 0644,
      	.proc_handler   = proc_doulongvec_minmax,
      },
      
      Reproduce:
      
      When passing two parameters, it's work normal.  But passing only one
      parameter, an error "Invalid argument"(EINVAL) is returned.
      
        [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
        [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
        1       2
        [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
        -bash: echo: write error: Invalid argument
        [root@cl150 ~]# echo $?
        1
        [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
        3       2
        [root@cl150 ~]#
      
      The following is the result after apply this patch.  No error is
      returned when the number of input parameters is less than the total
      parameters.
      
        [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
        [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
        1       2
        [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
        [root@cl150 ~]# echo $?
        0
        [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
        3       2
        [root@cl150 ~]#
      
      There are three processing functions dealing with digital parameters,
      __do_proc_dointvec/__do_proc_douintvec/__do_proc_doulongvec_minmax.
      
      This patch deals with __do_proc_doulongvec_minmax, just as
      __do_proc_dointvec does, adding a check for parameters 'left'.  In
      __do_proc_douintvec, its code implementation explicitly does not support
      multiple inputs.
      
      static int __do_proc_douintvec(...){
               ...
               /*
                * Arrays are not supported, keep this simple. *Do not* add
                * support for them.
                */
               if (vleft != 1) {
                       *lenp = 0;
                       return -EINVAL;
               }
               ...
      }
      
      So, just __do_proc_doulongvec_minmax has the problem.  And most use of
      proc_doulongvec_minmax/proc_doulongvec_ms_jiffies_minmax just have one
      parameter.
      
      Link: http://lkml.kernel.org/r/1544081775-15720-1-git-send-email-cheng.lin130@zte.com.cnSigned-off-by: NCheng Lin <cheng.lin130@zte.com.cn>
      Acked-by: NLuis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: NKees Cook <keescook@chromium.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      9beb84c0
    • T
      kernel/hung_task.c: break RCU locks based on jiffies · 9c8939b0
      Tetsuo Handa 提交于
      [ Upstream commit 304ae42739b108305f8d7b3eb3c1aec7c2b643a9 ]
      
      check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
      for every 1024 threads.  But check_hung_task() is very slow if printk()
      was called, and is very fast otherwise.
      
      If many threads within some 1024 threads called printk(), the RCU grace
      period might be extended enough to trigger RCU stall warnings.
      Therefore, calling rcu_lock_break() for every some fixed jiffies will be
      safer.
      
      Link: http://lkml.kernel.org/r/1544800658-11423-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jpSigned-off-by: NTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: NPaul E. McKenney <paulmck@linux.ibm.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      9c8939b0
    • D
      arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition · d69ad39a
      Dave Martin 提交于
      [ Upstream commit ee1b465b303591d3a04d403122bbc0d7026520fb ]
      
      SVE_PT_REGS_OFFSET is supposed to indicate the offset for skipping
      over the ptrace NT_ARM_SVE header (struct user_sve_header) to the
      start of the SVE register data proper.
      
      However, currently SVE_PT_REGS_OFFSET is defined in terms of struct
      sve_context, which is wrong: that structure describes the SVE
      header in the signal frame, not in the ptrace regset.
      
      This patch fixes the definition to use the ptrace header structure
      struct user_sve_header instead.
      
      By good fortune, the two structures are the same size anyway, so
      there is no functional or ABI change.
      Signed-off-by: NDave Martin <Dave.Martin@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d69ad39a
    • A
      HID: lenovo: Add checks to fix of_led_classdev_register · d921bb16
      Aditya Pakki 提交于
      [ Upstream commit 6ae16dfb61bce538d48b7fe98160fada446056c5 ]
      
      In lenovo_probe_tpkbd(), the function of_led_classdev_register() could
      return an error value that is unchecked. The fix adds these checks.
      Signed-off-by: NAditya Pakki <pakki001@umn.edu>
      Signed-off-by: NJiri Kosina <jkosina@suse.cz>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      d921bb16