1. 21 Dec, 2019 13 commits
    • tcp: tighten acceptance of ACKs not matching a child socket · 4b8a9869
      Guillaume Nault committed
      [ Upstream commit cb44a08f8647fd2e8db5cc9ac27cd8355fa392d8 ]
      
      When no synflood occurs, the synflood timestamp isn't updated.
      Therefore it can be so old that time_after32() can consider it to be
      in the future.
      
      That's a problem for tcp_synq_no_recent_overflow() as it may report
      that a recent overflow occurred while, in fact, it's just that jiffies
      has grown past 'last_overflow' + TCP_SYNCOOKIE_VALID + 2^31.
      
      Spurious detection of recent overflows leads to extra syncookie
      verification in cookie_v[46]_check(). At that point, the verification
      should fail and the packet should be dropped. But we should have dropped
      the packet earlier as we didn't even send a syncookie.
      
      Let's refine tcp_synq_no_recent_overflow() to report a recent overflow
      only if jiffies is within the
      [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval. This
      way, no spurious recent overflow is reported when jiffies wraps and
      'last_overflow' becomes in the future from the point of view of
      time_after32().
      
      However, if jiffies wraps and enters the
      [last_overflow, last_overflow + TCP_SYNCOOKIE_VALID] interval (with
      'last_overflow' being a stale synflood timestamp), then
      tcp_synq_no_recent_overflow() still erroneously reports an
      overflow. In such cases, we have to rely on syncookie verification
      to drop the packet. We unfortunately have no way to differentiate
      between a fresh and a stale syncookie timestamp.
      
      In practice, using last_overflow as lower bound is problematic.
      If the synflood timestamp is concurrently updated between the time
      we read jiffies and the moment we store the timestamp in
      'last_overflow', then 'now' becomes smaller than 'last_overflow' and
      tcp_synq_no_recent_overflow() returns true, potentially dropping a
      valid syncookie.
      
      Reading jiffies after loading the timestamp could fix the problem,
      but that'd require a memory barrier. Let's just accommodate for
      potential timestamp growth instead and extend the interval using
      'last_overflow - HZ' as lower bound.
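A minimal userspace sketch of the refined check described above, assuming HZ=1000 and a TCP_SYNCOOKIE_VALID of 120 seconds. The helper names between32() and no_recent_overflow() are illustrative stand-ins, not the kernel's code. The sketch only shows that an unsigned 32-bit interval test stays correct across jiffies wraparound and tolerates a timestamp that is up to HZ ahead of 'now'.

```c
#include <stdbool.h>
#include <stdint.h>

#define HZ                  1000u          /* assumed tick rate */
#define TCP_SYNCOOKIE_VALID (120u * HZ)    /* assumed validity window */

/* Wrap-safe test: is t inside [lo, hi] on the 32-bit circle? */
static bool between32(uint32_t t, uint32_t lo, uint32_t hi)
{
    return (uint32_t)(hi - lo) >= (uint32_t)(t - lo);
}

/*
 * Report "no recent overflow" unless 'now' lies within
 * [last_overflow - HZ, last_overflow + TCP_SYNCOOKIE_VALID].
 */
static bool no_recent_overflow(uint32_t now, uint32_t last_overflow)
{
    return !between32(now, last_overflow - HZ,
                      last_overflow + TCP_SYNCOOKIE_VALID);
}
```

With a stale timestamp of 0 and jiffies far ahead of it, no_recent_overflow() reports true, so the ACK can be dropped before any syncookie verification.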
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: fix rejected syncookies due to stale timestamps · bac9e8f3
      Guillaume Nault committed
      [ Upstream commit 04d26e7b159a396372646a480f4caa166d1b6720 ]
      
      If no synflood happens for a long enough period of time, then the
      synflood timestamp isn't refreshed and jiffies can advance so much
      that time_after32() can't accurately compare them any more.
      
      Therefore, we can end up in a situation where time_after32(now,
      last_overflow + HZ) returns false, just because these two values are
      too far apart. In that case, the synflood timestamp isn't updated as
      it should be, which can trick tcp_synq_no_recent_overflow() into
      rejecting valid syncookies.
      
      For example, let's consider the following scenario on a system
      with HZ=1000:
      
        * The synflood timestamp is 0, either because that's the timestamp
          of the last synflood or, more commonly, because we're working with
          a freshly created socket.
      
        * We receive a new SYN, which triggers synflood protection. Let's say
          that this happens when jiffies == 2147484649 (that is,
          'synflood timestamp' + HZ + 2^31 + 1).
      
        * Then tcp_synq_overflow() doesn't update the synflood timestamp,
          because time_after32(2147484649, 1000) returns false.
          With:
            - 2147484649: the value of jiffies, aka. 'now'.
            - 1000: the value of 'last_overflow' + HZ.
      
        * A bit later, we receive the ACK completing the 3WHS. But
          cookie_v[46]_check() rejects it because tcp_synq_no_recent_overflow()
          says that we're not under synflood. That's because
          time_after32(2147484649, 120000) returns false.
          With:
            - 2147484649: the value of jiffies, aka. 'now'.
            - 120000: the value of 'last_overflow' + TCP_SYNCOOKIE_VALID.
      
          Of course, in reality jiffies would have increased a bit, but this
          condition will last for the next 119 seconds, which is long enough
          to accommodate the growth of jiffies.
      
      Fix this by updating the overflow timestamp whenever jiffies isn't
      within the [last_overflow, last_overflow + HZ] range. That shouldn't
      have any performance impact since the update still happens at most once
      per second.
      
      Now we're guaranteed to have fresh timestamps while under synflood, so
      tcp_synq_no_recent_overflow() can safely use it with time_after32() in
      such situations.
      
      Stale timestamps can still make tcp_synq_no_recent_overflow() return
      the wrong verdict when not under synflood. This will be handled in the
      next patch.
      
      For 64-bit architectures, the problem was introduced with the
      conversion of ->tw_ts_recent_stamp to a 32-bit integer by commit
      cca9bab1 ("tcp: use monotonic timestamps for PAWS").
      The problem has always been there on 32-bit architectures.
      
      Fixes: cca9bab1 ("tcp: use monotonic timestamps for PAWS")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx5e: Query global pause state before setting prio2buffer · c5fc25e6
      Huy Nguyen committed
      [ Upstream commit 73e6551699a32fac703ceea09214d6580edcf2d5 ]
      
      When the user changes prio2buffer mapping while global pause is
      enabled, mlx5 driver incorrectly sets all active buffers
      (buffer that has at least one priority mapped) to lossy.
      
      Solution:
      If global pause is enabled, set all the active buffers to lossless
      in prio2buffer command.
      Also, add error message when buffer size is not enough to meet
      xoff threshold.
      
      Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
      Signed-off-by: Huy Nguyen <huyn@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: fix ordering of tipc module init and exit routine · 9430afbc
      Taehee Yoo committed
      [ Upstream commit 9cf1cd8ee3ee09ef2859017df2058e2f53c5347f ]
      
      In order to set/get/dump, tipc uses the generic netlink
      infrastructure. So, when the tipc module is inserted, its init function
      calls genl_register_family().
      After genl_register_family(), set/get/dump commands are immediately
      allowed and these callbacks internally use net_generic.
      net_generic is allocated by register_pernet_device(), but this
      is called after genl_register_family() in the __init function.
      So, these callbacks could use an uninitialized net_generic.
      
      Test commands:
          #SHELL1
          while :
          do
              modprobe tipc
              modprobe -rv tipc
          done
      
          #SHELL2
          while :
          do
              tipc link list
          done
      
      Splat looks like:
      [   59.616322][ T2788] kasan: CONFIG_KASAN_INLINE enabled
      [   59.617234][ T2788] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [   59.618398][ T2788] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [   59.619389][ T2788] CPU: 3 PID: 2788 Comm: tipc Not tainted 5.4.0+ #194
      [   59.620231][ T2788] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
      [   59.621428][ T2788] RIP: 0010:tipc_bcast_get_broadcast_mode+0x131/0x310 [tipc]
      [   59.622379][ T2788] Code: c7 c6 ef 8b 38 c0 65 ff 0d 84 83 c9 3f e8 d7 a5 f2 e3 48 8d bb 38 11 00 00 48 b8 00 00 00 00
      [   59.622550][ T2780] NET: Registered protocol family 30
      [   59.624627][ T2788] RSP: 0018:ffff88804b09f578 EFLAGS: 00010202
      [   59.624630][ T2788] RAX: dffffc0000000000 RBX: 0000000000000011 RCX: 000000008bc66907
      [   59.624631][ T2788] RDX: 0000000000000229 RSI: 000000004b3cf4cc RDI: 0000000000001149
      [   59.624633][ T2788] RBP: ffff88804b09f588 R08: 0000000000000003 R09: fffffbfff4fb3df1
      [   59.624635][ T2788] R10: fffffbfff50318f8 R11: ffff888066cadc18 R12: ffffffffa6cc2f40
      [   59.624637][ T2788] R13: 1ffff11009613eba R14: ffff8880662e9328 R15: ffff8880662e9328
      [   59.624639][ T2788] FS:  00007f57d8f7b740(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
      [   59.624645][ T2788] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   59.625875][ T2780] tipc: Started in single node mode
      [   59.626128][ T2788] CR2: 00007f57d887a8c0 CR3: 000000004b140002 CR4: 00000000000606e0
      [   59.633991][ T2788] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   59.635195][ T2788] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   59.636478][ T2788] Call Trace:
      [   59.637025][ T2788]  tipc_nl_add_bc_link+0x179/0x1470 [tipc]
      [   59.638219][ T2788]  ? lock_downgrade+0x6e0/0x6e0
      [   59.638923][ T2788]  ? __tipc_nl_add_link+0xf90/0xf90 [tipc]
      [   59.639533][ T2788]  ? tipc_nl_node_dump_link+0x318/0xa50 [tipc]
      [   59.640160][ T2788]  ? mutex_lock_io_nested+0x1380/0x1380
      [   59.640746][ T2788]  tipc_nl_node_dump_link+0x4fd/0xa50 [tipc]
      [   59.641356][ T2788]  ? tipc_nl_node_reset_link_stats+0x340/0x340 [tipc]
      [   59.642088][ T2788]  ? __skb_ext_del+0x270/0x270
      [   59.642594][ T2788]  genl_lock_dumpit+0x85/0xb0
      [   59.643050][ T2788]  netlink_dump+0x49c/0xed0
      [   59.643529][ T2788]  ? __netlink_sendskb+0xc0/0xc0
      [   59.644044][ T2788]  ? __netlink_dump_start+0x190/0x800
      [   59.644617][ T2788]  ? __mutex_unlock_slowpath+0xd0/0x670
      [   59.645177][ T2788]  __netlink_dump_start+0x5a0/0x800
      [   59.645692][ T2788]  genl_rcv_msg+0xa75/0xe90
      [   59.646144][ T2788]  ? __lock_acquire+0xdfe/0x3de0
      [   59.646692][ T2788]  ? genl_family_rcv_msg_attrs_parse+0x320/0x320
      [   59.647340][ T2788]  ? genl_lock_dumpit+0xb0/0xb0
      [   59.647821][ T2788]  ? genl_unlock+0x20/0x20
      [   59.648290][ T2788]  ? genl_parallel_done+0xe0/0xe0
      [   59.648787][ T2788]  ? find_held_lock+0x39/0x1d0
      [   59.649276][ T2788]  ? genl_rcv+0x15/0x40
      [   59.649722][ T2788]  ? lock_contended+0xcd0/0xcd0
      [   59.650296][ T2788]  netlink_rcv_skb+0x121/0x350
      [   59.650828][ T2788]  ? genl_family_rcv_msg_attrs_parse+0x320/0x320
      [   59.651491][ T2788]  ? netlink_ack+0x940/0x940
      [   59.651953][ T2788]  ? lock_acquire+0x164/0x3b0
      [   59.652449][ T2788]  genl_rcv+0x24/0x40
      [   59.652841][ T2788]  netlink_unicast+0x421/0x600
      [ ... ]
      
      Fixes: 7e436905 ("tipc: fix a slab object leak")
      Fixes: a62fbcce ("tipc: make subscriber server support net namespace")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: md5: fix potential overestimation of TCP option space · a148815a
      Eric Dumazet committed
      [ Upstream commit 9424e2e7ad93ffffa88f882c9bc5023570904b55 ]
      
      Back in 2008, Adam Langley fixed the corner case of packets for flows
      having all of the following options : MD5 TS SACK
      
      Since MD5 needs 20 bytes, and TS needs 12 bytes, no sack block
      can be cooked from the remaining 8 bytes.
      
      tcp_established_options() correctly sets opts->num_sack_blocks
      to zero, but returns 36 instead of 32.
      
      This means TCP cooks packets with 4 extra bytes at the end
      of options, containing uninitialized bytes.
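The arithmetic can be checked with a small sketch. It assumes the usual aligned option lengths (40 bytes of option space, 20 for MD5, 12 for timestamps, a 4-byte SACK header and 8 bytes per SACK block). The function options_size() and its with_fix flag are hypothetical and only mirror the accounting described above; they are not the body of tcp_established_options().

```c
#include <stdbool.h>

#define MAX_TCP_OPTION_SPACE       40
#define TCPOLEN_MD5SIG_ALIGNED     20
#define TCPOLEN_TSTAMP_ALIGNED     12
#define TCPOLEN_SACK_BASE_ALIGNED   4
#define TCPOLEN_SACK_PERBLOCK       8

/* Returns the option bytes accounted for; stores the SACK block count. */
static unsigned int options_size(bool md5, bool ts, unsigned int eff_sacks,
                                 bool with_fix, unsigned int *num_sack_blocks)
{
    unsigned int size = 0;

    *num_sack_blocks = 0;
    if (md5)
        size += TCPOLEN_MD5SIG_ALIGNED;
    if (ts)
        size += TCPOLEN_TSTAMP_ALIGNED;

    if (eff_sacks) {
        unsigned int remaining = MAX_TCP_OPTION_SPACE - size;
        unsigned int fit;

        /* With the fix: no room for even one block, so add nothing. */
        if (with_fix &&
            remaining < TCPOLEN_SACK_BASE_ALIGNED + TCPOLEN_SACK_PERBLOCK)
            return size;

        fit = (remaining - TCPOLEN_SACK_BASE_ALIGNED) / TCPOLEN_SACK_PERBLOCK;
        *num_sack_blocks = eff_sacks < fit ? eff_sacks : fit;
        /* Without the fix, the 4-byte header is counted even for 0 blocks. */
        size += TCPOLEN_SACK_BASE_ALIGNED +
                *num_sack_blocks * TCPOLEN_SACK_PERBLOCK;
    }
    return size;
}
```

With MD5, timestamps and one pending SACK block, the unfixed path yields 36 with zero blocks, while the fixed path yields 32.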
      
      Fixes: 33ad798c ("tcp: options clean up")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • openvswitch: support asymmetric conntrack · 6f99afcc
      Aaron Conole committed
      [ Upstream commit 5d50aa83e2c8e91ced2cca77c198b468ca9210f4 ]
      
      The openvswitch module shares a common conntrack and NAT infrastructure
      exposed via netfilter.  It's possible that a packet needs both SNAT and
      DNAT manipulation, due to e.g. tuple collision.  Netfilter can support
      this because it runs through the NAT table twice - once on ingress and
      again after egress.  The openvswitch module doesn't have such capability.
      
      Like netfilter hook infrastructure, we should run through NAT twice to
      keep the symmetry.
      
      Fixes: 05752523 ("openvswitch: Interface with NAT.")
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: thunderx: start phy before starting autonegotiation · 13156081
      Mian Yousaf Kaukab committed
      [ Upstream commit a350d2e7adbb57181d33e3aa6f0565632747feaa ]
      
      Since commit 2b3e88ea6528 ("net: phy: improve phy state checking")
      phy_start_aneg() expects phy state to be >= PHY_UP. Call phy_start()
      before calling phy_start_aneg() during probe so that autonegotiation
      is initiated.
      
      As phy_start() takes care of calling phy_start_aneg(), drop the explicit
      call to phy_start_aneg().
      
      Network fails without this patch on Octeon TX.
      
      Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
      Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: sched: fix dump qlen for sch_mq/sch_mqprio with NOLOCK subqueues · 0c5a4dd6
      Dust Li committed
      [ Upstream commit 2f23cd42e19c22c24ff0e221089b7b6123b117c5 ]
      
      sch->q.len isn't set in mq_dump() and mqprio_dump() when the
      subqueue is a NOLOCK qdisc.
      
      Fixes: ce679e8d ("net: sched: add support for TCQ_F_NOLOCK subqueues to sch_mqprio")
      Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
      Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: ti: cpsw: fix extra rx interrupt · 64334e4f
      Grygorii Strashko committed
      [ Upstream commit 51302f77bedab8768b761ed1899c08f89af9e4e2 ]
      
      Currently the RX interrupt is triggered twice every time, because in
      cpsw_rx_interrupt() it is acked first and then disabled. So there will
      always be a pending interrupt when the RX interrupt is enabled again in
      the NAPI handler.
      
      Fix it by first disabling the IRQ and then acking it.
      
      Fixes: 870915fe ("drivers: net: cpsw: remove disable_irq/enable_irq as irq can be masked from cpsw itself")
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: fix flow dissection on Tx path · a7d80e75
      Alexander Lobakin committed
      [ Upstream commit 8bef0af09a5415df761b04fa487a6c34acae74bc ]
      
      Commit 43e66528 ("net-next: dsa: fix flow dissection") added an
      ability to override protocol and network offset during flow dissection
      for DSA-enabled devices (i.e. controllers shipped as switch CPU ports)
      in order to fix skb hashing for RPS on Rx path.
      
      However, skb_hash() and added part of code can be invoked not only on
      Rx, but also on Tx path if we have a multi-queued device and:
       - kernel is running on UP system or
       - XPS is not configured.
      
      The call stack in these two cases will be like: dev_queue_xmit() ->
      __dev_queue_xmit() -> netdev_core_pick_tx() -> netdev_pick_tx() ->
      skb_tx_hash() -> skb_get_hash().
      
      The problem is that skbs queued for Tx have both network offset and
      correct protocol already set up even after inserting a CPU tag by DSA
      tagger, so calling tag_ops->flow_dissect() on this path actually only
      breaks flow dissection and hashing.
      
      This can be observed by adding debug prints just before and right after
      tag_ops->flow_dissect() call to the related block of code:
      
      Before the patch:
      
      Rx path (RPS):
      
      [   19.240001] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   19.244271] tag_ops->flow_dissect()
      [   19.247811] Rx: proto: 0x0800, nhoff: 8	/* ETH_P_IP */
      
      [   19.215435] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   19.219746] tag_ops->flow_dissect()
      [   19.223241] Rx: proto: 0x0806, nhoff: 8	/* ETH_P_ARP */
      
      [   18.654057] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   18.658332] tag_ops->flow_dissect()
      [   18.661826] Rx: proto: 0x8100, nhoff: 8	/* ETH_P_8021Q */
      
      Tx path (UP system):
      
      [   18.759560] Tx: proto: 0x0800, nhoff: 26	/* ETH_P_IP */
      [   18.763933] tag_ops->flow_dissect()
      [   18.767485] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      [   22.800020] Tx: proto: 0x0806, nhoff: 26	/* ETH_P_ARP */
      [   22.804392] tag_ops->flow_dissect()
      [   22.807921] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      [   16.898342] Tx: proto: 0x86dd, nhoff: 26	/* ETH_P_IPV6 */
      [   16.902705] tag_ops->flow_dissect()
      [   16.906227] Tx: proto: 0x920b, nhoff: 34	/* junk */
      
      After:
      
      Rx path (RPS):
      
      [   16.520993] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   16.525260] tag_ops->flow_dissect()
      [   16.528808] Rx: proto: 0x0800, nhoff: 8	/* ETH_P_IP */
      
      [   15.484807] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   15.490417] tag_ops->flow_dissect()
      [   15.495223] Rx: proto: 0x0806, nhoff: 8	/* ETH_P_ARP */
      
      [   17.134621] Rx: proto: 0x00f8, nhoff: 0	/* ETH_P_XDSA */
      [   17.138895] tag_ops->flow_dissect()
      [   17.142388] Rx: proto: 0x8100, nhoff: 8	/* ETH_P_8021Q */
      
      Tx path (UP system):
      
      [   15.499558] Tx: proto: 0x0800, nhoff: 26	/* ETH_P_IP */
      
      [   20.664689] Tx: proto: 0x0806, nhoff: 26	/* ETH_P_ARP */
      
      [   18.565782] Tx: proto: 0x86dd, nhoff: 26	/* ETH_P_IPV6 */
      
      In order to fix that, we can add the check 'proto == htons(ETH_P_XDSA)'
      to prevent the code from calling tag_ops->flow_dissect() on Tx.
      I also decided to initialize the 'offset' variable so tagger callbacks
      can now safely leave it untouched without causing chaos.
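A userspace sketch of the guard, under stated assumptions: struct flow_state, dsa_flow_adjust() and the two tagger callbacks are hypothetical stand-ins for the dissector state and tag_ops->flow_dissect(), and the 8-byte tag length is taken from the Rx debug output above. Only the ETH_P_XDSA and ETH_P_IP values are the real EtherType constants.

```c
#include <arpa/inet.h>   /* htons() */
#include <stdint.h>

#define ETH_P_XDSA 0x00F8
#define ETH_P_IP   0x0800

/* Hypothetical dissector state: protocol (network order) and offset. */
struct flow_state {
    uint16_t proto;
    int nhoff;
};

/* Hypothetical tagger callback signature. */
typedef void (*flow_dissect_fn)(uint16_t *proto, int *offset);

/* A tagger that reports the inner protocol and an 8-byte tag. */
static void full_tagger(uint16_t *proto, int *offset)
{
    *proto = htons(ETH_P_IP);
    *offset = 8;
}

/* A tagger that leaves 'offset' untouched. */
static void lazy_tagger(uint16_t *proto, int *offset)
{
    (void)offset;
    *proto = htons(ETH_P_IP);
}

static void dsa_flow_adjust(struct flow_state *st, flow_dissect_fn dissect)
{
    int offset = 0;   /* initialised, so a callback may skip setting it */

    /* Only frames still carrying the DSA tag (Rx path) are re-dissected. */
    if (st->proto != htons(ETH_P_XDSA))
        return;

    dissect(&st->proto, &offset);
    st->nhoff += offset;
}
```

An Rx frame with proto 0x00f8 gets its protocol and offset rewritten, while a Tx frame that already carries 0x0800 passes through unchanged.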
      
      Fixes: 43e66528 ("net-next: dsa: fix flow dissection")
      Signed-off-by: Alexander Lobakin <alobakin@dlink.ru>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: bridge: deny dev_set_mac_address() when unregistering · bb168ebe
      Nikolay Aleksandrov committed
      [ Upstream commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 ]
      
      We have an interesting memory leak in the bridge when it is being
      unregistered while enslaved to a master device that changes the MAC
      address of its slaves on unregister (e.g. bond, team). This is a very
      unusual setup, but we end up leaking 1 fdb entry because
      dev_set_mac_address() causes the bridge to insert the new MAC address
      into its table after all fdbs have been flushed. That is, after
      dellink() on the bridge has finished and NETDEV_UNREGISTER is sent, the
      bond/team releases the bridge and calls dev_set_mac_address() to restore
      its original address, which in turn adds an fdb entry in the bridge.
      One fix is to check the bridge dev's reg_state in its
      ndo_set_mac_address callback and return an error if the bridge is not in
      NETREG_REGISTERED.
      
      Easy steps to reproduce:
       1. add bond in mode != A/B
       2. add any slave to the bond
       3. add bridge dev as a slave to the bond
       4. destroy the bridge device
      
      Trace:
       unreferenced object 0xffff888035c4d080 (size 128):
         comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
         hex dump (first 32 bytes):
           41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00  A..6............
           d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00  ...^?...........
         backtrace:
           [<00000000ddb525dc>] kmem_cache_alloc+0x155/0x26f
           [<00000000633ff1e0>] fdb_create+0x21/0x486 [bridge]
           [<0000000092b17e9c>] fdb_insert+0x91/0xdc [bridge]
           [<00000000f2a0f0ff>] br_fdb_change_mac_address+0xb3/0x175 [bridge]
           [<000000001de02dbd>] br_stp_change_bridge_id+0xf/0xff [bridge]
           [<00000000ac0e32b1>] br_set_mac_address+0x76/0x99 [bridge]
           [<000000006846a77f>] dev_set_mac_address+0x63/0x9b
           [<00000000d30738fc>] __bond_release_one+0x3f6/0x455 [bonding]
           [<00000000fc7ec01d>] bond_netdev_event+0x2f2/0x400 [bonding]
           [<00000000305d7795>] notifier_call_chain+0x38/0x56
           [<0000000028885d4a>] call_netdevice_notifiers+0x1e/0x23
           [<000000008279477b>] rollback_registered_many+0x353/0x6a4
           [<0000000018ef753a>] unregister_netdevice_many+0x17/0x6f
           [<00000000ba854b7a>] rtnl_delete_link+0x3c/0x43
           [<00000000adf8618d>] rtnl_dellink+0x1dc/0x20a
           [<000000009b6395fd>] rtnetlink_rcv_msg+0x23d/0x268
      
      Fixes: 43598813 ("bridge: add local MAC address to forwarding table (v2)")
      Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mqprio: Fix out-of-bounds access in mqprio_dump · 588fac83
      Vladyslav Tarasiuk committed
      [ Upstream commit 9f104c7736904ac72385bbb48669e0c923ca879b ]
      
      When user runs a command like
      tc qdisc add dev eth1 root mqprio
      KASAN stack-out-of-bounds warning is emitted.
      Currently, NLA_ALIGN macro used in mqprio_dump provides too large
      buffer size as argument for nla_put and memcpy down the call stack.
      The flow looks like this:
      1. nla_put expects exact object size as an argument;
      2. Later it provides this size to memcpy;
      3. To calculate correct padding for SKB, nla_put applies NLA_ALIGN
         macro itself.
      
      Therefore, NLA_ALIGN should not be applied to the nla_put parameter.
      Otherwise it will lead to out-of-bounds memory access in memcpy.
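The over-read can be illustrated with a toy model. NLA_ALIGN below is written out with the usual 4-byte netlink attribute alignment. toy_put() and struct toy_opt are hypothetical stand-ins for nla_put() and the dumped option struct, used only to show that the caller must pass the exact object size and let the put routine do the padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NLA_ALIGNTO    4
#define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))

/* Hypothetical payload whose size (6) is not a multiple of 4. */
struct toy_opt {
    uint16_t a, b, c;
};

/*
 * Toy stand-in for nla_put(): copies exactly attrlen bytes from data,
 * zero-fills the padding itself, and returns the aligned space consumed.
 */
static size_t toy_put(unsigned char *buf, const void *data, size_t attrlen)
{
    memcpy(buf, data, attrlen);
    memset(buf + attrlen, 0, NLA_ALIGN(attrlen) - attrlen);
    return NLA_ALIGN(attrlen);
}

/* Bytes read past the object if the caller pre-aligns the length. */
static size_t overread_bytes(size_t object_size)
{
    return NLA_ALIGN(object_size) - object_size;
}
```

Passing sizeof(struct toy_opt) to toy_put() copies 6 bytes and pads 2. Passing NLA_ALIGN(sizeof(struct toy_opt)) would make memcpy() read 2 bytes beyond the object.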
      
      Fixes: 4e8b86c0 ("mqprio: Introduce new hardware offload mode and shaper in mqprio")
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • inet: protect against too small mtu values. · d80d67cd
      Eric Dumazet committed
      [ Upstream commit 501a90c945103e8627406763dac418f20f3837b2 ]
      
      syzbot was once again able to crash a host by setting a very small mtu
      on loopback device.
      
      Let's make inetdev_valid_mtu() available in include/net/ip.h,
      and use it in ip_setup_cork(), so that we protect both ip_append_page()
      and __ip_append_data()
      
      Also add a READ_ONCE() when the device mtu is read.
      
      Pair this lockless read with one WRITE_ONCE() in __dev_set_mtu(),
      even if other code paths might write over this field.
      
      Add a big comment in include/linux/netdevice.h about dev->mtu
      needing READ_ONCE()/WRITE_ONCE() annotations.
      
      Hopefully we will add the missing ones in followup patches.
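A userspace sketch of the check, assuming a minimum IPv4 MTU of 68 bytes. struct toy_dev, read_mtu_once() and toy_setup_cork() are hypothetical. The volatile access only approximates what READ_ONCE() provides in the kernel, and the -ENETUNREACH return value is an assumption about how the cork setup reports the failure.

```c
#include <errno.h>
#include <stdbool.h>

#define IPV4_MIN_MTU 68   /* assumed minimum valid IPv4 MTU */

/* Hypothetical device with an MTU that may change concurrently. */
struct toy_dev {
    unsigned int mtu;
};

static bool inetdev_valid_mtu(unsigned int mtu)
{
    return mtu >= IPV4_MIN_MTU;
}

/* Read the MTU exactly once, so the check and the use see the same value. */
static unsigned int read_mtu_once(const struct toy_dev *dev)
{
    return *(const volatile unsigned int *)&dev->mtu;
}

/* Refuse to set up a cork whose fragment size would be too small. */
static int toy_setup_cork(const struct toy_dev *dev, unsigned int *fragsize)
{
    unsigned int mtu = read_mtu_once(dev);

    if (!inetdev_valid_mtu(mtu))
        return -ENETUNREACH;

    *fragsize = mtu;
    return 0;
}
```

Validating the snapshot rather than re-reading dev->mtu keeps a concurrent writer from slipping a tiny value in between the check and the use.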
      
      [1]
      
      refcount_t: saturated; leaking memory.
      WARNING: CPU: 0 PID: 9464 at lib/refcount.c:22 refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Kernel panic - not syncing: panic_on_warn set ...
      CPU: 0 PID: 9464 Comm: syz-executor850 Not tainted 5.4.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       panic+0x2e3/0x75c kernel/panic.c:221
       __warn.cold+0x2f/0x3e kernel/panic.c:582
       report_bug+0x289/0x300 lib/bug.c:195
       fixup_bug arch/x86/kernel/traps.c:174 [inline]
       fixup_bug arch/x86/kernel/traps.c:169 [inline]
       do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
       do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
       invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
      RIP: 0010:refcount_warn_saturate+0x138/0x1f0 lib/refcount.c:22
      Code: 06 31 ff 89 de e8 c8 f5 e6 fd 84 db 0f 85 6f ff ff ff e8 7b f4 e6 fd 48 c7 c7 e0 71 4f 88 c6 05 56 a6 a4 06 01 e8 c7 a8 b7 fd <0f> 0b e9 50 ff ff ff e8 5c f4 e6 fd 0f b6 1d 3d a6 a4 06 31 ff 89
      RSP: 0018:ffff88809689f550 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff815e4336 RDI: ffffed1012d13e9c
      RBP: ffff88809689f560 R08: ffff88809c50a3c0 R09: fffffbfff15d31b1
      R10: fffffbfff15d31b0 R11: ffffffff8ae98d87 R12: 0000000000000001
      R13: 0000000000040100 R14: ffff888099041104 R15: ffff888218d96e40
       refcount_add include/linux/refcount.h:193 [inline]
       skb_set_owner_w+0x2b6/0x410 net/core/sock.c:1999
       sock_wmalloc+0xf1/0x120 net/core/sock.c:2096
       ip_append_page+0x7ef/0x1190 net/ipv4/ip_output.c:1383
       udp_sendpage+0x1c7/0x480 net/ipv4/udp.c:1276
       inet_sendpage+0xdb/0x150 net/ipv4/af_inet.c:821
       kernel_sendpage+0x92/0xf0 net/socket.c:3794
       sock_sendpage+0x8b/0xc0 net/socket.c:936
       pipe_to_sendpage+0x2da/0x3c0 fs/splice.c:458
       splice_from_pipe_feed fs/splice.c:512 [inline]
       __splice_from_pipe+0x3ee/0x7c0 fs/splice.c:636
       splice_from_pipe+0x108/0x170 fs/splice.c:671
       generic_splice_sendpage+0x3c/0x50 fs/splice.c:842
       do_splice_from fs/splice.c:861 [inline]
       direct_splice_actor+0x123/0x190 fs/splice.c:1035
       splice_direct_to_actor+0x3b4/0xa30 fs/splice.c:990
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1078
       do_sendfile+0x597/0xd00 fs/read_write.c:1464
       __do_sys_sendfile64 fs/read_write.c:1525 [inline]
       __se_sys_sendfile64 fs/read_write.c:1511 [inline]
       __x64_sys_sendfile64+0x1dd/0x220 fs/read_write.c:1511
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x441409
      Code: e8 ac e8 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fffb64c4f78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441409
      RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000005
      RBP: 0000000000073b8a R08: 0000000000000010 R09: 0000000000000010
      R10: 0000000000010001 R11: 0000000000000246 R12: 0000000000402180
      R13: 0000000000402210 R14: 0000000000000000 R15: 0000000000000000
      Kernel Offset: disabled
      Rebooting in 86400 seconds..
      
      Fixes: 1470ddf7 ("inet: Remove explicit write references to sk/inet in ip_append_data")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 18 Dec, 2019 27 commits
    • Linux 4.19.90 · 7d120bf2
      Greg Kroah-Hartman committed
    • of: unittest: fix memory leak in attach_node_and_children · b65a9b44
      Erhard Furtner committed
      [ Upstream commit 2aacace6dbbb6b6ce4e177e6c7ea901f389c0472 ]
      
      In attach_node_and_children memory is allocated for full_name via
      kasprintf. If the condition of the 1st if is not met the function
      returns early without freeing the memory. Add a kfree() to fix that.
      
      This has been detected with kmemleak:
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=205327
      
      It looks like the leak was introduced by this commit:
      Fixes: 5babefb7f7ab ("of: unittest: allow base devicetree to have symbol metadata")
      Signed-off-by: Erhard Furtner <erhard_f@mailbox.org>
      Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: zorro_esp: Limit DMA transfers to 65536 bytes (except on Fastlane) · e62b2baf
      Kars de Jong committed
      [ Upstream commit 02f7e9f351a9de95577eafdc3bd413ed1c3b589f ]
      
      When using this driver on a Blizzard 1260, there were failures whenever DMA
      transfers from the SCSI bus to memory of 65535 bytes were followed by a DMA
      transfer of 1 byte. This caused the byte at offset 65535 to be overwritten
      with 0xff. The Blizzard hardware can't handle single byte DMA transfers.
      
      Besides this issue, limiting the DMA length to something that is not a
      multiple of the page size is very inefficient on most file systems.
      
      It seems this limit was chosen because the DMA transfer counter of the ESP
      by default is 16 bits wide, thus limiting the length to 65535 bytes.
      However, the value 0 means 65536 bytes, which is handled by the ESP and the
      Blizzard just fine. It is also the default maximum used by esp_scsi when
      drivers don't provide their own dma_length_limit() function.
      
      The limit of 65536 bytes can be used by all boards except the Fastlane. The
      old driver used a limit of 65532 bytes (0xfffc), which is reintroduced in
      this patch.
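The limits can be sketched as follows. The function names and the 'fastlane' flag are hypothetical and only model the rule stated above: a 16-bit transfer counter in which the value 0 means 65536 bytes, and a lower cap of 0xfffc on the Fastlane.

```c
#include <stdbool.h>
#include <stdint.h>

#define ESP_DMA_LIMIT      0x10000u  /* 65536 bytes: counter value 0 */
#define FASTLANE_DMA_LIMIT 0xfffcu   /* 65532 bytes, as in the old driver */

/* Clamp a requested DMA length to what the board can handle. */
static uint32_t dma_length_limit(uint32_t dma_len, bool fastlane)
{
    uint32_t limit = fastlane ? FASTLANE_DMA_LIMIT : ESP_DMA_LIMIT;

    return dma_len > limit ? limit : dma_len;
}

/* Value programmed into a 16-bit counter: 65536 wraps to 0. */
static uint16_t transfer_counter(uint32_t dma_len)
{
    return (uint16_t)dma_len;
}
```

A 65536-byte request is therefore sent as a single transfer with a counter value of 0, instead of a 65535-byte transfer followed by a 1-byte transfer.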
      
      Fixes: b7ded0e8b0d1 ("scsi: zorro_esp: Limit DMA transfers to 65535 bytes")
      Link: https://lore.kernel.org/r/20191112175523.23145-1-jongk@linux-m68k.orgSigned-off-by: NKars de Jong <jongk@linux-m68k.org>
      Reviewed-by: NFinn Thain <fthain@telegraphics.com.au>
      Signed-off-by: NMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: NSasha Levin <sashal@kernel.org>
      e62b2baf
    • M
      idr: Fix idr_get_next_ul race with idr_remove · 0cec640d
      Matthew Wilcox (Oracle) committed
      [ Upstream commit 5a74ac4c4a97bd8b7dba054304d598e2a882fea6 ]
      
      Commit 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove")
      neglected to fix idr_get_next_ul().  As far as I can tell, nobody's
      actually using this interface under the RCU read lock, but fix it now
      before anybody decides to use it.
      
      Fixes: 5c089fd0c734 ("idr: Fix idr_get_next race with idr_remove")
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0cec640d
    • J
      iio: imu: mpu6050: add missing available scan masks · 052d878c
      Jean-Baptiste Maneyrol committed
      [ Upstream commit 1244a720572fd1680ac8d6b8a4235f2e8557b810 ]
      
      Driver only supports 3-axis gyro and/or 3-axis accel.
      For icm20602, temp data is mandatory for all configurations.
      
      Fix all single and double axis configurations (almost never used) and more
      importantly fix 3-axis gyro and 6-axis accel+gyro buffer on icm20602 when
      temp data is not enabled.
      Signed-off-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com>
      Fixes: 1615fe41a195 ("iio: imu: mpu6050: Fix FIFO layout for ICM20602")
      Cc: <Stable@vger.kernel.org>
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      052d878c
    • R
      scsi: qla2xxx: Change discovery state before PLOGI · 89f3ac7e
      Roman Bolshakov committed
      [ Upstream commit 58e39a2ce4be08162c0368030cdc405f7fd849aa ]
      
      When a port sends PLOGI, discovery state should be changed to login
      pending, otherwise RELOGIN_NEEDED bit is set in
      qla24xx_handle_plogi_done_event(). RELOGIN_NEEDED triggers another PLOGI,
      and it never goes out of the loop until login timer expires.
      
      Fixes: 8777e431 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine")
      Fixes: 8b5292bcfcacf ("scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag")
      Cc: Quinn Tran <qutran@marvell.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20191125165702.1013-6-r.bolshakov@yadro.com
      Acked-by: Himanshu Madhani <hmadhani@marvell.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Tested-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      89f3ac7e
    • G
      raid5: need to set STRIPE_HANDLE for batch head · a40982c7
      Guoqing Jiang committed
      [ Upstream commit a7ede3d16808b8f3915c8572d783530a82b2f027 ]
      
      With commit 6ce220dd2f8ea71d6afc29b9a7524c12e39f374a ("raid5: don't set
      STRIPE_HANDLE to stripe which is in batch list"), we don't want to set
      STRIPE_HANDLE flag for sh which is already in batch list.
      
      However, the stripe which is the head of the batch list should set this flag;
      otherwise a panic could happen inside init_stripe at BUG_ON(sh->batch_head).
      It is reproducible with raid5 on top of nvdimm devices, as Xiao observed.
      
      Thanks for Xiao's effort to verify the change.
      
      Fixes: 6ce220dd2f8ea ("raid5: don't set STRIPE_HANDLE to stripe which is in batch list")
      Reported-by: Xiao Ni <xni@redhat.com>
      Tested-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a40982c7
    • H
      gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist · 2de65064
      Hans de Goede committed
      [ Upstream commit 2727315df3f5ffbebcb174eed3153944a858b66f ]
      
      The Terra Pad 1061 has the usual micro-USB-B id-pin handler, but instead
      of controlling the actual micro-USB-B it turns the 5V boost for the
      tablet's USB-A connector and its keyboard-cover connector off.
      
      The actual micro-USB-B connector on the tablet is wired for charging only,
      and its id pin is *not* connected to the GPIO which is used for the
      (broken) id-pin event handler in the DSDT.
      
      While at it not only add a comment why the Terra Pad 1061 is on the
      blacklist, but also fix the missing comment for the Minix Neo Z83-4 entry.
      
      Fixes: 61f7f7c8f978 ("gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist")
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2de65064
    • P
      cifs: Fix potential softlockups while refreshing DFS cache · 14cb20ad
      Paulo Alcantara (SUSE) committed
      [ Upstream commit 84a1f5b1cc6fd7f6cd99fc5630c36f631b19fa60 ]
      
      We used to skip reconnects on all SMB2_IOCTL commands due to SMB3+
      FSCTL_VALIDATE_NEGOTIATE_INFO - which made sense since we're still
      establishing a SMB session.
      
      However, when refresh_cache_worker() calls smb2_get_dfs_refer() and
      we're under reconnect, SMB2_ioctl() will not be able to get a proper
      status error (e.g. -EHOSTDOWN in case we failed to reconnect) but an
      -EAGAIN from cifs_send_recv() thus looping forever in
      refresh_cache_worker().
      
      Fixes: e99c63e4d86d ("SMB3: Fix deadlock in validate negotiate hits reconnect")
      Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
      Suggested-by: Aurelien Aptel <aaptel@suse.com>
      Reviewed-by: Aurelien Aptel <aaptel@suse.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      14cb20ad
    • K
      kernel/module.c: wakeup processes in module_wq on module unload · 12c88d91
      Konstantin Khorenko committed
      [ Upstream commit 5d603311615f612320bb77bd2a82553ef1ced5b7 ]
      
      Fix the race between loading and unloading a kernel module.
      
      sys_delete_module()
       try_stop_module()
        mod->state = _GOING
      					add_unformed_module()
      					 old = find_module_all()
      					 (old->state == _GOING =>
      					  wait_event_interruptible())
      
      					 During pre-condition
      					 finished_loading() rets 0
      					 schedule()
      					 (never gets waken up later)
       free_module()
        mod->state = _UNFORMED
         list_del_rcu(&mod->list)
         (dels mod from "modules" list)
      
      return
      
      The race above leads to modprobe hanging forever on loading
      a module.
      
      Error paths on loading module call wake_up_all(&module_wq) after
      freeing module, so let's do the same on straight module unload.
      
      Fixes: 6e6de3dee51a ("kernel/module.c: Only return -EEXIST for modules that have finished loading")
      Reviewed-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
      Signed-off-by: Jessica Yu <jeyu@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      12c88d91
    • F
      of: overlay: add_changeset_property() memory leak · 0773dcee
      Frank Rowand committed
      [ Upstream commit 637392a8506a3a7dd24ab9094a14f7522adb73b4 ]
      
      No changeset entries are created for #address-cells and #size-cells
      properties, but the duplicated properties are never freed.  This
      results in a memory leak which is detected by kmemleak:
      
       unreferenced object 0x85887180 (size 64):
         backtrace:
           kmem_cache_alloc_trace+0x1fb/0x1fc
           __of_prop_dup+0x25/0x7c
           add_changeset_property+0x17f/0x370
           build_changeset_next_level+0x29/0x20c
           of_overlay_fdt_apply+0x32b/0x6b4
           ...
      
      Fixes: 6f75118800ac ("of: overlay: validate overlay properties #address-cells and #size-cells")
      Reported-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: Frank Rowand <frank.rowand@sony.com>
      Tested-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0773dcee
    • B
      gfs2: fix glock reference problem in gfs2_trans_remove_revoke · 0809e108
      Bob Peterson committed
      [ Upstream commit fe5e7ba11fcf1d75af8173836309e8562aefedef ]
      
      Commit 9287c6452d2b fixed a situation in which gfs2 could use a glock
      after it had been freed. To do that, it temporarily added a new glock
      reference by calling gfs2_glock_hold in function gfs2_add_revoke.
      However, if the bd element was removed by gfs2_trans_remove_revoke, it
      failed to drop the additional reference.
      
      This patch adds logic to gfs2_trans_remove_revoke to properly drop the
      additional glock reference.
      
      Fixes: 9287c6452d2b ("gfs2: Fix occasional glock use-after-free")
      Cc: stable@vger.kernel.org # v5.2+
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0809e108
    • Y
      PCI: rcar: Fix missing MACCTLR register setting in initialization sequence · 2de11b2e
      Yoshihiro Shimoda committed
      [ Upstream commit 7c7e53e1c93df14690bd12c1f84730fef927a6f1 ]
      
      The R-Car Gen2/3 manual - available at:
      
      https://www.renesas.com/eu/en/products/microcontrollers-microprocessors/rz/rzg/rzg1m.html#documents
      
      "RZ/G Series User's Manual: Hardware" section
      
      strictly enforces the MACCTLR initialization value - 39.3.1 - "Initial
      Setting of PCI Express":
      
      "Be sure to write the initial value (= H'80FF 0000) to MACCTLR before
      enabling PCIETCTLR.CFINIT".
      
      To avoid unexpected behavior and to match the SW initialization sequence
      guidelines, this patch programs the MACCTLR with the correct value.
      
      Note that the MACCTLR.SPCHG bit in the MACCTLR register description
      reports that "Only writing 1 is valid and writing 0 is invalid" but this
      "invalid" has to be interpreted as a write-ignore aka "ignored", not
      "prohibited".
      Reported-by: Eugeniu Rosca <erosca@de.adit-jv.com>
      Fixes: c25da477 ("PCI: rcar: Add Renesas R-Car PCIe driver")
      Fixes: be20bbcb0a8c ("PCI: rcar: Add the initialization of PCIe link in resume_noirq()")
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Cc: <stable@vger.kernel.org> # v5.2+
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2de11b2e
    • M
      leds: trigger: netdev: fix handling on interface rename · f1fd9d0b
      Martin Schiller committed
      [ Upstream commit 5f820ed52371b4f5d8c43c93f03408d0dbc01e5b ]
      
      The NETDEV_CHANGENAME code is not "unneeded" like it is stated in commit
      4cb6560514fa ("leds: trigger: netdev: fix refcnt leak on interface
      rename").
      
      The event was accidentally misinterpreted as equivalent to
      NETDEV_UNREGISTER, but should be equivalent to NETDEV_REGISTER.
      
      This was the case in the original code from the openwrt project.
      
      Otherwise, you are unable to set netdev led triggers for (non-existent)
      netdevices which have to be renamed. This is the case, for example, for
      ppp interfaces in openwrt.
      
      Fixes: 06f502f5 ("leds: trigger: Introduce a NETDEV trigger")
      Fixes: 4cb6560514fa ("leds: trigger: netdev: fix refcnt leak on interface rename")
      Signed-off-by: Martin Schiller <ms@dev.tdt.de>
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f1fd9d0b
    • E
      net/mlx5e: Fix SFF 8472 eeprom length · 935f3980
      Eran Ben Elisha committed
      [ Upstream commit c431f8597863a91eea6024926e0c1b179cfa4852 ]
      
      SFF 8472 eeprom length is 512 bytes. Fix module info return value to
      support 512 bytes read.
      
      Fixes: ace329f4ab3b ("net/mlx5e: ethtool, Remove unsupported SFP EEPROM high pages query")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Reviewed-by: Aya Levin <ayal@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      935f3980
    • P
      sunrpc: fix crash when cache_head become valid before update · 67256225
      Pavel Tikhomirov committed
      [ Upstream commit 5fcaf6982d1167f1cd9b264704f6d1ef4c505d54 ]
      
      I was investigating a crash in our Virtuozzo7 kernel which happened
      in svcauth_unix_set_client. I found out that we access the m_client field
      in the ip_map structure, which was received from sunrpc_cache_lookup (we
      have a slightly older kernel; now the code is in sunrpc_cache_add_entry), and
      this field looks uninitialized (m_client == 0x74 doesn't look like a
      pointer), but in the cache_head flags we see 0x1, which is CACHE_VALID.
      
      It looks like the problem appeared from our previous fix to sunrpc (1):
      commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued
      request")
      
      And we've also found a patch already fixing our patch (2):
      commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
      
      Though the crash is eliminated, I think the core of the problem is not
      completely fixed:
      
      Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before
      cache_fresh_locked which was added in (1) to fix the crash. This way
      cache_is_valid won't say the cache is valid anymore and in
      svcauth_unix_set_client the function cache_check will return an error
      instead of 0, and we don't count the entry as initialized.
      
      But it looks like we need to remove cache_fresh_locked completely in
      sunrpc_cache_lookup:
      
      In (1) we only wanted to make cache_fresh_unlocked->cache_dequeue so
      that cache_requests with no readers also release the corresponding
      cache_head, to fix their leak.  Vasily and I were not sure if
      cache_fresh_locked and cache_fresh_unlocked should be used as a pair or
      not, so we guessed and used them as a pair.
      
      Now we see that we don't want the CACHE_VALID bit set here by
      cache_fresh_locked, as "valid" means "initialized" and there is no
      initialization in sunrpc_cache_add_entry. Both expiry_time and
      last_refresh are not used in cache_fresh_unlocked code-path and also not
      required for the initial fix.
      
      So to conclude, cache_fresh_locked was called by mistake, and we can just
      safely remove it instead of working around it with CACHE_NEGATIVE. That looks
      cleaner to me. I hope I am not missing something here.
      
      Here is our crash backtrace:
      [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
      [13108726.326365] IP: [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.326448] PGD 0
      [13108726.326468] Oops: 0002 [#1] SMP
      [13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4
      [13108726.327173]  dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat
      [13108726.327817]  ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1]
      [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G        W  O   ------------   3.10.0-862.20.2.vz7.73.29 #1 73.29
      [13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
      [13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000
      [13108726.328610] RIP: 0010:[<ffffffffc01f79eb>]  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.328706] RSP: 0018:ffffa0c2a74bfd80  EFLAGS: 00010246
      [13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000
      [13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0
      [13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001
      [13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000
      [13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4
      [13108726.329022] FS:  0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000
      [13108726.329081] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0
      [13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [13108726.338908] PKRU: 00000000
      [13108726.341047] Call Trace:
      [13108726.343074]  [<ffffffff8a2c78b4>] ? groups_alloc+0x34/0x110
      [13108726.344837]  [<ffffffffc01f5eb4>] svc_set_client+0x24/0x30 [sunrpc]
      [13108726.346631]  [<ffffffffc01f2ac1>] svc_process_common+0x241/0x710 [sunrpc]
      [13108726.348332]  [<ffffffffc01f3093>] svc_process+0x103/0x190 [sunrpc]
      [13108726.350016]  [<ffffffffc07d605f>] nfsd+0xdf/0x150 [nfsd]
      [13108726.351735]  [<ffffffffc07d5f80>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [13108726.353459]  [<ffffffff8a2bf741>] kthread+0xd1/0xe0
      [13108726.355195]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.356896]  [<ffffffff8a9556dd>] ret_from_fork_nospec_begin+0x7/0x21
      [13108726.358577]  [<ffffffff8a2bf670>] ? create_kthread+0x60/0x60
      [13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 <f0> 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38
      [13108726.363769] RIP  [<ffffffffc01f79eb>] svcauth_unix_set_client+0x2ab/0x520 [sunrpc]
      [13108726.365530]  RSP <ffffa0c2a74bfd80>
      [13108726.367179] CR2: 0000000000000074
      
      Fixes: d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.")
      Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Acked-by: NeilBrown <neilb@suse.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      67256225
    • W
      firmware: arm_scmi: Avoid double free in error flow · 372098d5
      Wen Yang committed
      [ Upstream commit 8305e90a894f82c278c17e51a28459deee78b263 ]
      
      If device_register() fails, both put_device() and kfree() are called,
      ending with a double free of the scmi_dev.
      
      Calling kfree() is needed only when a failure happens between the
      allocation of the scmi_dev and its registration, so move it there
      and remove it from the error flow.
      
      Fixes: 46edb8d1322c ("firmware: arm_scmi: provide the mandatory device release callback")
      Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      372098d5
    • C
      gre: refetch erspan header from skb->data after pskb_may_pull() · f7654ebe
      Cong Wang committed
      [ Upstream commit 0e4940928c26527ce8f97237fef4c8a91cd34207 ]
      
      After pskb_may_pull() we should always refetch the header
      pointers from the skb->data in case it got reallocated.
      
      In gre_parse_header(), the erspan header is still fetched
      from the 'options' pointer which is fetched before
      pskb_may_pull().
      
      Found this during code review of a KMSAN bug report.
      
      Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup")
      Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Acked-by: William Tu <u9012063@gmail.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f7654ebe
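The rule stated in the gre entry above (refetch header pointers after a call that may reallocate the buffer) can be illustrated with a userspace analogue. This is a sketch only: `struct buf`, `buf_may_pull()` and `header_after_pull()` are hypothetical stand-ins built on `realloc()`, not the kernel's skb API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a heap buffer that may move when grown. */
struct buf {
	unsigned char *data;
	size_t len;
};

/*
 * Hypothetical analogue of pskb_may_pull(): make sure at least `need` bytes
 * are available, possibly reallocating (and therefore moving) b->data.
 * Returns 1 on success, 0 on allocation failure.
 */
static int buf_may_pull(struct buf *b, size_t need)
{
	unsigned char *p;

	if (need <= b->len)
		return 1;
	p = realloc(b->data, need);
	if (!p)
		return 0;
	memset(p + b->len, 0, need - b->len); /* zero the newly added tail */
	b->data = p;
	b->len = need;
	return 1;
}

/*
 * Return a pointer to the header at offset `off`. The pointer is computed
 * from b->data only after the pull, so it is valid even if the buffer moved.
 */
static unsigned char *header_after_pull(struct buf *b, size_t off, size_t hdr_len)
{
	if (!buf_may_pull(b, off + hdr_len))
		return NULL;
	return b->data + off; /* recomputed from the (possibly new) base */
}
```

A pointer saved before `buf_may_pull()` may dangle once the buffer has moved, which is the class of bug the commit message describes for the 'options' pointer.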
    • A
      perf callchain: Fix segfault in thread__resolve_callchain_sample() · 4f579272
      Adrian Hunter committed
      [ Upstream commit aceb98261ea7d9fe38f9c140c5531f0b13623832 ]
      
      Do not dereference 'chain' when it is NULL.
      
        $ perf record -e intel_pt//u -e branch-misses:u uname
        $ perf report --itrace=l --branch-history
        perf: Segmentation fault
      
      Fixes: e9024d519d89 ("perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}")
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lore.kernel.org/lkml/20191114142538.4097-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4f579272
    • T
      workqueue: Fix missing kfree(rescuer) in destroy_workqueue() · 1b83d575
      Tejun Heo committed
      commit 8efe1223d73c218ce7e8b2e0e9aadb974b582d7f upstream.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Qian Cai <cai@lca.pw>
      Fixes: def98c84b6cd ("workqueue: Fix spurious sanity check failures in destroy_workqueue()")
      Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1b83d575
    • M
      blk-mq: make sure that line break can be printed · d88fb4f0
      Ming Lei committed
      commit d2c9be89f8ebe7ebcc97676ac40f8dec1cf9b43a upstream.
      
      8962842ca5ab ("blk-mq: avoid sysfs buffer overflow with too many CPU cores")
      avoids sysfs buffer overflow, and reserves one character for line break.
      However, the last snprintf() doesn't get the correct 'size' parameter passed
      in, so fix it.
      
      Fixes: 8962842ca5ab ("blk-mq: avoid sysfs buffer overflow with too many CPU cores")
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d88fb4f0
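The pattern behind the blk-mq entry above (hold back one byte for the line break while formatting, then give the final snprintf() the true remaining size) can be sketched as follows. This is a standalone illustration, not the kernel's code: `format_ids()` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format ids as "a, b, c\n" into buf (size bytes). Entries that do not fit
 * are dropped, but room for the trailing newline is always held back.
 * Returns the number of characters written, excluding the terminating NUL.
 */
static size_t format_ids(char *buf, size_t size, const unsigned int *ids, size_t n)
{
	size_t pos = 0, limit, i;
	int ret;

	if (size < 2) {
		if (size)
			buf[0] = '\0';
		return 0; /* no room for "\n" plus the terminating NUL */
	}
	limit = size - 1; /* hold back one byte for the line break */

	for (i = 0; i < n; i++) {
		ret = snprintf(buf + pos, limit - pos, "%s%u", i ? ", " : "", ids[i]);
		if (ret < 0 || (size_t)ret >= limit - pos)
			break; /* entry was truncated: drop it and stop */
		pos += (size_t)ret;
	}

	/* The final call gets the full remaining size, so "\n" always fits. */
	ret = snprintf(buf + pos, size - pos, "\n");
	return ret < 0 ? pos : pos + (size_t)ret;
}
```

If the last call were also given `limit - pos`, a buffer filled right up to the reserved byte would leave snprintf() room for the NUL only, and the line break would be lost.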
    • H
      s390/smp,vdso: fix ASCE handling · d7248f5a
      Heiko Carstens committed
      [ Upstream commit a2308c11ecbc3471ebb7435ee8075815b1502ef0 ]
      
      When a secondary CPU is brought up it must initialize its control
      registers. CPU A which triggers that a secondary CPU B is brought up
      stores its control register contents into the lowcore of new CPU B,
      which then loads these values on startup.
      
      This is problematic in various ways: the control register which
      contains the home space ASCE will correctly contain the kernel ASCE;
      however control registers for primary and secondary ASCEs are
      initialized with whatever values were present in CPU A.
      
      Typically:
      - the primary ASCE will contain the user process ASCE of the process
        that triggered onlining of CPU B.
      - the secondary ASCE will contain the percpu VDSO ASCE of CPU A.
      
      Due to lazy ASCE handling we may also end up with other combinations.
      
      When CPU B then switches to a different process (!= idle) it will
      fix up the primary ASCE. However the problem is that the (wrong) ASCE
      from CPU A was loaded into control register 1: as soon as an ASCE is
      attached (aka loaded) a CPU is free to generate TLB entries using that
      address space.
      Even though it is very unlikely that CPU B will actually generate such
      entries, this could result in TLB entries of the address space of the
      process that ran on CPU A. These entries shouldn't exist at all and
      could cause problems later on.
      
      Furthermore the secondary ASCE of CPU B will not be updated correctly.
      This means that processes may see wrong results or even crash if they
      access VDSO data on CPU B. The correct VDSO ASCE will eventually be
      loaded on return to user space as soon as the kernel executed a call
      to strnlen_user or an atomic futex operation on CPU B.
      
      Fix both issues by initializing the to-be-loaded control register
      contents with the correct ASCEs and also enforce (re-)loading of the
      ASCEs upon first context switch and return to user space.
      
      Fixes: 0aaba41b ("s390: remove all code using the access register mode")
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d7248f5a
    • M
      mm, thp, proc: report THP eligibility for each vma · c76adee3
      Michal Hocko committed
      [ Upstream commit 7635d9cbe8327e131a1d3d8517dc186c2796ce2e ]
      
      Userspace falls short when trying to find out whether a specific memory
      range is eligible for THP.  There are usecases that would like to know
      that
      http://lkml.kernel.org/r/alpine.DEB.2.21.1809251248450.50347@chino.kir.corp.google.com
      : This is used to identify heap mappings that should be able to fault thp
      : but do not, and they normally point to a low-on-memory or fragmentation
      : issue.
      
      The only way to deduce this now is to query for the hg or nh flags and
      compare the state with the global setting.  Except that there is also
      PR_SET_THP_DISABLE that might change the picture.  So the final logic is
      not trivial.  Moreover the eligibility of the vma depends on the type of
      VMA as well.  In the past we have supported only anonymous memory VMAs
      but things have changed and shmem based vmas are supported as well these
      days and the query logic gets even more complicated because the
      eligibility depends on the mount option and another global configuration
      knob.
      
      Simplify the current state and report the THP eligibility in
      /proc/<pid>/smaps for each existing vma.  Reuse
      transparent_hugepage_enabled for this purpose.  The original
      implementation of this function assumes that the caller knows that the vma
      itself is supported for THP so make the core checks into
      __transparent_hugepage_enabled and use it for existing callers.
      __show_smap just uses the new transparent_hugepage_enabled which also
      checks the vma support status (please note that this one has to be out of
      line due to include dependency issues).
      
      [mhocko@kernel.org: fix oops with NULL ->f_mapping]
        Link: http://lkml.kernel.org/r/20181224185106.GC16738@dhcp22.suse.cz
      Link: http://lkml.kernel.org/r/20181211143641.3503-3-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Paul Oppenheimer <bepvte@gmail.com>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c76adee3
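A userspace consumer of the per-vma report described in the THP entry above might look like the sketch below. It assumes the field appears in /proc/<pid>/smaps as a line of the form `THPeligible: <0|1>`; that field name is not spelled out in the commit message above, so treat it as an assumption. The helper names are hypothetical.

```c
#include <stdio.h>

/*
 * Parse one smaps line. Returns 1 and stores the flag in *eligible if the
 * line is a "THPeligible:" line (assumed field name), 0 otherwise.
 */
static int parse_thp_eligible(const char *line, int *eligible)
{
	return sscanf(line, "THPeligible: %d", eligible) == 1;
}

/* Count the VMAs reported as THP-eligible in an smaps-style stream. */
static int count_thp_eligible(FILE *fp)
{
	char line[256];
	int flag, count = 0;

	while (fgets(line, sizeof(line), fp))
		if (parse_thp_eligible(line, &flag) && flag)
			count++;
	return count;
}
```

A caller would open `/proc/self/smaps` (or another pid's file) with `fopen()` and pass the stream to `count_thp_eligible()`.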
    • D
      mfd: rk808: Fix RK818 ID template · 8599f823
      Daniel Schultz committed
      [ Upstream commit 37ef8c2c15bdc1322b160e38986c187de2b877b2 ]
      
      The Rockchip PMIC driver can automatically detect connected component
      versions by reading the ID_MSB and ID_LSB registers. The probe function
      will always fail with RK818 PMICs because the ID_MSK is 0xFFF0 and the
      RK818 template ID is 0x8181.
      
      This patch changes this value to 0x8180.
      
      Fixes: 9d6105e1 ("mfd: rk808: Fix up the chip id get failed")
      Cc: stable@vger.kernel.org
      Cc: Elaine Zhang <zhangqing@rock-chips.com>
      Cc: Joseph Chen <chenjh@rock-chips.com>
      Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8599f823
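The arithmetic behind the rk808 entry above is easy to check in isolation. The sketch assumes the probe masks the ID read from the chip with ID_MSK and compares the result with the template; the helper name and signature are hypothetical and this is not the driver's code.

```c
#include <stdint.h>

#define ID_MSK 0xFFF0u /* mask value quoted in the commit message */

/* Return 1 if the masked chip ID equals the template ID, 0 otherwise. */
static int id_matches(uint16_t chip_id, uint16_t template_id)
{
	return (chip_id & ID_MSK) == template_id;
}
```

Because the mask clears the low nibble, a template of 0x8181 can never equal a masked value, whereas 0x8180 matches any ID from 0x8180 to 0x818f.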
    • Y
      ext4: fix a bug in ext4_wait_for_tail_page_commit · b1ec93dd
      yangerkun committed
      commit 565333a1554d704789e74205989305c811fd9c7a upstream.
      
      No need to wait for any commit once the page is fully truncated.
      Besides, it may confuse e.g. a concurrent ext4_writepage(), which sees the
      page still dirty (the dirty bit is only cleared by truncate_pagecache() in
      ext4_setattr()) although its buffers have already been freed, and this
      triggers the bug shown below:
      
      [   26.057508] ------------[ cut here ]------------
      [   26.058531] kernel BUG at fs/ext4/inode.c:2134!
      ...
      [   26.088130] Call trace:
      [   26.088695]  ext4_writepage+0x914/0xb28
      [   26.089541]  writeout.isra.4+0x1b4/0x2b8
      [   26.090409]  move_to_new_page+0x3b0/0x568
      [   26.091338]  __unmap_and_move+0x648/0x988
      [   26.092241]  unmap_and_move+0x48c/0xbb8
      [   26.093096]  migrate_pages+0x220/0xb28
      [   26.093945]  kernel_mbind+0x828/0xa18
      [   26.094791]  __arm64_sys_mbind+0xc8/0x138
      [   26.095716]  el0_svc_common+0x190/0x490
      [   26.096571]  el0_svc_handler+0x60/0xd0
      [   26.097423]  el0_svc+0x8/0xc
      
      Run two instances of the following reproducer (generated by syzkaller)
      in parallel on an ext3 filesystem.
      
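      #define _GNU_SOURCE		/* for O_NOATIME */
      #include <assert.h>
      #include <fcntl.h>
      #include <numaif.h>		/* mbind(), MPOL_MF_MOVE; link with -lnuma */
      #include <string.h>
      #include <sys/ioctl.h>
      #include <sys/mman.h>
      #include <unistd.h>

      int main(void)
      {
      	int fd, fd1, ret;
      	void *addr;
      	size_t length = 4096;
      	int flags;
      	off_t offset = 0;
      	char *str = "12345";

      	/* O_CREAT requires an explicit mode argument */
      	fd = open("a", O_RDWR | O_CREAT, 0644);
      	assert(fd >= 0);

      	/* Truncate to 4k */
      	ret = ftruncate(fd, length);
      	assert(ret == 0);

      	/* Journal data mode; _IOW('f', 2, long) is FS_IOC_SETFLAGS */
      	flags = 0xc00f;
      	ret = ioctl(fd, _IOW('f', 2, long), &flags);
      	assert(ret == 0);

      	/* Truncate to 0 */
      	fd1 = open("a", O_TRUNC | O_NOATIME);
      	assert(fd1 >= 0);

      	addr = mmap(NULL, length, PROT_WRITE | PROT_READ,
      					MAP_SHARED, fd, offset);
      	assert(addr != MAP_FAILED);

      	/* Dirty the page, then ask the kernel to migrate it */
      	memcpy(addr, str, 5);
      	mbind(addr, length, 0, 0, 0, MPOL_MF_MOVE);
      	return 0;
      }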
      
      The bug is triggered when the two instances interleave in the
      following order:
      
      reproduce1                         reproduce2
      
      ...                            |   ...
      truncate to 4k                 |
      change to journal data mode    |
                                     |   memcpy(set page dirty)
      truncate to 0:                 |
      ext4_setattr:                  |
      ...                            |
      ext4_wait_for_tail_page_commit |
                                     |   mbind(trigger bug)
      truncate_pagecache(clean dirty)|   ...
      ...                            |
      
      mbind() calls ext4_writepage() because the page is still dirty, and the
      bug is reported because the buffers have already been freed. Fix it by
      returning directly when offset equals 0, which means the page has been
      fully truncated.
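
      The early-return condition can be modelled in userspace as follows.
      This is only a sketch under stated assumptions, not the kernel
      function: it assumes "offset" is the new file size's offset within a
      page, uses a fixed 4 KiB page size, and the function and macro names
      are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Hypothetical model of the check: returns false when a truncate to
 * new_size leaves no partial tail page (nothing to wait for), and true
 * when a partial tail page remains and a wait may still be needed.
 */
static bool tail_page_may_need_wait(uint64_t new_size)
{
	uint64_t offset = new_size & (SKETCH_PAGE_SIZE - 1);

	/* offset == 0: the page is fully truncated, so return early. */
	if (offset == 0)
		return false;

	return true;
}
```

      In the reproducer the file is truncated to 0, so offset is 0 and the
      modelled function returns without waiting.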

      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Link: https://lore.kernel.org/r/20190919063508.1045-1-yangerkun@huawei.com
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b1ec93dd
    • D
      splice: only read in as much information as there is pipe buffer space · 326ba910
      Darrick J. Wong committed
      commit 3253d9d093376d62b4a56e609f15d2ec5085ac73 upstream.
      
      Andreas Grünbacher reports that on the two filesystems that support
      iomap directio, it's possible for splice() to return -EAGAIN (instead of
      a short splice) if the pipe being written to has less space available in
      its pipe buffers than the length supplied by the calling process.
      
      Months ago we fixed splice_direct_to_actor to clamp the length of the
      read request to the size of the splice pipe.  Do the same to do_splice.
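
      The clamping described above amounts to taking the minimum of the
      requested length and the free space in the pipe. The sketch below is a
      userspace illustration with hypothetical names, assuming one 4 KiB
      page per pipe buffer and used_bufs <= total_bufs; it is not the
      kernel code.

```c
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/*
 * Hypothetical helper: limit a requested splice length to the number of
 * bytes that fit in the pipe's free buffers (one page per buffer assumed).
 * The caller must ensure used_bufs <= total_bufs.
 */
static size_t clamp_to_pipe_space(size_t len, unsigned int total_bufs,
				  unsigned int used_bufs)
{
	size_t free_bytes = (size_t)(total_bufs - used_bufs) << SKETCH_PAGE_SHIFT;

	return len < free_bytes ? len : free_bytes;
}
```

      With the length clamped this way, a request larger than the free pipe
      space becomes a short splice rather than an -EAGAIN error.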
      
      Fixes: 17614445576b6 ("splice: don't read more than available pipe space")
      Reported-by: syzbot+3c01db6025f26530cf8d@syzkaller.appspotmail.com
      Reported-by: Andreas Grünbacher <andreas.gruenbacher@gmail.com>
      Reviewed-by: Andreas Grünbacher <andreas.gruenbacher@gmail.com>
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      326ba910
    • A
      rtc: disable uie before setting time and enable after · 42a929ed
      Alexandre Belloni committed
      commit 7e7c005b4b1f1f169bcc4b2c3a40085ecc663df2 upstream.
      
      When setting the time in the future with the uie timer enabled,
      rtc_timer_do_work will loop for a while because the expiration of the uie
      timer was way before the current RTC time and a new timer will be enqueued
      until the current rtc time is reached.
      
      If the uie timer is enabled, disable it before setting the time and enable
      it after expiring current timers (which may actually be an alarm).
      
      This is the safest thing to do to ensure the uie timer is still
      synchronized with the RTC, especially in the UIE emulation case.
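
      To see why the loop can run for a long time, consider a simplified
      model of a periodic timer that is re-enqueued one period later each
      time it expires. This is a sketch with a hypothetical function name,
      assuming a one-second period for the uie timer; it does not reproduce
      rtc_timer_do_work itself.

```c
#include <stdint.h>

/*
 * Hypothetical model: count how many times a periodic timer has to be
 * re-enqueued before its expiry lies after 'now'. 'period' must be > 0,
 * otherwise the loop never terminates.
 */
static uint64_t catch_up_iterations(int64_t expires, int64_t now,
				    int64_t period)
{
	uint64_t n = 0;

	while (expires <= now) {
		expires += period;	/* re-enqueue one period later */
		n++;
	}
	return n;
}
```

      In this model, moving the clock one hour ahead while the timer is
      still armed for the old time costs 3600 iterations, whereas a timer
      that is disabled first and re-armed relative to the new time needs
      none.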
      
      Reported-by: syzbot+08116743f8ad6f9a6de7@syzkaller.appspotmail.com
      Fixes: 6610e089 ("RTC: Rework RTC code to use timerqueue for events")
      Link: https://lore.kernel.org/r/20191020231320.8191-1-alexandre.belloni@bootlin.com
      Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      42a929ed