1. 08 Oct, 2018 13 commits
  2. 06 Oct, 2018 3 commits
    • M
      net: mvpp2: Extract the correct ethtype from the skb for tx csum offload · 35f3625c
      Maxime Chevallier committed
      When offloading the L3 and L4 csum computation on TX, we need to extract
      the l3_proto from the ethtype, independently of the presence of a vlan
      tag.
      
      The current driver uses skb->protocol as-is, resulting in packets with
      the wrong L4 checksum being sent when there's a vlan tag in the packet
      header and checksum offloading is enabled.
      
      This commit makes use of vlan_get_protocol() to get the correct ethtype
      regardless of the presence of a vlan tag.
      
      Fixes: 3f518509 ("ethernet: Add new driver for Marvell Armada 375 network unit")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      35f3625c
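The idea behind this fix can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver's code: it walks a raw Ethernet frame and skips any VLAN tags so that the returned ethertype is the L3 protocol rather than the tag's TPID. The function name `frame_l3_ethertype()` and the helper `read_be16()` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP     0x0800  /* IPv4 */
#define ETH_P_8021Q  0x8100  /* 802.1Q VLAN tag (TPID) */
#define ETH_P_8021AD 0x88A8  /* 802.1ad service tag (TPID) */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: return the L3 ethertype of an Ethernet frame,
 * skipping any number of VLAN tags. Returns 0 if the frame is too short.
 */
static uint16_t frame_l3_ethertype(const uint8_t *frame, size_t len)
{
    size_t off = 12;  /* skip destination MAC (6) + source MAC (6) */

    assert(frame != NULL);
    while (off + 2 <= len) {
        uint16_t type = read_be16(frame + off);

        if (type != ETH_P_8021Q && type != ETH_P_8021AD)
            return type;  /* first non-VLAN ethertype is the L3 protocol */
        off += 4;         /* skip TPID (2) + TCI (2) */
    }
    return 0;
}
```

Reading the two bytes at offset 12 directly, which is the moral equivalent of using skb->protocol as-is, would yield 0x8100 for a tagged frame and select the wrong checksum path.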
    • W
      yam: fix a missing-check bug · 0781168e
      Wenwen Wang committed
      In yam_ioctl(), the concrete ioctl command is firstly copied from the
      user-space buffer 'ifr->ifr_data' to 'ioctl_cmd' and checked through the
      following switch statement. If the command is not as expected, an error
      code EINVAL is returned. In the following execution the buffer
      'ifr->ifr_data' is copied again in the cases of the switch statement to
      specific data structures according to what kind of ioctl command is
      requested. However, after the second copy, no re-check is enforced on the
      newly-copied command. Given that the buffer 'ifr->ifr_data' is in the user
      space, a malicious user can race to change the command between the two
      copies. This way, the attacker can inject inconsistent data and cause
      undefined behavior.
      
      This patch adds a re-check in each case of the switch statement if there is
      a second copy in that case, to re-check whether the command obtained in the
      second copy is the same as the one in the first copy. If not, an error code
      EINVAL will be returned.
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0781168e
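The double-fetch pattern described in this commit can be sketched in user space. This is only an illustration under stated assumptions: `fetch_from_user()` stands in for copy_from_user(), `struct cfg_req` and `CMD_SET_CFG` are hypothetical, and the `racer` callback simulates a user thread modifying the buffer between the two copies. None of these names come from the yam driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CMD_SET_CFG 1  /* hypothetical ioctl sub-command */

struct cfg_req {
    int cmd;
    int value;
};

/* User-space stand-in for copy_from_user(); always succeeds here. */
static void fetch_from_user(void *dst, const void *user_src, size_t n)
{
    memcpy(dst, user_src, n);
}

/* Simulates a racing user thread that rewrites the command word. */
static void flip_command(struct cfg_req *user_buf)
{
    user_buf->cmd = 99;
}

/*
 * Handle a request whose command word is fetched twice. The racer
 * callback (may be NULL) runs between the two fetches.
 */
static int handle_ioctl(struct cfg_req *user_buf,
                        void (*racer)(struct cfg_req *), int *out_value)
{
    int cmd;
    struct cfg_req req;

    assert(user_buf != NULL && out_value != NULL);

    /* First fetch: only the command word, used for dispatch. */
    fetch_from_user(&cmd, &user_buf->cmd, sizeof(cmd));
    if (cmd != CMD_SET_CFG)
        return -EINVAL;

    if (racer)
        racer(user_buf);  /* simulated concurrent user-space write */

    /* Second fetch: the whole request, including the command again. */
    fetch_from_user(&req, user_buf, sizeof(req));

    /* Re-check: the command must match the one validated earlier. */
    if (req.cmd != cmd)
        return -EINVAL;

    *out_value = req.value;
    return 0;
}
```

Calling `handle_ioctl(&req, flip_command, &out)` shows the re-check rejecting a request whose command changed after validation, while passing NULL as the racer lets the request through.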
    • W
      net: cxgb3_main: fix a missing-check bug · 2c05d888
      Wenwen Wang committed
      In cxgb_extension_ioctl(), the command of the ioctl is firstly copied from
      the user-space buffer 'useraddr' to 'cmd' and checked through the
      switch statement. If the command is not as expected, an error code
      EOPNOTSUPP is returned. In the following execution, i.e., the cases of the
      switch statement, the whole buffer of 'useraddr' is copied again to a
      specific data structure, according to what kind of command is requested.
      However, after the second copy, there is no re-check on the newly-copied
      command. Given that the buffer 'useraddr' is in the user space, a malicious
      user can race to change the command between the two copies. By doing so,
      the attacker can supply malicious data to the kernel and cause undefined
      behavior.
      
      This patch adds a re-check in each case of the switch statement if there is
      a second copy in that case, to re-check whether the command obtained in the
      second copy is the same as the one in the first copy. If not, an error code
      EINVAL is returned.
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c05d888
  3. 05 Oct, 2018 16 commits
    • J
      i2c: designware: Call i2c_dw_clk_rate() only when calculating timings · 9ce7610e
      Jarkko Nikula committed
      There are platforms which don't provide input clock rate but provide
      I2C timing parameters. Commit 3bd4f277 ("i2c: designware: Call
      i2c_dw_clk_rate() only once in i2c_dw_init_master()") causes needless
      warning during probe on those platforms since i2c_dw_clk_rate(), which
      causes the warning when input clock is unknown, is called even when
      there is no need to calculate timing parameters.
      
      Fixes: 3bd4f277 ("i2c: designware: Call i2c_dw_clk_rate() only once in i2c_dw_init_master()")
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: <stable@vger.kernel.org> # 4.19
      Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      9ce7610e
    • S
      iommu/amd: Clear memory encryption mask from physical address · b3e9b515
      Singh, Brijesh committed
      Boris Ostrovsky reported a memory leak with device passthrough when SME
      is active.
      
      The VFIO driver uses iommu_iova_to_phys() to get the physical address for
      an iova. This physical address is later passed into vfio_unmap_unpin() to
      unpin the memory. The vfio_unmap_unpin() uses pfn_valid() before unpinning
      the memory. The pfn_valid() check was failing because encryption mask was
      part of the physical address returned. This resulted in the memory not
      being unpinned and therefore leaked after the guest terminates.
      
      The memory encryption mask must be cleared from the physical address in
      iommu_iova_to_phys().
      
      Fixes: 2543a786 ("iommu/amd: Allow the AMD IOMMU to work with memory encryption")
      Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: <iommu@lists.linux-foundation.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: <stable@vger.kernel.org> # 4.14+
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      b3e9b515
    • B
      net: phy: phylink: fix SFP interface autodetection · 7e418375
      Baruch Siach committed
      When connecting SFP PHY to phylink use the detected interface.
      Otherwise, the link fails to come up when the configured 'phy-mode'
      differs from the SFP detected mode.
      
      Move most of phylink_connect_phy() into __phylink_connect_phy(), and
      leave phylink_connect_phy() as a wrapper. phylink_sfp_connect_phy() can
      now pass the SFP detected PHY interface to __phylink_connect_phy().
      
      This fixes 1GB SFP module link up on eth3 of the Macchiatobin board that
      is configured in the DT to "2500base-x" phy-mode.
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Baruch Siach <baruch@tkos.co.il>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e418375
    • D
      be2net: don't flip hw_features when VXLANs are added/deleted · 2d52527e
      Davide Caratti committed
      The be2net implementation of .ndo_tunnel_{add,del}() changes the value of
      NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
      to call netdev_features_change(). Moreover, ethtool setting for that bit
      can potentially be reverted after a tunnel is added or removed.
      
      GSO already does software segmentation when 'hw_enc_features' is 0, even
      if VXLAN offload is turned on. In addition, commit 096de2f8 ("benet:
      stricter vxlan offloading check in be_features_check") avoids hardware
      segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
      destination port. So, it's safe to avoid flipping the above feature on
      addition/deletion of VXLAN tunnels.
      
      Fixes: 630f4b70 ("be2net: Export tunnel offloads only when a VxLAN tunnel is created")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d52527e
    • F
      net: dsa: b53: Keep CPU port as tagged in all VLANs · ca893194
      Florian Fainelli committed
      Commit c499696e ("net: dsa: b53: Stop using dev->cpu_port
      incorrectly") was a bit too trigger happy in removing the CPU port from
      the VLAN membership because we rely on DSA to program the CPU port VLAN,
      which it does, except it does not bother itself with tagged/untagged and
      just uses untagged.
      
      Having the CPU port "follow" the user ports tagged/untagged is not great
      and does not allow for properly differentiating, so keep the CPU port
      tagged in all VLANs.
      Reported-by: Gerhard Wiesinger <lists@wiesinger.com>
      Fixes: c499696e ("net: dsa: b53: Stop using dev->cpu_port incorrectly")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ca893194
    • V
      bnxt_en: get the reduced max_irqs by the ones used by RDMA · c78fe058
      Vasundhara Volam committed
      When getting the max rings supported, get the reduced max_irqs
      by the ones used by RDMA.
      
      If the number of MSIX vectors is the limiting factor, this bug may cause
      the max ring count to be higher than it should be when the RDMA driver
      is loaded and may result in ring allocation failures.
      
      Fixes: 30f52947 ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c78fe058
    • V
      bnxt_en: free hwrm resources, if driver probe fails. · a2bf74f4
      Venkat Duvvuru committed
      When the driver probe fails, all the resources that were allocated prior
      to the failure must be freed. However, hwrm dma response memory is not
      getting freed.
      
      This patch fixes the problem described above.
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2bf74f4
    • V
      bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request · 5db0e096
      Vasundhara Volam committed
      In the HWRM_QUEUE_COS2BW_CFG request, the enables field should have bits
      set only for the queue ids that have valid parameters.
      
      Setting other bits causes the firmware to return an error when the TC to
      hardware CoS queue mapping is not 1:1 during DCBNL ETS setup.
      
      Fixes: 2e8ef77e ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
      Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5db0e096
    • M
      bnxt_en: Fix VNIC reservations on the PF. · dbe80d44
      Michael Chan committed
      The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
      firmware call to reserve VNICs.  This has the effect that the firmware
      will keep a large number of VNICs for the PF, leaving very few for
      VFs.  A DPDK driver running on the VFs, which requires more VNICs, may
      not work properly as a result.
      
      Fixes: 674f50a5 ("bnxt_en: Implement new method to reserve rings.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dbe80d44
    • I
      team: Forbid enslaving team device to itself · 471b83bd
      Ido Schimmel committed
      team's ndo_add_slave() acquires 'team->lock' and later tries to open the
      newly enslaved device via dev_open(). This emits a 'NETDEV_UP' event
      that causes the VLAN driver to add VLAN 0 on the team device. team's
      ndo_vlan_rx_add_vid() will also try to acquire 'team->lock' and
      deadlock.
      
      Fix this by checking early at the enslavement function that a team
      device is not being enslaved to itself.
      
      A similar check was added to the bond driver in commit 09a89c21
      ("bonding: disallow enslaving a bond to itself").
      
      WARNING: possible recursive locking detected
      4.18.0-rc7+ #176 Not tainted
      --------------------------------------------
      syz-executor4/6391 is trying to acquire lock:
      (____ptrval____) (&team->lock){+.+.}, at: team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868
      
      but task is already holding lock:
      (____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&team->lock);
        lock(&team->lock);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by syz-executor4/6391:
       #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
       #0: (____ptrval____) (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4662
       #1: (____ptrval____) (&team->lock){+.+.}, at: team_add_slave+0xdb/0x1c30 drivers/net/team/team.c:1947
      
      stack backtrace:
      CPU: 1 PID: 6391 Comm: syz-executor4 Not tainted 4.18.0-rc7+ #176
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       print_deadlock_bug kernel/locking/lockdep.c:1765 [inline]
       check_deadlock kernel/locking/lockdep.c:1809 [inline]
       validate_chain kernel/locking/lockdep.c:2405 [inline]
       __lock_acquire.cold.64+0x1fb/0x486 kernel/locking/lockdep.c:3435
       lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
       __mutex_lock_common kernel/locking/mutex.c:757 [inline]
       __mutex_lock+0x176/0x1820 kernel/locking/mutex.c:894
       mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:909
       team_vlan_rx_add_vid+0x3b/0x1e0 drivers/net/team/team.c:1868
       vlan_add_rx_filter_info+0x14a/0x1d0 net/8021q/vlan_core.c:210
       __vlan_vid_add net/8021q/vlan_core.c:278 [inline]
       vlan_vid_add+0x63e/0x9d0 net/8021q/vlan_core.c:308
       vlan_device_event.cold.12+0x2a/0x2f net/8021q/vlan.c:381
       notifier_call_chain+0x180/0x390 kernel/notifier.c:93
       __raw_notifier_call_chain kernel/notifier.c:394 [inline]
       raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
       call_netdevice_notifiers_info+0x3f/0x90 net/core/dev.c:1735
       call_netdevice_notifiers net/core/dev.c:1753 [inline]
       dev_open+0x173/0x1b0 net/core/dev.c:1433
       team_port_add drivers/net/team/team.c:1219 [inline]
       team_add_slave+0xa8b/0x1c30 drivers/net/team/team.c:1948
       do_set_master+0x1c9/0x220 net/core/rtnetlink.c:2248
       do_setlink+0xba4/0x3e10 net/core/rtnetlink.c:2382
       rtnl_setlink+0x2a9/0x400 net/core/rtnetlink.c:2636
       rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4665
       netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2455
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4683
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0xa18/0xfd0 net/netlink/af_netlink.c:1908
       sock_sendmsg_nosec net/socket.c:642 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:652
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2126
       __sys_sendmsg+0x11d/0x290 net/socket.c:2164
       __do_sys_sendmsg net/socket.c:2173 [inline]
       __se_sys_sendmsg net/socket.c:2171 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2171
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x456b29
      Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f9706bf8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f9706bf96d4 RCX: 0000000000456b29
      RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000004
      RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000004d3548 R14: 00000000004c8227 R15: 0000000000000000
      
      Fixes: 87002b03 ("net: introduce vlan_vid_[add/del] and use them instead of direct [add/kill]_vid ndo calls")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-and-tested-by: syzbot+bd051aba086537515cdb@syzkaller.appspotmail.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
      471b83bd
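The recursive-locking scenario in this commit can be modelled with a toy non-recursive lock. This is a sketch under loud assumptions: all `demo_*` names are hypothetical, the lock reports -EDEADLK instead of actually hanging, and `demo_open()` compresses the NETDEV_UP notifier chain into one direct call. The `self_check` flag toggles the early test that the fix adds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device with a single non-recursive lock. */
struct demo_dev {
    bool lock_held;
};

/* Take the lock; report -EDEADLK instead of hanging on re-acquisition. */
static int demo_lock(struct demo_dev *d)
{
    if (d->lock_held)
        return -EDEADLK;
    d->lock_held = true;
    return 0;
}

static void demo_unlock(struct demo_dev *d)
{
    d->lock_held = false;
}

/* Models the VLAN add callback, which needs the device's own lock. */
static int demo_vlan_add_vid(struct demo_dev *d)
{
    int err = demo_lock(d);

    if (err)
        return err;
    demo_unlock(d);
    return 0;
}

/* Models opening a device: the UP event adds VLAN 0 on that device. */
static int demo_open(struct demo_dev *d)
{
    return demo_vlan_add_vid(d);
}

/* Models the enslave path; self_check enables the early rejection. */
static int demo_add_slave(struct demo_dev *team, struct demo_dev *port,
                          bool self_check)
{
    int err;

    assert(team != NULL && port != NULL);
    if (self_check && team == port)
        return -EINVAL;  /* a device cannot be enslaved to itself */

    err = demo_lock(team);
    if (err)
        return err;
    err = demo_open(port);  /* re-enters the lock when port == team */
    demo_unlock(team);
    return err;
}
```

Without the check, enslaving a device to itself reaches `demo_open()` while the same lock is already held, which is the situation lockdep reports in the trace above.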
    • Y
      net/usb: cancel pending work when unbinding smsc75xx · f7b2a56e
      Yu Zhao committed
      Cancel pending work before freeing smsc75xx private data structure
      during unbinding. This fixes the following crash in the driver:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000050
      IP: mutex_lock+0x2b/0x3f
      <snipped>
      Workqueue: events smsc75xx_deferred_multicast_write [smsc75xx]
      task: ffff8caa83e85700 task.stack: ffff948b80518000
      RIP: 0010:mutex_lock+0x2b/0x3f
      <snipped>
      Call Trace:
       smsc75xx_deferred_multicast_write+0x40/0x1af [smsc75xx]
       process_one_work+0x18d/0x2fc
       worker_thread+0x1a2/0x269
       ? pr_cont_work+0x58/0x58
       kthread+0xfa/0x10a
       ? pr_cont_work+0x58/0x58
       ? rcu_read_unlock_sched_notrace+0x48/0x48
       ret_from_fork+0x22/0x40
      Signed-off-by: Yu Zhao <yuzhao@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f7b2a56e
    • M
      dm cache: fix resize crash if user doesn't reload cache table · 5d07384a
      Mike Snitzer committed
      A reload of the cache's DM table is needed during resize because
      otherwise a crash will occur when attempting to access smq policy
      entries associated with the portion of the cache that was recently
      extended.
      
      The reason is cache-size based data structures in the policy will not be
      resized, the only way to safely extend the cache is to allow for a
      proper cache policy initialization that occurs when the cache table is
      loaded.  For example the smq policy's space_init(), init_allocator(),
      calc_hotspot_params() must be sized based on the extended cache size.
      
      The fix for this is to disallow cache resizes of this pattern:
      1) suspend "cache" target's device
      2) resize the fast device used for the cache
      3) resume "cache" target's device
      
      Instead, the last step must be a full reload of the cache's DM table.
      
      Fixes: 66a63635 ("dm cache: add stochastic-multi-queue (smq) policy")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      5d07384a
    • J
      dm cache metadata: ignore hints array being too small during resize · 4561ffca
      Joe Thornber committed
      Commit fd2fa954 ("dm cache metadata: save in-core policy_hint_size to
      on-disk superblock") enabled previously written policy hints to be
      used after a cache is reactivated.  But in doing so the cache
      metadata's hint array was left exposed to out of bounds access because
      on resize the metadata's on-disk hint array wasn't ever extended.
      
      Fix this by ignoring that there are no on-disk hints associated with the
      newly added cache blocks.  An expanded on-disk hint array is later
      rewritten upon the next clean shutdown of the cache.
      
      Fixes: fd2fa954 ("dm cache metadata: save in-core policy_hint_size to on-disk superblock")
      Cc: stable@vger.kernel.org
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      4561ffca
    • R
      PM / core: Clear the direct_complete flag on errors · 69e445ab
      Rafael J. Wysocki committed
      If __device_suspend() runs asynchronously (in which case the device
      passed to it is in dpm_suspended_list at that point) and it returns
      early on an error or pending wakeup, and the power.direct_complete
      flag has been set for the device already, the subsequent
      device_resume() will be confused by that and it will call
      pm_runtime_enable() incorrectly, as runtime PM has not been
      disabled for the device by __device_suspend().
      
      To avoid that, clear power.direct_complete if __device_suspend()
      is not going to disable runtime PM for the device before returning.
      
      Fixes: aae4518b (PM / sleep: Mechanism to avoid resuming runtime-suspended devices unnecessarily)
      Reported-by: Al Cooper <alcooperx@gmail.com>
      Tested-by: Al Cooper <alcooperx@gmail.com>
      Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      69e445ab
    • I
      mlxsw: spectrum: Delete RIF when VLAN device is removed · c360867e
      Ido Schimmel committed
      In commit 602b74ed ("mlxsw: spectrum_switchdev: Do not leak RIFs
      when removing bridge") I handled the case where RIFs created for VLAN
      devices were not properly cleaned up when their real device (a bridge)
      was removed.
      
      However, I forgot to handle the case of the VLAN device itself being
      removed. Do so now when the VLAN device is being unlinked from its real
      device.
      
      Fixes: 99f44bb3 ("mlxsw: spectrum: Enable L3 interfaces on top of bridge devices")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reported-by: Artem Shvorin <art@qrator.net>
      Tested-by: Artem Shvorin <art@qrator.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c360867e
    • N
      mlxsw: pci: Derive event type from event queue number · f3c84a8e
      Nir Dotan committed
      Due to a hardware issue in Spectrum-2, the field event_type of the event
      queue element (EQE) has become reserved. It was used to distinguish between
      command interface completion events and completion events.
      
      Use queue number to determine event type, as command interface completion
      events are always received on EQ0 and mlxsw driver maps completion events
      to EQ1.
      
      Fixes: c3ab4354 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
      Signed-off-by: Nir Dotan <nird@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f3c84a8e
  4. 04 Oct, 2018 3 commits
    • F
      drm/amdkfd: Fix incorrect use of process->mm · 11b29c9e
      Felix Kuehling committed
      This mm_struct pointer should never be dereferenced. If running in
      a user thread, just use current->mm. If running in a kernel worker
      use get_task_mm to get a safe reference to the mm_struct.
      Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      11b29c9e
    • S
      drm/amd/display: Signal hw_done() after waiting for flip_done() · 987bf116
      Shirish S committed
      In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
      we signal hw_done().
      
      [Why]
      
      This is to temporarily address a paging error that occurs when a
      nonblocking commit contends with another commit, particularly in a
      mirrored display configuration where at least 2 CRTCs are updated.
      The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
      attempt to access the contents of new_crtc_state->commit.
      
      Here's the sequence for a mirrored 2 display setup (irrelevant steps
      left out for clarity):
      
      **THREAD 1**                        | **THREAD 2**
                                          |
      Initialize atomic state for flip    |
                                          |
      Queue worker                        |
                                         ...
      
                                          | Do work for flip
                                          |
                                          | Signal hw_done() on CRTC 1
                                          | Signal hw_done() on CRTC 2
                                          |
                                          | Wait for flip_done() on CRTC 1
      
                                      <---- **PREEMPTED BY THREAD 1**
      
      Initialize atomic state for cursor  |
      update (1)                          |
                                          |
      Do cursor update work on both CRTCs |
                                          |
      Clear atomic state (2)              |
      **DONE**                            |
                                         ...
                                          |
                                          | Wait for flip_done() on CRTC 2
                                          | *ERROR*
                                          |
      
      The issue starts with (1). When the atomic state is initialized, the
      current CRTC states are duplicated to be the new_crtc_states, and
      referenced to be the old_crtc_states. (The new_crtc_states are to be
      filled with update data.)
      
      Some things to note:
      
      * Due to the mirrored configuration, the cursor updates on both CRTCs.
      
      * At this point, the pflip IRQ has already been handled, and flip_done
        signaled on all CRTCs. The cursor commit can therefore continue.
      
      * The old_crtc_states used by the cursor update are the **same states**
        as the new_crtc_states used by the flip worker.
      
      At (2), the old_crtc_state is freed (*), and the cursor commit
      completes. We then context switch back to the flip worker, where we
      attempt to access the new_crtc_state->commit object. This is
      problematic, as this state has already been freed.
      
      (*) Technically, 'state->crtcs[i].state' is freed, which was made to
          reference old_crtc_state in drm_atomic_helper_swap_state()
      
      [How]
      
      By moving hw_done() after wait_for_flip_done(), we're guaranteed that
      the new_crtc_state (from the flip worker's perspective) still exists.
      This is because any other commit will be blocked, waiting for the
      hw_done() signal.
      
      Note that both the i915 and imx drivers have this sequence flipped
      already, masking this problem.
      Signed-off-by: Shirish S <shirish.s@amd.com>
      Signed-off-by: Leo Li <sunpeng.li@amd.com>
      Reviewed-by: Harry Wentland <harry.wentland@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      987bf116
    • S
      ixgbe: check return value of napi_complete_done() · 4233cfe6
      Song Liu committed
      The NIC driver should only enable interrupts when napi_complete_done()
      returns true. This patch adds the check for ixgbe.
      
      Cc: stable@vger.kernel.org # 4.10+
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4233cfe6
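The rule stated in this commit can be sketched as a simplified poll routine. This is a user-space model, not ixgbe code: `demo_napi_complete_done()` is a hypothetical stand-in whose return value imitates napi_complete_done(), and the `kept_by_core` flag is an assumption used to model the case where the networking core keeps polling and the driver must leave interrupts off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical NAPI context for the model. */
struct demo_napi {
    bool kept_by_core;  /* core keeps polling, e.g. due to busy polling */
    bool irq_enabled;   /* whether the device interrupt is unmasked */
};

/*
 * Stand-in for napi_complete_done(): returns false when polling was
 * not actually completed and the driver must not re-arm interrupts.
 */
static bool demo_napi_complete_done(struct demo_napi *n, int work_done)
{
    (void)work_done;
    return !n->kept_by_core;
}

/* Simplified poll routine following the rule from the commit message. */
static int demo_poll(struct demo_napi *n, int work_done, int budget)
{
    assert(n != NULL && budget > 0);

    /* Budget exhausted: stay in polling mode, interrupts remain off. */
    if (work_done >= budget)
        return budget;

    /* Re-enable interrupts only if completion really happened. */
    if (demo_napi_complete_done(n, work_done))
        n->irq_enabled = true;

    return work_done;
}
```

The key design point is that the interrupt is unmasked inside the `if`, so a false return leaves the device quiet while the core continues to poll.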
  5. 03 Oct, 2018 5 commits