1. 27 Nov, 2018 1 commit
  2. 04 Oct, 2018 1 commit
    • drm/amd/display: Signal hw_done() after waiting for flip_done() · 987bf116
      Committed by Shirish S
      In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
      we signal hw_done().
      
      [Why]
      
      This is to temporarily address a paging error that occurs when a
      nonblocking commit contends with another commit, particularly in a
      mirrored display configuration where at least 2 CRTCs are updated.
      The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
      attempt to access the contents of new_crtc_state->commit.
      
      Here's the sequence for a mirrored 2 display setup (irrelevant steps
      left out for clarity):
      
      **THREAD 1**                        | **THREAD 2**
                                          |
      Initialize atomic state for flip    |
                                          |
      Queue worker                        |
                                         ...
      
                                          | Do work for flip
                                          |
                                          | Signal hw_done() on CRTC 1
                                          | Signal hw_done() on CRTC 2
                                          |
                                          | Wait for flip_done() on CRTC 1
      
                                      <---- **PREEMPTED BY THREAD 1**
      
      Initialize atomic state for cursor  |
      update (1)                          |
                                          |
      Do cursor update work on both CRTCs |
                                          |
      Clear atomic state (2)              |
      **DONE**                            |
                                         ...
                                          |
                                          | Wait for flip_done() on CRTC 2
                                          | *ERROR*
                                          |
      
      The issue starts with (1). When the atomic state is initialized, the
      current CRTC states are duplicated to be the new_crtc_states, and
      referenced to be the old_crtc_states. (The new_crtc_states are to be
      filled with update data.)
      
      Some things to note:
      
      * Due to the mirrored configuration, the cursor updates on both CRTCs.
      
      * At this point, the pflip IRQ has already been handled, and flip_done
        signaled on all CRTCs. The cursor commit can therefore continue.
      
      * The old_crtc_states used by the cursor update are the **same states**
        as the new_crtc_states used by the flip worker.
      
      At (2), the old_crtc_state is freed (*), and the cursor commit
      completes. We then context switch back to the flip worker, where we
      attempt to access the new_crtc_state->commit object. This is
      problematic, as this state has already been freed.
      
      (*) Technically, 'state->crtcs[i].state' is freed, which was made to
          reference old_crtc_state in drm_atomic_helper_swap_state()
      
      [How]
      
      By moving hw_done() after wait_for_flip_done(), we're guaranteed that
      the new_crtc_state (from the flip worker's perspective) still exists.
      This is because any other commit will be blocked, waiting for the
      hw_done() signal.
      
      Note that both the i915 and imx drivers have this sequence flipped
      already, masking this problem.
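
The effect of the reordering can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the amdgpu code: `model_crtc_state`, `competitor_run()` and `model_commit_tail()` are hypothetical names, and the competing commit is assumed to get a chance to run after every wait and to free the CRTC states as soon as hw_done has been signaled on all CRTCs.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CRTCS 2

/* Hypothetical stand-in for a CRTC state as seen by the flip worker. */
struct model_crtc_state {
    bool hw_done; /* flip worker has signaled hw_done for this CRTC */
    bool freed;   /* a later commit has released this state */
};

/* Models the competing (cursor) commit: it stays blocked until hw_done
 * is signaled on every CRTC, then completes and frees the states. */
void competitor_run(struct model_crtc_state *s)
{
    for (int i = 0; i < NUM_CRTCS; i++)
        if (!s[i].hw_done)
            return; /* still blocked behind the flip worker */
    for (int i = 0; i < NUM_CRTCS; i++)
        s[i].freed = true;
}

/* Models the flip worker. Returns how many times it touched a state
 * that the competing commit had already freed. */
int model_commit_tail(bool hw_done_first)
{
    struct model_crtc_state s[NUM_CRTCS] = { { false, false } };
    int stale_accesses = 0;

    if (hw_done_first)
        for (int i = 0; i < NUM_CRTCS; i++)
            s[i].hw_done = true;

    /* Wait for flip_done on each CRTC; the wait reads the state. */
    for (int i = 0; i < NUM_CRTCS; i++) {
        if (s[i].freed)
            stale_accesses++;
        competitor_run(s); /* assumed preemption point after each wait */
    }

    if (!hw_done_first)
        for (int i = 0; i < NUM_CRTCS; i++)
            s[i].hw_done = true;

    assert(stale_accesses <= NUM_CRTCS);
    return stale_accesses;
}
```

In this model, `model_commit_tail(true)` corresponds to the old order (hw_done before the flip_done wait) and `model_commit_tail(false)` to the order introduced by this patch.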
      Signed-off-by: Shirish S <shirish.s@amd.com>
      Signed-off-by: Leo Li <sunpeng.li@amd.com>
      Reviewed-by: Harry Wentland <harry.wentland@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  3. 27 Sep, 2018 1 commit
  4. 22 Aug, 2018 1 commit
  5. 07 Aug, 2018 3 commits
  6. 21 Jul, 2018 2 commits
  7. 20 Jul, 2018 1 commit
  8. 14 Jul, 2018 4 commits
  9. 11 Jul, 2018 2 commits
  10. 06 Jul, 2018 4 commits
  11. 05 Jul, 2018 2 commits
  12. 28 Jun, 2018 1 commit
  13. 23 Jun, 2018 1 commit
    • drm/amdgpu: Count disabled CRTCs in commit tail earlier · fe2a1965
      Committed by Lyude Paul
      This fixes a regression I accidentally introduced that was picked up by
      KASAN, where we were checking the CRTC atomic states after DRM's helpers
      had already freed them. Example:
      
      ==================================================================
      BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
      Read of size 1 at addr ffff8803a697b071 by task kworker/u16:0/7
      
      CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G           O      4.18.0-rc1Lyude-Upstream+ #1
      Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.21 05/02/2018
      Workqueue: events_unbound commit_work [drm_kms_helper]
      Call Trace:
       dump_stack+0xc1/0x169
       ? dump_stack_print_info.cold.1+0x42/0x42
       ? kmsg_dump_rewind_nolock+0xd9/0xd9
       ? printk+0x9f/0xc5
       ? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
       print_address_description+0x6c/0x23c
       ? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
       kasan_report.cold.6+0x241/0x2fd
       amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
       ? commit_planes_to_stream.constprop.45+0x13b0/0x13b0 [amdgpu]
       ? cpu_load_update_active+0x290/0x290
       ? finish_task_switch+0x2bd/0x840
       ? __switch_to_asm+0x34/0x70
       ? read_word_at_a_time+0xe/0x20
       ? strscpy+0x14b/0x460
       ? drm_atomic_helper_wait_for_dependencies+0x47d/0x7e0 [drm_kms_helper]
       commit_tail+0x96/0xe0 [drm_kms_helper]
       process_one_work+0x88a/0x1360
       ? create_worker+0x540/0x540
       ? __sched_text_start+0x8/0x8
       ? move_queued_task+0x760/0x760
       ? call_rcu_sched+0x20/0x20
       ? vsnprintf+0xcda/0x1350
       ? wait_woken+0x1c0/0x1c0
       ? mutex_unlock+0x1d/0x40
       ? init_timer_key+0x190/0x230
       ? schedule+0xea/0x390
       ? __schedule+0x1ea0/0x1ea0
       ? need_to_create_worker+0xe4/0x210
       ? init_worker_pool+0x700/0x700
       ? try_to_del_timer_sync+0xbf/0x110
       ? del_timer+0x120/0x120
       ? __mutex_lock_slowpath+0x10/0x10
       worker_thread+0x196/0x11f0
       ? flush_rcu_work+0x50/0x50
       ? __switch_to_asm+0x34/0x70
       ? __switch_to_asm+0x34/0x70
       ? __switch_to_asm+0x40/0x70
       ? __switch_to_asm+0x34/0x70
       ? __switch_to_asm+0x40/0x70
       ? __switch_to_asm+0x34/0x70
       ? __switch_to_asm+0x40/0x70
       ? __schedule+0x7d6/0x1ea0
       ? migrate_swap_stop+0x850/0x880
       ? __sched_text_start+0x8/0x8
       ? save_stack+0x8c/0xb0
       ? kasan_kmalloc+0xbf/0xe0
       ? kmem_cache_alloc_trace+0xe4/0x190
       ? kthread+0x98/0x390
       ? ret_from_fork+0x35/0x40
       ? ret_from_fork+0x35/0x40
       ? deactivate_slab.isra.67+0x3c4/0x5c0
       ? kthread+0x98/0x390
       ? kthread+0x98/0x390
       ? set_track+0x76/0x120
       ? schedule+0xea/0x390
       ? __schedule+0x1ea0/0x1ea0
       ? wait_woken+0x1c0/0x1c0
       ? kasan_unpoison_shadow+0x30/0x40
       ? parse_args.cold.15+0x17a/0x17a
       ? flush_rcu_work+0x50/0x50
       kthread+0x2d4/0x390
       ? kthread_create_worker_on_cpu+0xc0/0xc0
       ret_from_fork+0x35/0x40
      
      Allocated by task 1124:
       kasan_kmalloc+0xbf/0xe0
       kmem_cache_alloc_trace+0xe4/0x190
       dm_crtc_duplicate_state+0x78/0x130 [amdgpu]
       drm_atomic_get_crtc_state+0x147/0x410 [drm]
       page_flip_common+0x57/0x230 [drm_kms_helper]
       drm_atomic_helper_page_flip+0xa6/0x110 [drm_kms_helper]
       drm_mode_page_flip_ioctl+0xc4b/0x10a0 [drm]
       drm_ioctl_kernel+0x1d4/0x260 [drm]
       drm_ioctl+0x433/0x920 [drm]
       amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
       do_vfs_ioctl+0x1a1/0x13d0
       ksys_ioctl+0x60/0x90
       __x64_sys_ioctl+0x6f/0xb0
       do_syscall_64+0x147/0x440
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Freed by task 1124:
       __kasan_slab_free+0x12e/0x180
       kfree+0x92/0x1a0
       drm_atomic_state_default_clear+0x315/0xc40 [drm]
       __drm_atomic_state_free+0x35/0xd0 [drm]
       drm_atomic_helper_update_plane+0xac/0x350 [drm_kms_helper]
       __setplane_internal+0x2d6/0x840 [drm]
       drm_mode_cursor_universal+0x41e/0xbe0 [drm]
       drm_mode_cursor_common+0x49f/0x880 [drm]
       drm_mode_cursor_ioctl+0xd8/0x130 [drm]
       drm_ioctl_kernel+0x1d4/0x260 [drm]
       drm_ioctl+0x433/0x920 [drm]
       amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
       do_vfs_ioctl+0x1a1/0x13d0
       ksys_ioctl+0x60/0x90
       __x64_sys_ioctl+0x6f/0xb0
       do_syscall_64+0x147/0x440
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      The buggy address belongs to the object at ffff8803a697b068
       which belongs to the cache kmalloc-1024 of size 1024
      The buggy address is located 9 bytes inside of
       1024-byte region [ffff8803a697b068, ffff8803a697b468)
      The buggy address belongs to the page:
      page:ffffea000e9a5e00 count:1 mapcount:0 mapping:ffff88041e00efc0 index:0x0 compound_mapcount: 0
      flags: 0x8000000000008100(slab|head)
      raw: 8000000000008100 ffffea000ecbc208 ffff88041e000c70 ffff88041e00efc0
      raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8803a697af00: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff8803a697af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      >ffff8803a697b000: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
                                                                   ^
       ffff8803a697b080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8803a697b100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      So, we fix this by counting, early on in the function and before their
      atomic states have been freed, the number of CRTCs this atomic commit
      disabled, then using that count later to do the appropriate number of RPM
      puts at the end of the function.
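
The pattern described above (take the count while the states are still valid, then rely only on the saved count) can be sketched in plain userspace C. All names here (`model_crtc_state`, `model_state_new()`, `model_rpm_usage`, `model_commit_tail()`) are hypothetical and only model the idea; the real function works on DRM atomic state and uses the kernel's runtime PM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-CRTC record: active before and after the commit. */
struct model_crtc_state {
    bool old_active;
    bool new_active;
};

/* Stand-in for the device's runtime PM usage counter. */
int model_rpm_usage;

struct model_crtc_state *model_state_new(bool old_active, bool new_active)
{
    struct model_crtc_state *s = malloc(sizeof(*s));

    if (s) {
        s->old_active = old_active;
        s->new_active = new_active;
    }
    return s;
}

/* Returns the number of runtime PM references dropped. */
int model_commit_tail(struct model_crtc_state **states, int n)
{
    int disabled = 0;

    /* 1. Count disabled CRTCs while the states are still valid. */
    for (int i = 0; i < n; i++)
        if (states[i] && states[i]->old_active && !states[i]->new_active)
            disabled++;

    /* 2. Models the helpers releasing the states mid-function. */
    for (int i = 0; i < n; i++) {
        free(states[i]);
        states[i] = NULL;
    }

    /* 3. Use only the saved count; the states must not be read here. */
    for (int i = 0; i < disabled; i++)
        model_rpm_usage--;

    return disabled;
}
```

Reading `states[i]` in step 3 instead of the saved count would reproduce the kind of use-after-free shown in the report above.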
      Acked-by: Michel Dänzer <michel.daenzer@amd.com>
      Reviewed-by: Harry Wentland <harry.wentland@amd.com>
      Cc: stable@vger.kernel.org
      Fixes: 97028037 ("drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail()")
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: Michel Dänzer <michel@daenzer.net>
      Reported-by: Michel Dänzer <michel@daenzer.net>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  14. 16 Jun, 2018 3 commits
  15. 14 Jun, 2018 2 commits
    • drm/amd/display: Fix stale buffer object (bo) use · 4b3c641b
      Committed by Pratik Vishwakarma
      Fixes stale buffer object (bo) usage for the cursor plane.
      
      The cursor plane's bo operations are handled in DC code.
      Currently, atomic_commit() does not handle bo operations
      for the cursor plane. As a result, the bo assigned to the
      cursor plane in dm_plane_helper_prepare_fb() is not coherent
      with the updates made to it in DC code. This mismatch
      leads to bo corruption and hence crashes during S3 entry.
      
      This patch cleans up the code that was added as a hack
      for version 4.9 only.
      Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
      Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail() · 97028037
      Committed by Lyude Paul
      So, unfortunately I recently made the discovery that in the upstream
      kernel, the only reason that amdgpu is not currently suffering from
      issues with runtime PM putting the GPU into suspend while it's driving
      displays is due to the fact that on most prime systems, we have sound
      devices associated with the GPU that hold their own runtime PM ref for
      the GPU.
      
      What this means, however, is that in the event that there isn't any kind
      of sound device active (which can easily be reproduced by building a
      kernel with sound drivers disabled), the GPU will fall asleep even when
      there are displays active. This appears to be in part due to the fact that
      amdgpu has not actually ever relied on its rpm_idle() function to be
      the only thing keeping it running, and normally grabs its own power
      references whenever there are displays active (as can be seen with the
      original pre-DC codepath in amdgpu_display_crtc_set_config() in
      amdgpu_display.c). This means it's very likely that this bug was
      introduced during the switch over to DC.
      
      So to fix this, we start grabbing runtime PM references every time we
      enable a previously disabled CRTC in atomic_commit_tail(). This appears
      to be the correct solution, as it matches up with what i915 does in
      i915/intel_runtime_pm.c.
      
      The one side effect of this is that we ignore the variable that the
      pre-DC code used to use for tracking when it needed runtime PM refs,
      adev->have_disp_power_ref. This is mainly because there's no way for a
      driver to tell whether or not all of its CRTCs are enabled or disabled
      when we've begun committing an atomic state, as there may be CRTC
      commits happening in parallel that aren't contained within the atomic
      state being committed. So, it's safer to just get/put a reference for
      each CRTC being enabled or disabled in the new atomic state.
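
The per-CRTC get/put rule can be sketched as a small refcount model. This is a hedged illustration, not the driver code: `model_crtc_change` and `model_apply_commit()` are hypothetical names, and a plain integer stands in for the device's runtime PM usage count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of one CRTC contained in an atomic state. */
struct model_crtc_change {
    bool was_active; /* active before this commit */
    bool is_active;  /* active after this commit */
};

/* Applies one commit to a usage count: one "get" for every CRTC that
 * goes from disabled to enabled, one "put" for every CRTC that goes
 * from enabled to disabled. CRTCs not in the state are not touched. */
int model_apply_commit(int usage, const struct model_crtc_change *c, int n)
{
    for (int i = 0; i < n; i++) {
        if (!c[i].was_active && c[i].is_active)
            usage++; /* models taking a runtime PM reference */
        else if (c[i].was_active && !c[i].is_active)
            usage--; /* models dropping a runtime PM reference */
    }
    assert(usage >= 0); /* every put must pair with an earlier get */
    return usage;
}
```

Because each commit only accounts for the transitions it contains, the count stays balanced without any global "all CRTCs disabled" check, which is the property the paragraph above relies on.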
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Harry Wentland <harry.wentland@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
  16. 12 Jun, 2018 1 commit
  17. 01 Jun, 2018 2 commits
  18. 30 May, 2018 5 commits
  19. 19 May, 2018 2 commits
  20. 17 May, 2018 1 commit