1. 22 May 2020 · 2 commits
  2. 13 Feb 2020 · 2 commits
  3. 25 Oct 2019 · 1 commit
  4. 23 Aug 2019 · 1 commit
  5. 06 Aug 2019 · 2 commits
  6. 01 May 2019 · 1 commit
  7. 20 Feb 2019 · 3 commits
  8. 24 Jan 2019 · 1 commit
  9. 12 Jan 2019 · 1 commit
  10. 11 Oct 2018 · 1 commit
    • drm/nouveau: Move backlight device into nouveau_connector · 6d757753
      Authored by Lyude Paul
      Currently module unloading is broken in nouveau due to a rather annoying
      race condition resulting from nouveau_backlight.c having gone a bit
      stale over time:
      
      [ 1960.791143] ==================================================================
      [ 1960.791394] BUG: KASAN: use-after-free in nouveau_backlight_exit+0x112/0x150 [nouveau]
      [ 1960.791460] Read of size 4 at addr ffff88075accf350 by task zsh/11185
      [ 1960.791521]
      [ 1960.791545] CPU: 7 PID: 11185 Comm: zsh Kdump: loaded Tainted: G           O      4.18.0Lyude-Test+ #4
      [ 1960.791580] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET79W (1.52 ) 07/13/2018
      [ 1960.791628] Call Trace:
      [ 1960.791680]  dump_stack+0xa4/0xfd
      [ 1960.791721]  print_address_description+0x71/0x239
      [ 1960.791833]  ? nouveau_backlight_exit+0x112/0x150 [nouveau]
      [ 1960.791877]  kasan_report.cold.6+0x242/0x2fe
      [ 1960.791919]  __asan_report_load4_noabort+0x19/0x20
      [ 1960.792012]  nouveau_backlight_exit+0x112/0x150 [nouveau]
      [ 1960.792081]  nouveau_display_destroy+0x76/0x150 [nouveau]
      [ 1960.792150]  nouveau_drm_device_fini+0xb7/0x190 [nouveau]
      [ 1960.792265]  nouveau_drm_device_remove+0x14b/0x1d0 [nouveau]
      [ 1960.792347]  ? nouveau_cli_work_queue+0x2e0/0x2e0 [nouveau]
      [ 1960.792378]  ? trace_hardirqs_on_caller+0x38b/0x570
      [ 1960.792406]  ? trace_hardirqs_on+0xd/0x10
      [ 1960.792472]  nouveau_drm_remove+0x37/0x50 [nouveau]
      [ 1960.792502]  pci_device_remove+0x112/0x2d0
      [ 1960.792530]  ? pcibios_free_irq+0x10/0x10
      [ 1960.792558]  ? kasan_check_write+0x14/0x20
      [ 1960.792587]  device_release_driver_internal+0x35c/0x650
      [ 1960.792617]  device_release_driver+0x12/0x20
      [ 1960.792643]  pci_stop_bus_device+0x172/0x1e0
      [ 1960.792671]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
      [ 1960.792715]  remove_store+0xcb/0xe0
      [ 1960.792753]  ? sriov_numvfs_store+0x2e0/0x2e0
      [ 1960.792779]  ? __lock_is_held+0xb5/0x140
      [ 1960.792808]  ? component_add+0x530/0x530
      [ 1960.792834]  dev_attr_store+0x3f/0x70
      [ 1960.792859]  ? sysfs_file_ops+0x11d/0x170
      [ 1960.792885]  sysfs_kf_write+0x104/0x150
      [ 1960.792915]  ? sysfs_file_ops+0x170/0x170
      [ 1960.792940]  kernfs_fop_write+0x24f/0x400
      [ 1960.792978]  ? __lock_acquire+0x6ea/0x47f0
      [ 1960.793021]  __vfs_write+0xeb/0x760
      [ 1960.793048]  ? kernel_read+0x130/0x130
      [ 1960.793076]  ? __lock_is_held+0xb5/0x140
      [ 1960.793107]  ? rcu_read_lock_sched_held+0xdd/0x110
      [ 1960.793135]  ? rcu_sync_lockdep_assert+0x78/0xb0
      [ 1960.793162]  ? __sb_start_write+0x183/0x220
      [ 1960.793189]  vfs_write+0x14d/0x4a0
      [ 1960.793229]  ksys_write+0xd2/0x1b0
      [ 1960.793255]  ? __ia32_sys_read+0xb0/0xb0
      [ 1960.793298]  ? fput+0x1d/0x120
      [ 1960.793324]  ? filp_close+0xf3/0x130
      [ 1960.793349]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
      [ 1960.793380]  __x64_sys_write+0x73/0xb0
      [ 1960.793407]  do_syscall_64+0xaa/0x400
      [ 1960.793433]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1960.793460] RIP: 0033:0x7f59df433164
      [ 1960.793486] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 81 38 2d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
      [ 1960.793541] RSP: 002b:00007ffd70ee2fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 1960.793576] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f59df433164
      [ 1960.793620] RDX: 0000000000000002 RSI: 00005578088640c0 RDI: 0000000000000001
      [ 1960.793665] RBP: 00005578088640c0 R08: 00007f59df7038c0 R09: 00007f59e0995b80
      [ 1960.793696] R10: 000000000000000a R11: 0000000000000246 R12: 00007f59df702760
      [ 1960.793730] R13: 0000000000000002 R14: 00007f59df6fd760 R15: 0000000000000002
      [ 1960.793768]
      [ 1960.793790] Allocated by task 11167:
      [ 1960.793816]  save_stack+0x43/0xd0
      [ 1960.793841]  kasan_kmalloc+0xc4/0xe0
      [ 1960.793880]  kasan_slab_alloc+0x11/0x20
      [ 1960.793905]  kmem_cache_alloc+0xd7/0x270
      [ 1960.793944]  getname_flags+0xbd/0x520
      [ 1960.793969]  user_path_at_empty+0x23/0x50
      [ 1960.793994]  do_faccessat+0x1fc/0x5d0
      [ 1960.794018]  __x64_sys_access+0x59/0x80
      [ 1960.794043]  do_syscall_64+0xaa/0x400
      [ 1960.794067]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1960.794093]
      [ 1960.794127] Freed by task 11167:
      [ 1960.794152]  save_stack+0x43/0xd0
      [ 1960.794190]  __kasan_slab_free+0x139/0x190
      [ 1960.794215]  kasan_slab_free+0xe/0x10
      [ 1960.794239]  kmem_cache_free+0xcb/0x2c0
      [ 1960.794264]  putname+0xad/0xe0
      [ 1960.794287]  filename_lookup.part.59+0x1f1/0x360
      [ 1960.794313]  user_path_at_empty+0x3e/0x50
      [ 1960.794338]  do_faccessat+0x1fc/0x5d0
      [ 1960.794362]  __x64_sys_access+0x59/0x80
      [ 1960.794393]  do_syscall_64+0xaa/0x400
      [ 1960.794421]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1960.794461]
      [ 1960.794483] The buggy address belongs to the object at ffff88075acceac0
      [ 1960.794483]  which belongs to the cache names_cache of size 4096
      [ 1960.794540] The buggy address is located 2192 bytes inside of
      [ 1960.794540]  4096-byte region [ffff88075acceac0, ffff88075accfac0)
      [ 1960.794581] The buggy address belongs to the page:
      [ 1960.794609] page:ffffea001d6b3200 count:1 mapcount:0 mapping:ffff880778e4b1c0 index:0x0 compound_mapcount: 0
      [ 1960.794651] flags: 0x8000000000008100(slab|head)
      [ 1960.794679] raw: 8000000000008100 ffffea001d39e808 ffffea001d39ea08 ffff880778e4b1c0
      [ 1960.794739] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
      [ 1960.794785] page dumped because: kasan: bad access detected
      [ 1960.794813]
      [ 1960.794834] Memory state around the buggy address:
      [ 1960.794861]  ffff88075accf200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1960.794894]  ffff88075accf280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1960.794925] >ffff88075accf300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1960.794956]                                                  ^
      [ 1960.794985]  ffff88075accf380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1960.795017]  ffff88075accf400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 1960.795061] ==================================================================
      [ 1960.795106] Disabling lock debugging due to kernel taint
      [ 1960.795131] ------------[ cut here ]------------
      [ 1960.795148] ida_remove called for id=1802201963 which is not allocated.
      [ 1960.795193] WARNING: CPU: 7 PID: 11185 at lib/idr.c:521 ida_remove+0x184/0x210
      [ 1960.795213] Modules linked in: nouveau(O) mxm_wmi ttm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm joydev vfat fat intel_rapl x86_pkg_temp_thermal coretemp crc32_pclmul iTCO_wdt psmouse wmi_bmof mei_me tpm_tis mei tpm_tis_core tpm i2c_i801 thinkpad_acpi pcc_cpufreq crc32c_intel serio_raw xhci_pci xhci_hcd wmi video i2c_dev i2c_core
      [ 1960.795305] CPU: 7 PID: 11185 Comm: zsh Kdump: loaded Tainted: G    B      O      4.18.0Lyude-Test+ #4
      [ 1960.795330] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET79W (1.52 ) 07/13/2018
      [ 1960.795352] RIP: 0010:ida_remove+0x184/0x210
      [ 1960.795370] Code: 4c 89 f7 e8 ae c8 00 00 eb 22 41 83 c4 02 4c 89 e8 41 83 fc 3f 0f 86 64 ff ff ff 44 89 fe 48 c7 c7 20 94 1e 83 e8 54 ed 81 fe <0f> 0b 48 b8 00 00 00 00 00 fc ff df 48 01 c3 c7 03 00 00 00 00 c7
      [ 1960.795402] RSP: 0018:ffff88074d4df7b8 EFLAGS: 00010082
      [ 1960.795421] RAX: 0000000000000000 RBX: 1ffff100e9a9befa RCX: ffffffff81479975
      [ 1960.795440] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88077c1de690
      [ 1960.795460] RBP: ffff88074d4df878 R08: ffffed00ef83bcd3 R09: ffffed00ef83bcd2
      [ 1960.795479] R10: ffffed00ef83bcd2 R11: ffff88077c1de697 R12: 000000000000036b
      [ 1960.795498] R13: 0000000000000202 R14: ffffffffa0aa7fa0 R15: 000000006b6b6b6b
      [ 1960.795518] FS:  00007f59e0995b80(0000) GS:ffff88077c1c0000(0000) knlGS:0000000000000000
      [ 1960.795553] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1960.795571] CR2: 00007f59e09a2010 CR3: 00000004a1a70005 CR4: 00000000003606e0
      [ 1960.795596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1960.795629] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1960.795649] Call Trace:
      [ 1960.795667]  ? ida_destroy+0x1d0/0x1d0
      [ 1960.795686]  ? kasan_check_write+0x14/0x20
      [ 1960.795704]  ? do_raw_spin_lock+0xc2/0x1c0
      [ 1960.795724]  ida_simple_remove+0x26/0x40
      [ 1960.795794]  nouveau_backlight_exit+0x9d/0x150 [nouveau]
      [ 1960.795867]  nouveau_display_destroy+0x76/0x150 [nouveau]
      [ 1960.795930]  nouveau_drm_device_fini+0xb7/0x190 [nouveau]
      [ 1960.795989]  nouveau_drm_device_remove+0x14b/0x1d0 [nouveau]
      [ 1960.796047]  ? nouveau_cli_work_queue+0x2e0/0x2e0 [nouveau]
      [ 1960.796067]  ? trace_hardirqs_on_caller+0x38b/0x570
      [ 1960.796089]  ? trace_hardirqs_on+0xd/0x10
      [ 1960.796146]  nouveau_drm_remove+0x37/0x50 [nouveau]
      [ 1960.796167]  pci_device_remove+0x112/0x2d0
      [ 1960.796186]  ? pcibios_free_irq+0x10/0x10
      [ 1960.796218]  ? kasan_check_write+0x14/0x20
      [ 1960.796237]  device_release_driver_internal+0x35c/0x650
      [ 1960.796257]  device_release_driver+0x12/0x20
      [ 1960.796289]  pci_stop_bus_device+0x172/0x1e0
      [ 1960.796308]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
      [ 1960.796328]  remove_store+0xcb/0xe0
      [ 1960.796345]  ? sriov_numvfs_store+0x2e0/0x2e0
      [ 1960.796364]  ? __lock_is_held+0xb5/0x140
      [ 1960.796383]  ? component_add+0x530/0x530
      [ 1960.796401]  dev_attr_store+0x3f/0x70
      [ 1960.796419]  ? sysfs_file_ops+0x11d/0x170
      [ 1960.796436]  sysfs_kf_write+0x104/0x150
      [ 1960.796454]  ? sysfs_file_ops+0x170/0x170
      [ 1960.796471]  kernfs_fop_write+0x24f/0x400
      [ 1960.796488]  ? __lock_acquire+0x6ea/0x47f0
      [ 1960.796520]  __vfs_write+0xeb/0x760
      [ 1960.796538]  ? kernel_read+0x130/0x130
      [ 1960.796556]  ? __lock_is_held+0xb5/0x140
      [ 1960.796590]  ? rcu_read_lock_sched_held+0xdd/0x110
      [ 1960.796608]  ? rcu_sync_lockdep_assert+0x78/0xb0
      [ 1960.796626]  ? __sb_start_write+0x183/0x220
      [ 1960.796648]  vfs_write+0x14d/0x4a0
      [ 1960.796666]  ksys_write+0xd2/0x1b0
      [ 1960.796684]  ? __ia32_sys_read+0xb0/0xb0
      [ 1960.796701]  ? fput+0x1d/0x120
      [ 1960.796732]  ? filp_close+0xf3/0x130
      [ 1960.796749]  ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
      [ 1960.796768]  __x64_sys_write+0x73/0xb0
      [ 1960.796800]  do_syscall_64+0xaa/0x400
      [ 1960.796818]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [ 1960.796836] RIP: 0033:0x7f59df433164
      [ 1960.796854] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 81 38 2d 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 49 89 d4 55 48 89 f5 53
      [ 1960.796884] RSP: 002b:00007ffd70ee2fb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 1960.796906] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f59df433164
      [ 1960.796926] RDX: 0000000000000002 RSI: 00005578088640c0 RDI: 0000000000000001
      [ 1960.796946] RBP: 00005578088640c0 R08: 00007f59df7038c0 R09: 00007f59e0995b80
      [ 1960.796966] R10: 000000000000000a R11: 0000000000000246 R12: 00007f59df702760
      [ 1960.796985] R13: 0000000000000002 R14: 00007f59df6fd760 R15: 0000000000000002
      [ 1960.797008] irq event stamp: 509990
      [ 1960.797026] hardirqs last  enabled at (509989): [<ffffffff8119ff78>] flush_work+0x4b8/0x6d0
      [ 1960.797063] hardirqs last disabled at (509990): [<ffffffff8297c395>] _raw_spin_lock_irqsave+0x25/0x60
      [ 1960.797085] softirqs last  enabled at (509744): [<ffffffff82c005ad>] __do_softirq+0x5ad/0x8c0
      [ 1960.797121] softirqs last disabled at (509735): [<ffffffff8115aa15>] irq_exit+0x1a5/0x1e0
      [ 1960.797142] ---[ end trace fb1342325f1846b8 ]---
      
      While I haven't actually gone into the details of what's causing this to
      happen (maybe the kernel removes the backlight device in the device core
      before we get to it?), it doesn't really matter anyway because the way
      nouveau handles backlights has long since been deprecated.
      
      According to the documentation, the drm_connector->late_register()
      hook should be used for adding extra connector-related devices;
      conversely, the ->early_unregister() hook is meant to be used for
      removing those devices.
      
      So: gut nouveau_drm->bl_list and nouveau_drm->backlight, and replace
      them with per-connector backlight structures. Additionally, move
      backlight registration/teardown into the ->late_register() and
      ->early_unregister() hooks so that DRM can give us a chance to remove
      the backlight before the connector is even removed. This appears to fix
      the problem once and for all.
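
      A rough sketch of that pattern, with per-connector backlight helpers
      (the declarations and signatures below are illustrative, not the
      literal nouveau code):

        #include <drm/drm_connector.h>

        /* Stand-in per-connector backlight helpers. */
        int  nouveau_backlight_init(struct drm_connector *connector);
        void nouveau_backlight_fini(struct drm_connector *connector);

        static int nouveau_connector_late_register(struct drm_connector *connector)
        {
                /* DRM calls this once the connector is registered, so it is
                 * a safe place to add extra devices that belong to it. */
                return nouveau_backlight_init(connector);
        }

        static void nouveau_connector_early_unregister(struct drm_connector *connector)
        {
                /* Called before the connector is unregistered, so the
                 * backlight goes away while the connector is still valid. */
                nouveau_backlight_fini(connector);
        }

        static const struct drm_connector_funcs nouveau_connector_funcs = {
                /* ...the usual detect/destroy/etc. callbacks... */
                .late_register    = nouveau_connector_late_register,
                .early_unregister = nouveau_connector_early_unregister,
        };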
      
      Changes since v2:
      - Use NV_INFO_ONCE for printing GMUX information, since otherwise this
        will end up printing that message as many times as we have
        connectors
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      6d757753
  11. 07 Sep 2018 · 4 commits
    • drm/nouveau/drm/nouveau: Don't forget to cancel hpd_work on suspend/unload · 2f7ca781
      Authored by Lyude Paul
      Currently, there's nothing in nouveau that actually cancels this work
      struct. So, cancel it on suspend/unload. Otherwise, if we're unlucky
      enough, hpd_work might try to keep running up until the system is
      suspended.
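
      A minimal sketch of the idea, assuming the hpd_work field name and an
      illustrative teardown helper (the real suspend/unload code does more
      than this):

        #include <linux/workqueue.h>

        static void example_display_teardown(struct drm_device *dev)
        {
                struct nouveau_drm *drm = nouveau_drm(dev);

                /* ...disable hotplug interrupts, stop output polling... */

                /* Make sure a queued hotplug work item can't run (or keep
                 * re-running) once we start suspending or unloading. */
                cancel_work_sync(&drm->hpd_work);
        }
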
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      2f7ca781
    • drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early · 79e765ad
      Authored by Lyude Paul
      On most systems with ACPI hotplugging support, it seems that we always
      receive a hotplug event once we re-enable EC interrupts even if the GPU
      hasn't even been resumed yet.
      
      This can cause problems since even though we schedule hpd_work to handle
      connector reprobing for us, hpd_work synchronizes on
      pm_runtime_get_sync() to wait until the device is ready to perform
      reprobing. Since runtime suspend/resume callbacks are disabled before
      the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
      this period will grab a runtime PM ref and return immediately with
      -EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
      hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
      a connector reprobe immediately even if the GPU isn't actually resumed
      just yet. This causes various warnings in dmesg and occasionally, also
      prevents some displays connected to the dedicated GPU from coming back
      up after suspend. Example:
      
      usb 1-4: USB disconnect, device number 14
      usb 1-4.1: USB disconnect, device number 15
      WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
      CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
      Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
      Workqueue: events nouveau_display_hpd_work [nouveau]
      RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
      RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
      RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
      RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
      R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
      FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       ? _cond_resched+0x15/0x30
       nouveau_connector_detect+0x2ce/0x520 [nouveau]
       ? _cond_resched+0x15/0x30
       ? ww_mutex_lock+0x12/0x40
       drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
       drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
       nouveau_display_hpd_work+0x2a/0x60 [nouveau]
       process_one_work+0x187/0x340
       worker_thread+0x2e/0x380
       ? pwq_unbound_release_workfn+0xd0/0xd0
       kthread+0x112/0x130
       ? kthread_create_worker_on_cpu+0x70/0x70
       ret_from_fork+0x35/0x40
      Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
      ---[ end trace 55d811b38fc8e71a ]---
      
      So, to fix this we attempt to grab a runtime PM reference in the ACPI
      handler itself asynchronously. If the GPU is already awake (it will have
      normal hotplugging at this point) or runtime PM callbacks are currently
      disabled on the device, we drop our reference without updating the
      autosuspend delay. We only schedule connector reprobes when we
      successfully managed to queue up a resume request with our asynchronous
      PM ref.
      
      This also has the added benefit of preventing redundant connector
      reprobes from ACPI while the GPU is runtime resumed!
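
      A sketch of that handler logic (function and field names are
      illustrative; the return-value handling follows the usual
      pm_runtime_get() semantics):

        #include <linux/pm_runtime.h>

        static void example_acpi_hpd_event(struct nouveau_drm *drm)
        {
                struct device *dev = drm->dev->dev;
                int ret = pm_runtime_get(dev);  /* asynchronous ref */

                if (ret == 1 || ret == -EACCES) {
                        /* Already awake, or RPM callbacks disabled (system
                         * suspend in progress): normal hotplug handling
                         * covers this, so just drop the ref. */
                        pm_runtime_put_autosuspend(dev);
                } else if (ret == 0) {
                        /* Resume request queued: let hpd_work reprobe the
                         * connectors once the GPU is actually awake. */
                        schedule_work(&drm->hpd_work);
                        pm_runtime_put_noidle(dev);
                } else {
                        /* Runtime PM error; drop the event. */
                        pm_runtime_put_noidle(dev);
                }
        }
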
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <kherbst@redhat.com>
      Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      79e765ad
    • drm/nouveau/drm/nouveau: Fix deadlock with fb_helper with async RPM requests · 7fec8f53
      Authored by Lyude Paul
      Currently, nouveau uses the generic drm_fb_helper_output_poll_changed()
      function provided by DRM as its output_poll_changed callback.
      Unfortunately, this function doesn't grab runtime PM references early
      enough, and even if it did, we can't block waiting for the device to
      resume in output_poll_changed(), since it's very likely that we'll need
      to grab the fb_helper lock at some point during the runtime resume
      process. This currently results in deadlocks like so:
      
      [  246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
      [  246.673398]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.676527] kworker/4:0     D    0    37      2 0x80000000
      [  246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
      [  246.678704] Call Trace:
      [  246.679753]  __schedule+0x322/0xaf0
      [  246.680916]  schedule+0x33/0x90
      [  246.681924]  schedule_preempt_disabled+0x15/0x20
      [  246.683023]  __mutex_lock+0x569/0x9a0
      [  246.684035]  ? kobject_uevent_env+0x117/0x7b0
      [  246.685132]  ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.686179]  mutex_lock_nested+0x1b/0x20
      [  246.687278]  ? mutex_lock_nested+0x1b/0x20
      [  246.688307]  drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.689420]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
      [  246.690462]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
      [  246.691570]  output_poll_execute+0x198/0x1c0 [drm_kms_helper]
      [  246.692611]  process_one_work+0x231/0x620
      [  246.693725]  worker_thread+0x214/0x3a0
      [  246.694756]  kthread+0x12b/0x150
      [  246.695856]  ? wq_pool_ids_show+0x140/0x140
      [  246.696888]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.697998]  ret_from_fork+0x3a/0x50
      [  246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
      [  246.700153]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.702278] kworker/0:1     D    0    60      2 0x80000000
      [  246.703293] Workqueue: pm pm_runtime_work
      [  246.704393] Call Trace:
      [  246.705403]  __schedule+0x322/0xaf0
      [  246.706439]  ? wait_for_completion+0x104/0x190
      [  246.707393]  schedule+0x33/0x90
      [  246.708375]  schedule_timeout+0x3a5/0x590
      [  246.709289]  ? mark_held_locks+0x58/0x80
      [  246.710208]  ? _raw_spin_unlock_irq+0x2c/0x40
      [  246.711222]  ? wait_for_completion+0x104/0x190
      [  246.712134]  ? trace_hardirqs_on_caller+0xf4/0x190
      [  246.713094]  ? wait_for_completion+0x104/0x190
      [  246.713964]  wait_for_completion+0x12c/0x190
      [  246.714895]  ? wake_up_q+0x80/0x80
      [  246.715727]  ? get_work_pool+0x90/0x90
      [  246.716649]  flush_work+0x1c9/0x280
      [  246.717483]  ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
      [  246.718442]  __cancel_work_timer+0x146/0x1d0
      [  246.719247]  cancel_delayed_work_sync+0x13/0x20
      [  246.720043]  drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
      [  246.721123]  nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
      [  246.721897]  pci_pm_runtime_suspend+0x6b/0x190
      [  246.722825]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.723737]  __rpm_callback+0x7a/0x1d0
      [  246.724721]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.725607]  rpm_callback+0x24/0x80
      [  246.726553]  ? pci_has_legacy_pm_support+0x70/0x70
      [  246.727376]  rpm_suspend+0x142/0x6b0
      [  246.728185]  pm_runtime_work+0x97/0xc0
      [  246.728938]  process_one_work+0x231/0x620
      [  246.729796]  worker_thread+0x44/0x3a0
      [  246.730614]  kthread+0x12b/0x150
      [  246.731395]  ? wq_pool_ids_show+0x140/0x140
      [  246.732202]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.732878]  ret_from_fork+0x3a/0x50
      [  246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
      [  246.734587]       Not tainted 4.18.0-rc5Lyude-Test+ #2
      [  246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  246.736113] kworker/4:2     D    0   422      2 0x80000080
      [  246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
      [  246.737665] Call Trace:
      [  246.738490]  __schedule+0x322/0xaf0
      [  246.739250]  schedule+0x33/0x90
      [  246.739908]  rpm_resume+0x19c/0x850
      [  246.740750]  ? finish_wait+0x90/0x90
      [  246.741541]  __pm_runtime_resume+0x4e/0x90
      [  246.742370]  nv50_disp_atomic_commit+0x31/0x210 [nouveau]
      [  246.743124]  drm_atomic_commit+0x4a/0x50 [drm]
      [  246.743775]  restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
      [  246.744603]  restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
      [  246.745373]  drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
      [  246.746220]  drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
      [  246.746884]  drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
      [  246.747675]  drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
      [  246.748544]  drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
      [  246.749439]  nv50_mstm_hotplug+0x15/0x20 [nouveau]
      [  246.750111]  drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
      [  246.750764]  drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
      [  246.751602]  drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
      [  246.752314]  process_one_work+0x231/0x620
      [  246.752979]  worker_thread+0x44/0x3a0
      [  246.753838]  kthread+0x12b/0x150
      [  246.754619]  ? wq_pool_ids_show+0x140/0x140
      [  246.755386]  ? kthread_create_worker_on_cpu+0x70/0x70
      [  246.756162]  ret_from_fork+0x3a/0x50
      [  246.756847]
                 Showing all locks held in the system:
      [  246.758261] 3 locks held by kworker/4:0/37:
      [  246.759016]  #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.759856]  #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.760670]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
      [  246.761516] 2 locks held by kworker/0:1/60:
      [  246.762274]  #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.762982]  #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.763890] 1 lock held by khungtaskd/64:
      [  246.764664]  #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
      [  246.765588] 5 locks held by kworker/4:2/422:
      [  246.766440]  #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.767390]  #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
      [  246.768154]  #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
      [  246.768966]  #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
      [  246.769921]  #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
      [  246.770839] 1 lock held by dmesg/1038:
      [  246.771739] 2 locks held by zsh/1172:
      [  246.772650]  #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
      [  246.773680]  #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
      
      [  246.775522] =============================================
      
      After trying dozens of different solutions, I found one very simple one
      that should also have the benefit of preventing us from having to fight
      locking for the rest of our lives. So, we work around these deadlocks by
      deferring all fbcon hotplug events that happen after the runtime suspend
      process starts until after the device is resumed again.
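
      A simplified sketch of that deferral (the lock and flag names here are
      made up for illustration):

        /* Used instead of the generic drm_fb_helper_output_poll_changed(). */
        static void nouveau_fbcon_output_poll_changed(struct drm_device *dev)
        {
                struct nouveau_fbdev *fbcon = nouveau_drm(dev)->fbcon;

                mutex_lock(&fbcon->hotplug_lock);
                if (fbcon->suspended)
                        fbcon->hotplug_waiting = true;  /* replay on resume */
                else
                        drm_fb_helper_hotplug_event(&fbcon->helper);
                mutex_unlock(&fbcon->hotplug_lock);
        }

        /* Called once runtime resume has finished. */
        static void nouveau_fbcon_hotplug_resume(struct nouveau_fbdev *fbcon)
        {
                mutex_lock(&fbcon->hotplug_lock);
                fbcon->suspended = false;
                if (fbcon->hotplug_waiting) {
                        fbcon->hotplug_waiting = false;
                        drm_fb_helper_hotplug_event(&fbcon->helper);
                }
                mutex_unlock(&fbcon->hotplug_lock);
        }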
      
      Changes since v7:
       - Fixup commit message - Daniel Vetter
      
      Changes since v6:
       - Remove unused nouveau_fbcon_hotplugged_in_suspend() - Ilia
      
      Changes since v5:
       - Come up with the (hopefully final) solution for solving this dumb
         problem, one that is a lot less likely to cause issues with locking in
         the future. This should work around all deadlock conditions with fbcon
         brought up thus far.
      
      Changes since v4:
       - Add nouveau_fbcon_hotplugged_in_suspend() to workaround deadlock
         condition that Lukas described
       - Just move all of this out of drm_fb_helper. It seems that other DRM
         drivers have already figured out other workarounds for this. If other
         drivers do end up needing this in the future, we can just move this
         back into drm_fb_helper again.
      
      Changes since v3:
      - Actually check if fb_helper is NULL in both new helpers
      - Actually check drm_fbdev_emulation in both new helpers
      - Don't fire off a fb_helper hotplug unconditionally; only do it if
        the following conditions are true (as otherwise, calling this in the
        wrong spot will cause Bad Things to happen):
        - fb_helper hotplug handling was actually inhibited previously
        - fb_helper actually has a delayed hotplug pending
        - fb_helper is actually bound
        - fb_helper is actually initialized
      - Add __must_check to drm_fb_helper_suspend_hotplug(). There's no
        situation where a driver would actually want to use this without
        checking the return value, so enforce that
      - Rewrite and clarify the documentation for both helpers.
      - Make sure to return true in the drm_fb_helper_suspend_hotplug() stub
        that's provided in drm_fb_helper.h when CONFIG_DRM_FBDEV_EMULATION
        isn't enabled
      - Actually grab the toplevel fb_helper lock in
        drm_fb_helper_resume_hotplug(), since it's possible other activity
        (such as a hotplug) could be going on at the same time the driver
        calls drm_fb_helper_resume_hotplug(). We need this to check whether or
        not drm_fb_helper_hotplug_event() needs to be called anyway
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: stable@vger.kernel.org
      Cc: Lukas Wunner <lukas@wunner.de>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      7fec8f53
    • drm/nouveau/drm/nouveau: Fix bogus drm_kms_helper_poll_enable() placement · d77ef138
      Authored by Lyude Paul
      Turns out this part is my fault for not noticing when reviewing
      9a2eba33 ("drm/nouveau: Fix drm poll_helper handling"). Currently
      we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work().
      This makes basically no sense however, because that means we're calling
      drm_kms_helper_poll_enable() every time we schedule the hotplug
      detection work. This is also against the advice mentioned in
      drm_kms_helper_poll_enable()'s documentation:
      
       Note that calls to enable and disable polling must be strictly ordered,
       which is automatically the case when they're only called from
       suspend/resume callbacks.
      
      Of course, hotplugs can't really be ordered. They could even happen
      immediately after we called drm_kms_helper_poll_disable() in
      nouveau_display_fini(), which can lead to all sorts of issues.
      
      Additionally, enabling polling /after/ we call
      drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug
      event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to
      probe connectors so long as polling is disabled.
      
      So, simply move this back into nouveau_display_init() again. The race
      condition that both of these patches attempted to work around has
      already been fixed properly in
      
        d61a5c10 ("drm/nouveau: Fix deadlock on runtime suspend")
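
      Roughly, the placement becomes (a sketch; the real init path does
      considerably more than this):

        int nouveau_display_init(struct drm_device *dev)
        {
                /* ...restore display state, re-arm hotplug interrupts... */

                /* Enable polling exactly once per init, strictly ordered
                 * against drm_kms_helper_poll_disable() in
                 * nouveau_display_fini(). */
                drm_kms_helper_poll_enable(dev);
                return 0;
        }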
      
      Fixes: 9a2eba33 ("drm/nouveau: Fix drm poll_helper handling")
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Acked-by: Karol Herbst <kherbst@redhat.com>
      Acked-by: Daniel Vetter <daniel@ffwll.ch>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      d77ef138
  12. 16 Jul 2018 · 3 commits
    • drm/nouveau: Replace drm_gem_object_unreference_unlocked with put function · 743e0f07
      Authored by Thomas Zimmermann
      This patch unifies the naming of DRM functions for reference counting
      of struct drm_gem_object. The resulting code is more aligned with the
      rest of the Linux kernel interfaces.
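
      In practice the change is a mechanical rename; for example (assuming
      the put-style counterpart of that era, drm_gem_object_put_unlocked()):

        #include <drm/drm_gem.h>

        static void example_drop_gem_ref(struct drm_gem_object *gem)
        {
                /* before: drm_gem_object_unreference_unlocked(gem); */

                /* after: same semantics, name consistent with the other
                 * get()/put() style reference-counting interfaces */
                drm_gem_object_put_unlocked(gem);
        }
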
      Signed-off-by: Thomas Zimmermann <tdz@users.sourceforge.net>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      743e0f07
    • drm/nouveau: Avoid looping through fake MST connectors · 37afe55b
      Authored by Lyude Paul
      When MST and atomic were introduced to nouveau, another structure that
      could contain a drm_connector embedded within it was introduced: struct
      nv50_mstc. This meant that we no longer would be able to simply loop
      through our connector list and assume that nouveau_connector() would
      return a proper pointer for each connector, since the assertion that
      all connectors coming from nouveau have a full nouveau_connector struct
      became invalid.
      
      Unfortunately, none of the actual code that looped through connectors
      ever got updated, which means that we've been causing invalid memory
      accesses for quite a while now.
      
      An example that was caught by KASAN:
      
      [  201.038698] ==================================================================
      [  201.038792] BUG: KASAN: slab-out-of-bounds in nvif_notify_get+0x190/0x1a0 [nouveau]
      [  201.038797] Read of size 4 at addr ffff88076738c650 by task kworker/0:3/718
      [  201.038800]
      [  201.038822] CPU: 0 PID: 718 Comm: kworker/0:3 Tainted: G           O      4.18.0-rc4Lyude-Test+ #1
      [  201.038825] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
      [  201.038882] Workqueue: events nouveau_display_hpd_work [nouveau]
      [  201.038887] Call Trace:
      [  201.038894]  dump_stack+0xa4/0xfd
      [  201.038900]  print_address_description+0x71/0x239
      [  201.038929]  ? nvif_notify_get+0x190/0x1a0 [nouveau]
      [  201.038935]  kasan_report.cold.6+0x242/0x2fe
      [  201.038942]  __asan_report_load4_noabort+0x19/0x20
      [  201.038970]  nvif_notify_get+0x190/0x1a0 [nouveau]
      [  201.038998]  ? nvif_notify_put+0x1f0/0x1f0 [nouveau]
      [  201.039003]  ? kmsg_dump_rewind_nolock+0xe4/0xe4
      [  201.039049]  nouveau_display_init.cold.12+0x34/0x39 [nouveau]
      [  201.039089]  ? nouveau_user_framebuffer_create+0x120/0x120 [nouveau]
      [  201.039133]  nouveau_display_resume+0x5c0/0x810 [nouveau]
      [  201.039173]  ? nvkm_client_ioctl+0x20/0x20 [nouveau]
      [  201.039215]  nouveau_do_resume+0x19f/0x570 [nouveau]
      [  201.039256]  nouveau_pmops_runtime_resume+0xd8/0x2a0 [nouveau]
      [  201.039264]  pci_pm_runtime_resume+0x130/0x250
      [  201.039269]  ? pci_restore_standard_config+0x70/0x70
      [  201.039275]  __rpm_callback+0x1f2/0x5d0
      [  201.039279]  ? rpm_resume+0x560/0x18a0
      [  201.039283]  ? pci_restore_standard_config+0x70/0x70
      [  201.039287]  ? pci_restore_standard_config+0x70/0x70
      [  201.039291]  ? pci_restore_standard_config+0x70/0x70
      [  201.039296]  rpm_callback+0x175/0x210
      [  201.039300]  ? pci_restore_standard_config+0x70/0x70
      [  201.039305]  rpm_resume+0xcc3/0x18a0
      [  201.039312]  ? rpm_callback+0x210/0x210
      [  201.039317]  ? __pm_runtime_resume+0x9e/0x100
      [  201.039322]  ? kasan_check_write+0x14/0x20
      [  201.039326]  ? do_raw_spin_lock+0xc2/0x1c0
      [  201.039333]  __pm_runtime_resume+0xac/0x100
      [  201.039374]  nouveau_display_hpd_work+0x67/0x1f0 [nouveau]
      [  201.039380]  process_one_work+0x7a0/0x14d0
      [  201.039388]  ? cancel_delayed_work_sync+0x20/0x20
      [  201.039392]  ? lock_acquire+0x113/0x310
      [  201.039398]  ? kasan_check_write+0x14/0x20
      [  201.039402]  ? do_raw_spin_lock+0xc2/0x1c0
      [  201.039409]  worker_thread+0x86/0xb50
      [  201.039418]  kthread+0x2e9/0x3a0
      [  201.039422]  ? process_one_work+0x14d0/0x14d0
      [  201.039426]  ? kthread_create_worker_on_cpu+0xc0/0xc0
      [  201.039431]  ret_from_fork+0x3a/0x50
      [  201.039441]
      [  201.039444] Allocated by task 79:
      [  201.039449]  save_stack+0x43/0xd0
      [  201.039452]  kasan_kmalloc+0xc4/0xe0
      [  201.039456]  kmem_cache_alloc_trace+0x10a/0x260
      [  201.039494]  nv50_mstm_add_connector+0x9a/0x340 [nouveau]
      [  201.039504]  drm_dp_add_port+0xff5/0x1fc0 [drm_kms_helper]
      [  201.039511]  drm_dp_send_link_address+0x4a7/0x740 [drm_kms_helper]
      [  201.039518]  drm_dp_check_and_send_link_address+0x1a7/0x210 [drm_kms_helper]
      [  201.039525]  drm_dp_mst_link_probe_work+0x71/0xb0 [drm_kms_helper]
      [  201.039529]  process_one_work+0x7a0/0x14d0
      [  201.039533]  worker_thread+0x86/0xb50
      [  201.039537]  kthread+0x2e9/0x3a0
      [  201.039541]  ret_from_fork+0x3a/0x50
      [  201.039543]
      [  201.039546] Freed by task 0:
      [  201.039549] (stack is not available)
      [  201.039551]
      [  201.039555] The buggy address belongs to the object at ffff88076738c1a8
                                       which belongs to the cache kmalloc-2048 of size 2048
      [  201.039559] The buggy address is located 1192 bytes inside of
                                       2048-byte region [ffff88076738c1a8, ffff88076738c9a8)
      [  201.039563] The buggy address belongs to the page:
      [  201.039567] page:ffffea001d9ce200 count:1 mapcount:0 mapping:ffff88084000d0c0 index:0x0 compound_mapcount: 0
      [  201.039573] flags: 0x8000000000008100(slab|head)
      [  201.039578] raw: 8000000000008100 ffffea001da3be08 ffffea001da25a08 ffff88084000d0c0
      [  201.039582] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
      [  201.039585] page dumped because: kasan: bad access detected
      [  201.039588]
      [  201.039591] Memory state around the buggy address:
      [  201.039594]  ffff88076738c500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  201.039598]  ffff88076738c580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  201.039601] >ffff88076738c600: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
      [  201.039604]                                                  ^
      [  201.039607]  ffff88076738c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  201.039611]  ffff88076738c700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  201.039613] ==================================================================
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <karolherbst@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      37afe55b
    • drm/nouveau: Use drm_connector_list_iter_* for iterating connectors · 22b76bbe
      Authored by Lyude Paul
      Every codepath in nouveau that loops through the connector list
      currently does so using the old method, which is prone to race
      conditions from MST connectors being created and destroyed. This has
      been causing a multitude of problems, including memory corruption from
      trying to access connectors that have already been freed!
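
      The safe pattern looks roughly like this (sketch):

        #include <drm/drm_connector.h>

        static void example_for_each_connector(struct drm_device *dev)
        {
                struct drm_connector_list_iter conn_iter;
                struct drm_connector *connector;

                drm_connector_list_iter_begin(dev, &conn_iter);
                drm_for_each_connector_iter(connector, &conn_iter) {
                        /* connector is safe to use for the duration of this
                         * iteration, even as MST connectors come and go */
                }
                drm_connector_list_iter_end(&conn_iter);
        }
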
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Karol Herbst <karolherbst@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      22b76bbe
  13. 18 May 2018 · 1 commit
  14. 27 Apr 2018 · 1 commit
  15. 08 Dec 2017 · 1 commit
  16. 22 Aug 2017 · 1 commit
  17. 24 Jul 2017 · 1 commit
  18. 26 Jun 2017 · 1 commit
  19. 17 May 2017 · 1 commit
    • drm/nouveau: Fix drm poll_helper handling · 9a2eba33
      Authored by Peter Ujfalusi
      Commit cae9ff03 effectively disabled the drm poll_helper by checking
      the wrong flag to see if the driver should enable the poll or not:
      mode_config.poll_enabled is only set to true by poll_init, and it does
      not indicate whether the poll is enabled or not.
      nouveau_display_create() will initialize the poll and then disable it
      right away. After poll_init() the mode_config.poll_enabled will be true,
      but the poll itself is disabled.
      
      To avoid the race caused by calling poll_enable() from different paths,
      this patch enables the poll from one place, in
      nouveau_display_hpd_work().
      
      In case pm_runtime is disabled, we enable the poll once in
      nouveau_drm_load().
      
      Fixes: cae9ff03 ("drm/nouveau: Don't enabling polling twice on runtime resume")
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Reviewed-by: Lyude <lyude@redhat.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      9a2eba33
  20. 10 May 2017 · 3 commits
    • drm/vblank: drop the mode argument from drm_calc_vbltimestamp_from_scanoutpos · 1bf6ad62
      Authored by Daniel Vetter
      If we restrict this helper to only kms drivers (which is the case) we
      can look up the correct mode easily ourselves. But it's a bit tricky:
      
      - All legacy drivers look at crtc->hwmode. But that is updated already
        at the beginning of the modeset helper, which means when we disable
        a pipe. Hence the final timestamps might be a bit off. But since
        this is an existing bug I'm not going to change it, but just try to
        be bug-for-bug compatible with the current code. This only applies
        to radeon&amdgpu.
      
      - i915 tries to get it perfect by updating crtc->hwmode when the pipe
        is off (i.e. vblank->enabled = false).
      
      - All other atomic drivers look at crtc->state->adjusted_mode. Those
        that look at state->requested_mode simply don't adjust their mode,
        so it's the same. That has two problems: Accessing crtc->state from
        interrupt handling code is unsafe, and it's updated before we shut
        down the pipe. For nonblocking modesets it's even worse.
      
      For atomic drivers try to implement what i915 does. To do that we add
      a new hwmode field to the vblank structure, and update it from
      drm_calc_timestamping_constants(). For atomic drivers that's called
      from the right spot by the helper library already, so all fine. But
      for safety let's enforce that.
      
      For legacy drivers this function is only called at the end (oh the
      fun), which is broken, so again let's not bother and just stay
      bug-for-bug compatible.
      
      The benefit is that we can use drm_calc_vbltimestamp_from_scanoutpos
      directly to implement ->get_vblank_timestamp in every driver, deleting
      a lot of code.
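
      For a kms driver that already provides a scanout-position query, that
      ends up looking roughly like this (sketch against the drm_driver hooks
      of that era; example_get_scanout_position is a placeholder):

        static struct drm_driver example_driver = {
                /* ... */
                .get_scanout_position = example_get_scanout_position,
                .get_vblank_timestamp = drm_calc_vbltimestamp_from_scanoutpos,
        };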
      
      v2: Completely new approach, trying to mimick the i915 solution.
      
      v3: Fixup kerneldoc.
      
      v4: Drop the WARN_ON to check that the vblank is off, atomic helpers
      currently unconditionally call this. Recomputing the same stuff should
      be harmless.
      
      v5: Fix typos and move misplaced hunks to the right patches (Neil).
      
      v6: Undo hunk movement (kbuild).
      
      Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-msm@vger.kernel.org
      Cc: freedreno@lists.freedesktop.org
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
      Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170509140329.24114-4-daniel.vetter@ffwll.ch
      1bf6ad62
    • drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp · 3fcdcb27
      Authored by Daniel Vetter
      It's overkill to have a flag parameter which is essentially used just
      as a boolean. This takes care of core + adjusting drivers.
      
      Adjusting the scanout position callback is a bit harder, since radeon
      also supplies its own driver-private flags in there.
      
      v2: Fixup misplaced hunks (Neil).
      
      v3: kbuild says v1 was better ...
      
      Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-msm@vger.kernel.org
      Cc: freedreno@lists.freedesktop.org
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170509140329.24114-2-daniel.vetter@ffwll.ch
      3fcdcb27
    • drm/vblank: Switch drm_driver->get_vblank_timestamp to return a bool · d673c02c
      Authored by Daniel Vetter
      There's really no reason for anything more:
      - Calling this while the crtc vblank stuff isn't set up is a driver
        bug. Those places already DRM_ERROR.
      - Calling this when the crtc is off is either a driver bug (calling
        drm_crtc_handle_vblank at the wrong time) or a core bug (for
        anything else). Again, we DRM_ERROR.
      - EINVAL is checked at higher levels already, and if we'd use struct
        drm_crtc * instead of (dev, pipe) it would be real obvious that
        those are again core bugs.
      
      The only valid failure mode is crap hardware that couldn't sample a
      useful timestamp; in that case we ask the core to just grab a
      not-so-accurate timestamp. Bool is perfectly fine for that.
      
      v2: Also fix up the one caller, I lost that in the shuffling (Jani).
      
      v3: Fixup commit message (Neil).
      
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Cc: Eric Anholt <eric@anholt.net>
      Cc: Rob Clark <robdclark@gmail.com>
      Cc: linux-arm-msm@vger.kernel.org
      Cc: freedreno@lists.freedesktop.org
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
      Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170509140329.24114-1-daniel.vetter@ffwll.ch
      d673c02c
  21. 29 Apr 2017 · 1 commit
    • drm/nouveau/kms: Increase max retries in scanout position queries. · 60b95d70
      Authored by Mario Kleiner
      So far we only allowed for 1 retry and just failed the query
      - and thereby high precision vblank timestamping - if we did
      not get a reasonable result, as such a failure wasn't considered
      all too horrible. There are a few NVidia gpu models out there which
      may need a bit more than 1 retry to get a successful query result
      under some conditions.
      
      Since Linux 4.4 the update code for vblank counter and timestamp
      in drm_update_vblank_count() changed so that the implementation
      assumes that high precision vblank timestamping of a kms driver
      either consistently succeeds or consistently fails for a given
      video mode and encoder/connector combo. In other words, switching from
      success to failure or vice versa on a modeset or connector change is ok,
      but spurious temporary failures for a given setup can confuse the core
      code and potentially cause bad miscounting of vblanks and confusion
      or hangs in userspace clients which rely on vblank handling, e.g.,
      desktop compositors.
      
      Therefore change the max retry count to a larger number - more than
      any gpu known so far needs to succeed, but still low enough that
      these queries, which also happen in the vblank interrupt, remain
      fast enough not to be disastrously long if something goes badly
      wrong with them.
      
      As such sporadic retries happen only seldom even on affected gpus,
      this could mean a vblank irq takes a few dozen microseconds longer
      every few hours of uptime -- better than a desktop compositor
      randomly hanging every couple of hours or days of uptime in a
      hard-to-reproduce manner.
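
      Schematically (the retry constant and helper names are purely
      illustrative):

        #define MAX_SCANOUTPOS_RETRIES 4   /* previously effectively 1 retry */

        static bool example_get_scanoutpos(struct example_head *head,
                                           int *hpos, int *vpos)
        {
                int i;

                for (i = 0; i < MAX_SCANOUTPOS_RETRIES; i++) {
                        if (example_sample_scanoutpos(head, hpos, vpos))
                                return true;    /* consistent sample */
                }
                return false;   /* fall back to imprecise vblank timestamps */
        }
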
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
      60b95d70
  22. 29 Mar 2017 · 1 commit
  23. 27 Mar 2017 · 1 commit
  24. 28 Feb 2017 · 1 commit
  25. 17 Feb 2017 · 2 commits
  26. 27 Jan 2017 · 1 commit
    • drm/nouveau: Don't enabling polling twice on runtime resume · cae9ff03
      Authored by Lyude Paul
      As it turns out, on cards that actually have CRTCs on them we're already
      calling drm_kms_helper_poll_enable(drm_dev) from
      nouveau_display_resume() before we call it in
      nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
      enable polling twice, which results in a potential deadlock between the
      RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
      polling the second time while output_poll_execute is running and holding
      the mode_config lock. As such, make sure we only enable polling in
      nouveau_pmops_runtime_resume() if we need to.
      
      This fixes hangs observed on the ThinkPad W541
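
      The guard itself is small (a sketch of the idea; note that the later
      "drm/nouveau: Fix drm poll_helper handling" commit above points out
      that mode_config.poll_enabled only records that polling was
      initialised, which is why this check was eventually reworked):

        static void example_runtime_resume_tail(struct drm_device *drm_dev)
        {
                /* nouveau_display_resume() has already re-enabled polling on
                 * cards that have CRTCs; don't enable it a second time, or
                 * we can deadlock against output_poll_execute(). */
                if (!drm_dev->mode_config.poll_enabled)
                        drm_kms_helper_poll_enable(drm_dev);
        }
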
      Signed-off-by: Lyude <lyude@redhat.com>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: David Airlie <airlied@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      cae9ff03
  27. 18 Jan 2017 · 1 commit