1. 30 Dec, 2022 1 commit
  2. 26 Dec, 2022 1 commit
    • treewide: Convert del_timer*() to timer_shutdown*() · 292a089d
      Committed by Steven Rostedt (Google)
      Due to several bugs caused by timers being re-armed after they are
      shut down and just before they are freed, a new timer state called
      "shutdown" was added.  Once a timer is set to this state, it can no
      longer be re-armed.
      
      The following script was run to find all the trivial locations where
      del_timer() or del_timer_sync() is called in the same function that the
      object holding the timer is freed.  It also ignores any locations where
      the timer->function is modified between the del_timer*() and the free(),
      as that is not considered a "trivial" case.
      
      This was created by using a coccinelle script and the following
      commands:
      
          $ cat timer.cocci
          @@
          expression ptr, slab;
          identifier timer, rfield;
          @@
          (
          -       del_timer(&ptr->timer);
          +       timer_shutdown(&ptr->timer);
          |
          -       del_timer_sync(&ptr->timer);
          +       timer_shutdown_sync(&ptr->timer);
          )
            ... when strict
                when != ptr->timer
          (
                  kfree_rcu(ptr, rfield);
          |
                  kmem_cache_free(slab, ptr);
          |
                  kfree(ptr);
          )
      
          $ spatch timer.cocci . > /tmp/t.patch
          $ patch -p1 < /tmp/t.patch
      
      Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
      Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
      Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
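The difference between deleting a timer and shutting it down can be illustrated with a small user-space model. This is only a sketch of the behavior described in the commit message: `struct model_timer` and the `model_timer_*` functions are hypothetical names invented here, not kernel APIs, and the model ignores concurrency entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_timer {
    bool pending;   /* timer is armed and waiting to fire */
    bool shutdown;  /* once set, the timer can never be armed again */
};

/* Arm the timer. Returns false if the timer has been shut down. */
static bool model_timer_arm(struct model_timer *t)
{
    if (t->shutdown)
        return false;
    t->pending = true;
    return true;
}

/* Plain delete: deactivates the timer but leaves it re-armable. */
static void model_timer_del(struct model_timer *t)
{
    t->pending = false;
}

/* Shutdown: deactivates the timer and blocks every later arm attempt. */
static void model_timer_shutdown(struct model_timer *t)
{
    t->pending = false;
    t->shutdown = true;
}
```

In this model, an arm attempt that races in after `model_timer_del()` succeeds and leaves a pending timer on an object that is about to be freed, while the same attempt after `model_timer_shutdown()` is refused.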
  3. 22 Dec, 2022 2 commits
  4. 21 Dec, 2022 7 commits
    • drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0 · 8660495a
      Committed by Tim Huang
      MES is part of gfxoff, and MES suspend and resume are skipped for S0i3.
      But the mes_self_test call path is still in amdgpu_device_ip_late_init.
      It should also be skipped for s0ix, as no hardware re-initialization
      happened.
      
      Besides, mes_self_test will free the BO, which triggers a lot of warning
      messages while in the suspend state.
      
      [   81.656085] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu]
      [   81.679435] Call Trace:
      [   81.679726]  <TASK>
      [   81.679981]  amdgpu_mes_remove_hw_queue+0x17a/0x230 [amdgpu]
      [   81.680857]  amdgpu_mes_self_test+0x390/0x430 [amdgpu]
      [   81.681665]  mes_v11_0_late_init+0x37/0x50 [amdgpu]
      [   81.682423]  amdgpu_device_ip_late_init+0x53/0x280 [amdgpu]
      [   81.683257]  amdgpu_device_resume+0xae/0x2a0 [amdgpu]
      [   81.684043]  amdgpu_pmops_resume+0x37/0x70 [amdgpu]
      [   81.684818]  pci_pm_resume+0x5c/0xa0
      [   81.685247]  ? pci_pm_thaw+0x90/0x90
      [   81.685658]  dpm_run_callback+0x4e/0x160
      [   81.686110]  device_resume+0xad/0x210
      [   81.686529]  async_resume+0x1e/0x40
      [   81.686931]  async_run_entry_fn+0x33/0x120
      [   81.687405]  process_one_work+0x21d/0x3f0
      [   81.687869]  worker_thread+0x4a/0x3c0
      [   81.688293]  ? process_one_work+0x3f0/0x3f0
      [   81.688777]  kthread+0xff/0x130
      [   81.689157]  ? kthread_complete_and_exit+0x20/0x20
      [   81.689707]  ret_from_fork+0x22/0x30
      [   81.690118]  </TASK>
      [   81.690380] ---[ end trace 0000000000000000 ]---
      
      v2: make the comment clean and use adev->in_s0ix instead of
      adev->suspend
      Signed-off-by: Tim Huang <tim.huang@amd.com>
      Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org # 6.0, 6.1
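The guard described in this commit can be sketched as a user-space model. Only the `in_s0ix` flag name comes from the commit message; `struct model_adev`, `model_late_init()` and the counter are hypothetical stand-ins, and the real driver condition may check more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device structure. */
struct model_adev {
    bool in_s0ix;        /* true while resuming from S0ix */
    int  self_tests_run; /* counts how often the self test ran */
};

/* Late-init step: run the self test only when the hardware was
 * actually re-initialized, i.e. not on an S0ix resume. */
static void model_late_init(struct model_adev *adev)
{
    if (adev->in_s0ix)
        return;
    adev->self_tests_run++;
}
```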
    • drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics · e73fc71e
      Committed by Evan Quan
      For SMU 13.0.0 and 13.0.7, the output from PMFW is in percent. The driver
      needs to convert that into the correct PWM (255-based) value.
      Signed-off-by: Evan Quan <evan.quan@amd.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org # 6.0, 6.1
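The percent-to-PWM conversion mentioned above amounts to a simple rescaling. The helper below is a minimal sketch, not the driver's code: `percent_to_pwm()` is a hypothetical name, and both the clamping of out-of-range input and the round-down integer division are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scale a fan speed in percent (0..100) to an
 * 8-bit PWM duty value (0..255). Inputs above 100 are clamped.
 * Integer division rounds down. */
static uint32_t percent_to_pwm(uint32_t percent)
{
    if (percent > 100)
        percent = 100;
    return percent * 255 / 100;
}
```

With these assumptions, 0 percent maps to 0 and 100 percent maps to 255.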
    • drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34 · 272b9814
      Committed by Evan Quan
      To fit the latest PMFW and suppress the warning that appears on driver loading.
      Signed-off-by: Evan Quan <evan.quan@amd.com>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org # 6.0, 6.1
    • drm/amdgpu: skip MES for S0ix as well since it's part of GFX · afa6646b
      Committed by Alex Deucher
      It's also part of gfxoff.
      
      Cc: stable@vger.kernel.org # 6.0, 6.1
      Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amd/pm: avoid large variable on kernel stack · d118b18f
      Committed by Arnd Bergmann
      The activity_monitor_external[] array is too big to fit on the
      kernel stack, resulting in this warning with clang:
      
      drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.c:1438:12: error: stack frame size (1040) exceeds limit (1024) in 'smu_v13_0_7_get_power_profile_mode' [-Werror,-Wframe-larger-than]
      
      Use dynamic allocation instead. It should also be possible to
      have a single element here instead of the array, but this seems
      easier.
      
      v2: fix up argument to sizeof() (Alex)
      
      Fixes: 334682ae ("drm/amd/pm: enable workload type change on smu_v13_0_7")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
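The general technique of moving a large local array off the stack can be sketched in user-space C. This is an illustration only: `struct big_entry`, `ENTRY_COUNT` and `sum_first_bytes()` are hypothetical, the real activity-monitor type is not shown in the commit message, and kernel code would use the kernel allocators rather than `calloc()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large table entry. */
struct big_entry {
    unsigned char payload[256];
};

#define ENTRY_COUNT 8 /* assumed count, for illustration only */

/* Returns the sum of each entry's first payload byte,
 * or -1 if the allocation fails. */
static int sum_first_bytes(void)
{
    struct big_entry *table;
    int sum = 0;
    size_t i;

    /* Heap allocation keeps the 2 KiB table out of the stack frame.
     * sizeof(*table) is the size of one element; sizeof(table) would
     * only be the size of the pointer. */
    table = calloc(ENTRY_COUNT, sizeof(*table));
    if (!table)
        return -1;

    for (i = 0; i < ENTRY_COUNT; i++) {
        table[i].payload[0] = (unsigned char)i;
        sum += table[i].payload[0];
    }

    free(table);
    return sum;
}
```

The trade-off is an extra failure path: unlike a stack array, the allocation can fail, so the caller has to handle the error return.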
    • drm/amdkfd: Fix double release compute pasid · 1a799c4c
      Committed by Philip Yang
      If kfd_process_device_init_vm returns failure after the vm is converted
      to a compute vm and vm->pasid is set to the compute pasid, KFD will not
      take the pdd->drm_file reference. As a result, the drm close file handler
      may be called to release the compute pasid before the KFD process destroy
      worker releases the same pasid and sets vm->pasid to zero. This generates
      the WARNING backtrace and NULL pointer access below.
      
      Add the helper amdgpu_amdkfd_gpuvm_set_vm_pasid and call it as the last
      step of kfd_process_device_init_vm. This ensures the vm pasid is either
      the original pasid if acquiring the vm failed, or the compute pasid with
      the pdd->drm_file reference taken, which avoids releasing the same pasid
      twice.
      
       amdgpu: Failed to create process VM object
       ida_free called for id=32770 which is not allocated.
       WARNING: CPU: 57 PID: 72542 at ../lib/idr.c:522 ida_free+0x96/0x140
       RIP: 0010:ida_free+0x96/0x140
       Call Trace:
        amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
        amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
        drm_file_free.part.13+0x216/0x270 [drm]
        drm_close_helper.isra.14+0x60/0x70 [drm]
        drm_release+0x6e/0xf0 [drm]
        __fput+0xcc/0x280
        ____fput+0xe/0x20
        task_work_run+0x96/0xc0
        do_exit+0x3d0/0xc10
      
       BUG: kernel NULL pointer dereference, address: 0000000000000000
       RIP: 0010:ida_free+0x76/0x140
       Call Trace:
        amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu]
        amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu]
        drm_file_free.part.13+0x216/0x270 [drm]
        drm_close_helper.isra.14+0x60/0x70 [drm]
        drm_release+0x6e/0xf0 [drm]
        __fput+0xcc/0x280
        ____fput+0xe/0x20
        task_work_run+0x96/0xc0
        do_exit+0x3d0/0xc10
      Signed-off-by: Philip Yang <Philip.Yang@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
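The ordering idea in this fix, committing the new ID only as the final step so that a failed init never leaves two owners for one ID, can be modeled in a few lines of user-space C. Everything here is hypothetical (`id_alloc()`, `id_free()`, `struct model_vm`, `model_vm_init()`); it is a sketch of the pattern, not of the amdgpu code.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8

/* Hypothetical ID allocator that can detect a double release. */
static bool id_in_use[MAX_IDS];

/* Returns the lowest free ID, or -1 if none is left. */
static int id_alloc(void)
{
    int i;

    for (i = 0; i < MAX_IDS; i++) {
        if (!id_in_use[i]) {
            id_in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Returns false if the ID is not currently allocated. */
static bool id_free(int id)
{
    if (id < 0 || id >= MAX_IDS || !id_in_use[id])
        return false;
    id_in_use[id] = false;
    return true;
}

struct model_vm {
    int id; /* the ID this VM currently owns */
};

/* Switch the VM to new_id only after every fallible step succeeded.
 * On failure vm->id is untouched and the caller still owns new_id. */
static bool model_vm_init(struct model_vm *vm, int new_id, bool step_fails)
{
    if (step_fails)
        return false;
    vm->id = new_id; /* last step: ownership of new_id moves to the VM */
    return true;
}
```

Because the assignment happens last, a failed `model_vm_init()` leaves exactly one party responsible for each ID, and `id_free()` reports any second release instead of corrupting state.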
    • drm/amdkfd: Fix kfd_process_device_init_vm error handling · 29d48b87
      Committed by Philip Yang
      Only destroy the ib_mem and let the process cleanup worker free
      the outstanding BOs. Reset the pointer in the pdd->qpd structure to
      avoid a NULL pointer access in the process destroy worker.
      
       BUG: kernel NULL pointer dereference, address: 0000000000000010
       Call Trace:
        amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel+0x46/0xb0 [amdgpu]
        kfd_process_device_destroy_cwsr_dgpu+0x40/0x70 [amdgpu]
        kfd_process_destroy_pdds+0x71/0x190 [amdgpu]
        kfd_process_wq_release+0x2a2/0x3b0 [amdgpu]
        process_one_work+0x2a1/0x600
        worker_thread+0x39/0x3d0
      Signed-off-by: Philip Yang <Philip.Yang@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
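The error-handling pattern described here, releasing a resource on the failure path and resetting the pointer so a later cleanup pass skips it, can be sketched in user-space C. `struct model_dev`, `model_init()` and `model_cleanup()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device state holding one dynamically allocated buffer. */
struct model_dev {
    int *buf;
};

/* Later cleanup pass: releases buf only if it is still set.
 * Returns 1 if it freed something, 0 otherwise. */
static int model_cleanup(struct model_dev *d)
{
    if (!d->buf)
        return 0;
    free(d->buf);
    d->buf = NULL;
    return 1;
}

/* Init with an injectable late failure. Returns 0 on success, -1 on error. */
static int model_init(struct model_dev *d, int fail_late)
{
    d->buf = malloc(sizeof(*d->buf));
    if (!d->buf)
        return -1;

    if (fail_late) {
        free(d->buf);
        d->buf = NULL; /* reset so cleanup never touches freed memory */
        return -1;
    }
    return 0;
}
```

Without the reset in the failure path, `model_cleanup()` would see a stale non-NULL pointer and free the same buffer a second time.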
  5. 16 Dec, 2022 4 commits
  6. 15 Dec, 2022 6 commits
  7. 14 Dec, 2022 14 commits
  8. 10 Dec, 2022 5 commits