1. 05 Mar, 2020 8 commits
  2. 29 Feb, 2020 11 commits
    • drm/amdgpu/smu: Add message sending lock · eb696d04
      Matt Coffin committed
      This adds a message lock to the smu_send_smc_msg* implementations to
      protect against concurrent access to the mmu registers used to
      communicate with the SMU
      
      v2: Implement for smu_v12_0 as well
      
      v3: Add mutex_init for message_lock
      Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/powerplay: Remove deprecated smc_read_arg · ae458c7b
      Matt Coffin committed
      The new interface reads the argument in the call to send the message, so
      this is no longer needed, and shouldn't be used for concurrency safety
      reasons.
      Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/powerplay: Refactor SMU message handling for safety · 1c58267c
      Matt Coffin committed
      Move the responsibility for reading argument registers into the
      smu_send_smc_msg* implementations, so that adding a message-sending lock
      to protect the SMU registers will result in the lock still being held
      when the argument is read.
      
      v2: transition smu_v12_0, its ASICs, and vega20
      Signed-off-by: Matt Coffin <mcoffin13@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
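The reasoning behind the three SMU commits above (read the reply inside the send function so the lock covers the whole exchange) can be sketched in user-space C. This is only an illustration under stated assumptions: `fake_smu`, `fake_firmware_handle` and `fake_smu_send_msg_with_param` are hypothetical names, the "registers" are plain struct fields, and a C11 `atomic_flag` spinlock stands in for the kernel mutex that the commits describe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the SMU mailbox; not the real amdgpu layout. */
struct fake_smu {
    atomic_flag message_lock; /* stand-in for the kernel mutex */
    uint32_t msg_reg;         /* models the message register */
    uint32_t arg_reg;         /* models the argument register, reused for the reply */
};

static void fake_smu_init(struct fake_smu *smu)
{
    atomic_flag_clear(&smu->message_lock);
    smu->msg_reg = 0;
    smu->arg_reg = 0;
}

/* Pretend firmware (assumption): replies with param + 1 in the argument register. */
static void fake_firmware_handle(struct fake_smu *smu)
{
    smu->arg_reg = smu->arg_reg + 1;
}

/*
 * Send a message and, if read_arg is non-NULL, fetch the reply before
 * releasing the lock, so no other sender can overwrite arg_reg in between.
 */
static int fake_smu_send_msg_with_param(struct fake_smu *smu, uint32_t msg,
                                        uint32_t param, uint32_t *read_arg)
{
    assert(smu != NULL);

    /* Acquire: spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&smu->message_lock,
                                             memory_order_acquire))
        ;

    smu->arg_reg = param;
    smu->msg_reg = msg;
    fake_firmware_handle(smu);
    if (read_arg)
        *read_arg = smu->arg_reg;

    /* Release only after the reply has been copied out. */
    atomic_flag_clear_explicit(&smu->message_lock, memory_order_release);
    return 0;
}
```

With a separate read-argument call, a second thread could send its own message between the send and the read, and the first caller would see the wrong reply. Folding the read into the send function removes that window.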
    • drm/amdgpu/powerplay: nv1x, renoir copy dcn clock settings of watermark to smu during boot up · 2622e2ae
      Hersen Wu committed
      The dc to pplib interface has changed for navi1x and renoir.
      display_config_changed is no longer called by dc, so
      smu_write_watermarks_table is not executed for navi1x and renoir
      during boot up.
      
      Solution: call smu_write_watermarks_table right after dc passes the
      watermark clock settings to pplib.
      Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu: release drm_device after amdgpu_driver_unload_kms · 6c26d558
      Yintian Tao committed
      If we release drm_device before amdgpu_driver_unload_kms,
      then it will raise the error below. Therefore, we need to
      place the release after amdgpu_driver_unload_kms.
      [   43.055736] Memory manager not clean during takedown.
      [   43.055777] WARNING: CPU: 1 PID: 2807 at /build/linux-hwe-9KJ07q/linux-hwe-4.18.0/drivers/gpu/drm/drm_mm.c:913 drm_mm_takedown+0x24/0x30 [drm]
      [   43.055778] Modules linked in: amdgpu(OE-) amd_sched(OE) amdttm(OE) amdkcl(OE) amd_iommu_v2 drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt snd_hda_codec_generic nfit kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm ghash_clmulni_intel snd_seq_midi snd_seq_midi_event pcbc snd_rawmidi snd_seq snd_seq_device aesni_intel snd_timer joydev aes_x86_64 crypto_simd cryptd glue_helper snd soundcore input_leds mac_hid serio_raw qemu_fw_cfg binfmt_misc sch_fq_codel nfsd auth_rpcgss nfs_acl lockd grace sunrpc parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic floppy usbhid psmouse hid i2c_piix4 e1000 pata_acpi
      [   43.055819] CPU: 1 PID: 2807 Comm: modprobe Tainted: G           OE     4.18.0-15-generic #16~18.04.1-Ubuntu
      [   43.055820] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      [   43.055830] RIP: 0010:drm_mm_takedown+0x24/0x30 [drm]
      [   43.055831] Code: 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 c7 75 02 f3 c3 55 48 c7 c7 38 33 80 c0 48 89 e5 e8 1c 41 ec d0 <0f> 0b 5d c3 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
      [   43.055857] RSP: 0018:ffffae33c1393d28 EFLAGS: 00010286
      [   43.055859] RAX: 0000000000000000 RBX: ffff9651b4a29800 RCX: 0000000000000006
      [   43.055860] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9651bfc964b0
      [   43.055861] RBP: ffffae33c1393d28 R08: 00000000000002a6 R09: 0000000000000004
      [   43.055861] R10: ffffae33c1393d20 R11: 0000000000000001 R12: ffff9651ba6cb000
      [   43.055863] R13: ffff9651b7f40000 R14: ffffffffc0de3a10 R15: ffff9651ba5c6460
      [   43.055864] FS:  00007f1d3c08d540(0000) GS:ffff9651bfc80000(0000) knlGS:0000000000000000
      [   43.055865] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   43.055866] CR2: 00005630a5831640 CR3: 000000012e274004 CR4: 00000000003606e0
      [   43.055870] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   43.055871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   43.055871] Call Trace:
      [   43.055885]  drm_vma_offset_manager_destroy+0x1b/0x30 [drm]
      [   43.055894]  drm_gem_destroy+0x19/0x40 [drm]
      [   43.055903]  drm_dev_fini+0x7f/0x90 [drm]
      [   43.055911]  drm_dev_release+0x2b/0x40 [drm]
      [   43.055919]  drm_dev_unplug+0x64/0x80 [drm]
      [   43.055994]  amdgpu_pci_remove+0x39/0x70 [amdgpu]
      [   43.055998]  pci_device_remove+0x3e/0xc0
      [   43.056001]  device_release_driver_internal+0x18a/0x260
      [   43.056003]  driver_detach+0x3f/0x80
      [   43.056004]  bus_remove_driver+0x59/0xd0
      [   43.056006]  driver_unregister+0x2c/0x40
      [   43.056008]  pci_unregister_driver+0x22/0xa0
      [   43.056087]  amdgpu_exit+0x15/0x57c [amdgpu]
      [   43.056090]  __x64_sys_delete_module+0x146/0x280
      [   43.056094]  do_syscall_64+0x5a/0x120
      
      v2: put drm_dev_put after pci_set_drvdata
      Signed-off-by: Yintian Tao <yttao@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
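The ordering problem in this commit can be modeled in a few lines of user-space C. This is a simplified sketch, not the DRM API: `fake_dev`, `fake_driver_unload` and `fake_dev_put` are hypothetical, and the `warned_not_clean` flag merely stands in for the "Memory manager not clean during takedown." warning in the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device that owns a memory manager. */
struct fake_dev {
    int refcount;
    int live_allocations;  /* what driver unload is responsible for freeing */
    bool released;
    bool warned_not_clean; /* models the takedown warning */
};

/* Driver unload frees everything the driver allocated. */
static void fake_driver_unload(struct fake_dev *dev)
{
    dev->live_allocations = 0;
}

/* Drop a reference; the final put tears down the memory manager. */
static void fake_dev_put(struct fake_dev *dev)
{
    assert(dev->refcount > 0);
    if (--dev->refcount > 0)
        return;
    if (dev->live_allocations != 0)
        dev->warned_not_clean = true; /* teardown with allocations still live */
    dev->released = true;
}

/* Order before the fix: release first, unload second. */
static void remove_wrong_order(struct fake_dev *dev)
{
    fake_dev_put(dev);
    fake_driver_unload(dev);
}

/* Order after the fix: unload first, release second. */
static void remove_fixed_order(struct fake_dev *dev)
{
    fake_driver_unload(dev);
    fake_dev_put(dev);
}
```

In the model, only `remove_wrong_order` sets `warned_not_clean`, because the final reference is dropped while allocations are still outstanding.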
    • drm/amdgpu: no need to clean debugfs at amdgpu · d2790e10
      Yintian Tao committed
      drm_minor_unregister will invoke drm_debugfs_cleanup
      to clean up all the child nodes under the primary minor node.
      We don't need to invoke amdgpu_debugfs_fini and
      amdgpu_debugfs_regs_cleanup to clean them again.
      Otherwise, it will trigger a NULL pointer dereference like the one below.
      [   45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
      [   45.047256] PGD 0 P4D 0
      [   45.047713] Oops: 0002 [#1] SMP PTI
      [   45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G        W  OE     4.18.0-15-generic #16~18.04.1-Ubuntu
      [   45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
      [   45.050651] RIP: 0010:down_write+0x1f/0x40
      [   45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01
      [   45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246
      [   45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814
      [   45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8
      [   45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00
      [   45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300
      [   45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860
      [   45.059221] FS:  00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000
      [   45.060809] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0
      [   45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   45.065897] Call Trace:
      [   45.066426]  debugfs_remove+0x36/0xa0
      [   45.067131]  amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu]
      [   45.068019]  amdgpu_debugfs_fini+0x2c/0x50 [amdgpu]
      [   45.068756]  amdgpu_pci_remove+0x49/0x70 [amdgpu]
      [   45.069439]  pci_device_remove+0x3e/0xc0
      [   45.070037]  device_release_driver_internal+0x18a/0x260
      [   45.070842]  driver_detach+0x3f/0x80
      [   45.071325]  bus_remove_driver+0x59/0xd0
      [   45.071850]  driver_unregister+0x2c/0x40
      [   45.072377]  pci_unregister_driver+0x22/0xa0
      [   45.073043]  amdgpu_exit+0x15/0x57c [amdgpu]
      [   45.073683]  __x64_sys_delete_module+0x146/0x280
      [   45.074369]  do_syscall_64+0x5a/0x120
      [   45.074916]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      v2: remove all debugfs cleanup/fini code at amdgpu
      v3: squash in unused variable removal
      Signed-off-by: Yintian Tao <yttao@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
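The ownership rule behind this commit (the core layer removes the whole subtree, so the driver must not remove its entries again) can be illustrated with a toy tree in user-space C. Everything here is hypothetical and is not the debugfs API. A counter replaces the real crash so the double removal can be observed safely.

```c
#include <assert.h>

#define FAKE_MAX_CHILDREN 4

/* Hypothetical directory-tree node. */
struct fake_node {
    struct fake_node *children[FAKE_MAX_CHILDREN];
    int nchildren;
    int remove_calls; /* anything above 1 models a double removal */
};

static void fake_add_child(struct fake_node *parent, struct fake_node *child)
{
    assert(parent->nchildren < FAKE_MAX_CHILDREN);
    parent->children[parent->nchildren++] = child;
}

/* Remove a node and everything beneath it, as the core layer would. */
static void fake_remove_recursive(struct fake_node *node)
{
    for (int i = 0; i < node->nchildren; i++)
        fake_remove_recursive(node->children[i]);
    node->nchildren = 0;
    node->remove_calls++;
}
```

Calling `fake_remove_recursive` on the root already visits every child once. A later driver-side call on a child is the redundant second removal that the commit deletes.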
    • drm/amdgpu: Initialize SPM_VMID with 0xf (v2) · 460c484f
      Jacob He committed
      SPM_VMID is a global resource; SPM accesses video memory according to
      SPM_VMID. The initial value of SPM_VMID is 0, which is used by the kernel.
      That means the UMD can overwrite the memory of VMID0 by enabling SPM, which
      is really dangerous.
      
      Initialize SPM_VMID with 0xf so that, at worst, it interferes with other
      user mode processes.
      
      v2: squash in indentation fix
      Signed-off-by: Jacob He <jacob.he@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
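Moving a VMID field from its reset value of 0 to 0xf comes down to a masked read-modify-write. The sketch below assumes a hypothetical 4-bit field in bits [3:0] of a 32-bit register. The real register name, field position and write path in amdgpu may differ, so the `FAKE_` names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout (assumption): 4-bit VMID in bits [3:0]. */
#define FAKE_SPM_VMID_SHIFT 0u
#define FAKE_SPM_VMID_MASK  (0xfu << FAKE_SPM_VMID_SHIFT)
#define FAKE_KERNEL_VMID    0x0u /* reset value, shared with the kernel */
#define FAKE_SAFE_VMID      0xfu /* value chosen by the commit */

/* Replace only the VMID field, leaving the other bits untouched. */
static uint32_t fake_set_spm_vmid(uint32_t reg, uint32_t vmid)
{
    assert(vmid <= 0xfu);
    reg &= ~FAKE_SPM_VMID_MASK;
    reg |= (vmid << FAKE_SPM_VMID_SHIFT) & FAKE_SPM_VMID_MASK;
    return reg;
}

static uint32_t fake_get_spm_vmid(uint32_t reg)
{
    return (reg & FAKE_SPM_VMID_MASK) >> FAKE_SPM_VMID_SHIFT;
}
```

At initialization the driver would call something like `fake_set_spm_vmid(reg, FAKE_SAFE_VMID)`, so that enabling SPM from user mode no longer targets the kernel's VMID 0 by default.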
    • drm/amdgpu/sriov: Use kiq to copy the gpu clock · 89510a27
      Emily Deng committed
      For vega10 SR-IOV, the register is blocked; use the
      copy data command to fix the issue.
      
      v2: Rename amdgpu_kiq_read_clock to gfx_v9_0_kiq_read_clock.
      Signed-off-by: Emily Deng <Emily.Deng@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdkfd: change SDMA MQD memory type · f2cc50ce
      Eric Huang committed
      The SDMA MQD memory type is NC, which causes MQD data to be overwritten
      accidentally by an old stale cache line. Changing it to the UC
      default for GART will fix the issue.
      
      The mqd_gfx9 parameter is meant for control stacks that are
      allocated together with user mode queue MQDs. Setting
      mqd_gfx9 to true maps the control stack pages as NC.
      Here it was accidentally applied to SDMA MQDs,
      which are allocated together with the HIQ MQD. Setting
      the mqd_gfx9 to false avoids that.
      Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
      Acked-by: Yong Zhao <Yong.Zhao@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdkfd: Make get_tile_config() generic · fd7d08ba
      Yong Zhao committed
      Given that we can query all the ASIC-specific information from amdgpu_gfx_config,
      we can make get_tile_config() generic.
      Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
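The idea of a generic get_tile_config() can be sketched as one helper that copies values from a shared configuration structure instead of reading per-ASIC registers. The structures below are hypothetical and trimmed down. Only `num_banks` and `num_ranks` are named in this log, and `gb_addr_config` is included purely as an illustrative extra field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical common config, filled in once per ASIC at init time. */
struct fake_gfx_config {
    uint32_t gb_addr_config; /* illustrative field (assumption) */
    uint32_t num_banks;
    uint32_t num_ranks;
};

/* Hypothetical result structure handed to the compute side. */
struct fake_tile_config {
    uint32_t gb_addr_config;
    uint32_t num_banks;
    uint32_t num_ranks;
};

/* One ASIC-independent helper: copy what the common config already holds. */
static void fake_get_tile_config(const struct fake_gfx_config *gfx,
                                 struct fake_tile_config *out)
{
    assert(gfx != NULL && out != NULL);
    out->gb_addr_config = gfx->gb_addr_config;
    out->num_banks = gfx->num_banks;
    out->num_ranks = gfx->num_ranks;
}
```

Because the per-ASIC differences live in the data rather than in the function, a single implementation can serve every generation that populates the common structure.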
    • drm/amdgpu: Add num_banks and num_ranks to gfx config structure · 94b5c215
      Yong Zhao committed
      The two members will be used by KFD later.
      Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  3. 28 Feb, 2020 2 commits
  4. 27 Feb, 2020 19 commits