- 20 8月, 2019 3 次提交
-
-
由 Jason Gunthorpe 提交于
The sequence of mmu_notifier_unregister_no_release(), mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the free_notifier callback. As this is the last user of those APIs, converting it means we can drop them. Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.caReviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
When using mmu_notifer_unregister_no_release() the caller must ensure there is a SRCU synchronize before the mn memory is freed, otherwise use after free races are possible, for instance: CPU0 CPU1 invalidate_range_start hlist_for_each_entry_rcu(..) mmu_notifier_unregister_no_release(&p->mn) kfree(mn) if (mn->ops->invalidate_range_end) The error unwind in amdkfd misses the SRCU synchronization. amdkfd keeps the kfd_process around until the mm is released, so split the flow to fully initialize the kfd_process and register it for find_process, and with the notifier. Past this point the kfd_process does not need to be cleaned up as it is fully ready. The final failable step does a vm_mmap() and does not seem to impact the kfd_process global state. Since it also cannot be undone (and already has problems with undo if it internally fails), it has to be last. This way we don't have to try to unwind the mmu_notifier_register() and avoid the problem with the SRCU. Along the way this also fixes various other error unwind bugs in the flow. Fixes: 45102048 ("amdkfd: Add process queue manager module") Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.caReviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Jason Gunthorpe 提交于
This is a significant simplification, it eliminates all the remaining 'hmm' stuff in mm_struct, eliminates krefing along the critical notifier paths, and takes away all the ugly locking and abuse of page_table_lock. mmu_notifier_get() provides the single struct hmm per struct mm which eliminates mm->hmm. It also directly guarantees that no mmu_notifier op callback is callable while concurrent free is possible, this eliminates all the krefs inside the mmu_notifier callbacks. The remaining krefs in the range code were overly cautious, drivers are already not permitted to free the mirror while a range exists. Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.caReviewed-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NRalph Campbell <rcampbell@nvidia.com> Tested-by: NRalph Campbell <rcampbell@nvidia.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 08 8月, 2019 5 次提交
-
-
由 Christoph Hellwig 提交于
Make HMM_MIRROR an option that is selected by drivers wanting to use it instead of a user visible option as it is just a low-level implementation detail. Link: https://lore.kernel.org/r/20190806160554.14046-15-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Christoph Hellwig 提交于
All users pass PAGE_SIZE here, and if we wanted to support single entries for huge pages we should really just add a HMM_FAULT_HUGEPAGE flag instead that uses the huge page size instead of having the caller calculate that size once, just for the hmm code to verify it. Link: https://lore.kernel.org/r/20190806160554.14046-8-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Christoph Hellwig 提交于
The start, end and page_shift values are all saved in the range structure, so we might as well use that for argument passing. Link: https://lore.kernel.org/r/20190806160554.14046-7-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Christoph Hellwig 提交于
The list is used to add the range to another list as an entry in the core hmm code, and intended as a private member not exposed to drivers. There is no need to initialize it in a driver. Link: https://lore.kernel.org/r/20190806160554.14046-3-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Christoph Hellwig 提交于
hmm_range_fault can only return -EAGAIN if called with the HMM_FAULT_ALLOW_RETRY flag, which amdgpu never does. Remove the handling for the -EAGAIN case with its non-standard locking scheme. Link: https://lore.kernel.org/r/20190806160554.14046-2-hch@lst.deSigned-off-by: NChristoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 26 7月, 2019 2 次提交
-
-
由 Christoph Hellwig 提交于
This allows easier expansion to other flags, and also makes the callers a little easier to read. Link: https://lore.kernel.org/r/20190726005650.2566-4-rcampbell@nvidia.comSigned-off-by: NChristoph Hellwig <hch@lst.de> Signed-off-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
由 Ralph Campbell 提交于
The hmm_mirror_ops callback function sync_cpu_device_pagetables() passes a struct hmm_update which is a simplified version of struct mmu_notifier_range. This is unnecessary so replace hmm_update with mmu_notifier_range directly. Link: https://lore.kernel.org/r/20190726005650.2566-2-rcampbell@nvidia.comSigned-off-by: NRalph Campbell <rcampbell@nvidia.com> Reviewed: Christoph Hellwig <hch@lst.de> Reviewed-by: NJason Gunthorpe <jgg@mellanox.com> [jgg: white space tuning] Signed-off-by: NJason Gunthorpe <jgg@mellanox.com>
-
- 19 7月, 2019 3 次提交
-
-
由 hersen wu 提交于
[WHY] dc sw clock implementation of navi10 and raven are not exact the same. dcccg, dchub reference clock initialization is done after dc calls vbios dispcontroller_init table. for raven family, before dispcontroller_init is called by dc, the ref clk values are referred by sw clock implementation and program asic register using wrong values. this causes dchub pstate error. This need provide valid ref clk values. for navi10, since dispcontroller_init is not called, dchubbub_global_timer_enable = 0, hubbub2_get_dchub_ref_freq will hit aeert. this need remove hubbub2_get_dchub_ref_freq from this location and move to dcn20_init_hw. [HOW] for all asic, initialize dccg, dchub ref clk with data from vbios firmware table by default. for raven asic family, use these data from vbios, for asic which support sw dccg component, like navi10, read ref clk by sw dccg functions and update the ref clk. Signed-off-by: Nhersen wu <hersenxs.wu@amd.com> Reviewed-by: NJun Lei <Jun.Lei@amd.com> Acked-by: NLeo Li <sunpeng.li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
The dpm sensor function already does this for us. This fixes the freq*_input files with the new SMU implementation. Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nicholas Kazlauskas 提交于
Workaround for now to avoid underflow. The uclk switch time should really be bumped up to 404, but doing so would expose p-state hang issues for higher bandwidth display configurations. Signed-off-by: NNicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: NLeo Li <sunpeng.li@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 18 7月, 2019 10 次提交
-
-
由 Joseph Greathouse 提交于
If we shut down a process without having destroyed its GWS-using queues, it is possible that GWS BO will still be in the process BO list during the gpuvm destruction. This list should be empty at that time, so we should remove the GWS allocation at the process uninit point if it is still around. Signed-off-by: NJoseph Greathouse <Joseph.Greathouse@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
The register debugfs interface was using the wrong bitmask for vmid selection for GFX_CNTL. Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Optimization for the socket power calculation is introduced. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NKenneth Feng <kenneth.feng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Do not halt driver loading on if_version mismatch. As our driver and FWs are backward compatible. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NKenneth Feng <kenneth.feng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
The interface was used in a confusing way. In profile mode scenario, the 2nd parameter of the interface was used in a different way from other scenarios. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
As the lock was already held on the entrance to smu_handle_task. - V2: lock in small granularity Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NKevin Wang <kevin1.wang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
No access before allocation. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NFeifei Xu <Feifei.Xu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Fix memory allocation failure check. - V2: fix one more similar error Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
Under memory pressure, buffer moves between RAM to VRAM can fail when there is no GTT space available. In those cases amdgpu_bo_move falls back to ttm_bo_move_memcpy, which seems to succeed, although it doesn't really support non-contiguous or invisible VRAM. This manifests as VM faults with corrupted page table entries in KFD eviction stress tests. Print some helpful messages when lack of GTT space is causing buffer moves to fail. Check that source and destination memory regions are supported by ttm_bo_move_memcpy before taking that fallback. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Not used anymore. Reviewed-by: NEvan Quan <evan.quan@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Noticed-by: NDave Airlie <airlied@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 7月, 2019 17 次提交
-
-
由 Nathan Chancellor 提交于
clang warns: drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:995:39: warning: implicit conversion from enumeration type 'PPCLK_e' to different enumeration type 'enum smu_clk_type' [-Wenum-conversion] ret = smu_get_current_clk_freq(smu, PPCLK_SOCCLK, &now); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1016:39: warning: implicit conversion from enumeration type 'PPCLK_e' to different enumeration type 'enum smu_clk_type' [-Wenum-conversion] ret = smu_get_current_clk_freq(smu, PPCLK_FCLK, &now); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../powerplay/vega20_ppt.c:1031:39: warning: implicit conversion from enumeration type 'PPCLK_e' to different enumeration type 'enum smu_clk_type' [-Wenum-conversion] ret = smu_get_current_clk_freq(smu, PPCLK_DCEFCLK, &now); ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~ The values are mapped one to one in vega20_get_smu_clk_index so just use the proper enums here. Fixes: 09676101 ("drm/amd/powerplay: support sysfs to get socclk, fclk, dcefclk") Link: https://github.com/ClangBuiltLinux/linux/issues/587Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NNathan Chancellor <natechancellor@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nicolai Hähnle 提交于
Prefetch mode 0 is not supported and can lead to hangs with certain very specific code patterns. Set a sound prefetch mode for all VMIDs rather than forcing all shaders to set the prefetch mode at the beginning. Reduce code duplication a bit while we're at it. Note that the 64-bit address mode enum and the retry all enum are both 0, so the only functional change is in the INITIAL_INST_PREFETCH field. Signed-off-by: NNicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: NMarek Olšák <marek.olsak@amd.com> Acked-by: NFelix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kenneth Feng 提交于
enable fw ctf, apcc dfll and gfx ss on navi10. fw ctf: when the fw ctf is triggered, the gfx and soc power domain are shut down. fan speed is boosted to the maximum. gfx ss: hardware feature, sanity check has been done. apcc dfll: can check the scoreboard in smu fw to confirm if it's enabled. no need to do further check since the gfx hardware control the frequency once a pcc signal comes. Signed-off-by: NKenneth Feng <kenneth.feng@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
The legacy navi10 sos binary will not carry on kdb image. the kdb_start_addr is actually the start address of sys_drv image and shouldn't be sent to psp bootloader. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
When starting a new mm_node, the page_offset becomes 0. Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Wang Xiayang 提交于
The simple_strtol() function is deprecated. kstrto[l,u32]() is the correct replacement as it can properly handle overflows. This patch replaces the deprecated simple_strtol() use introduced recently. As clk is of type uint32_t, we are safe to use kstrtou32(). It is also safe to return zero on string parsing error, similar to the case of returning zero if buf is empty in parse_clk(). Fixes: bb5a2bdf ("drm/amdgpu: support dpm level modification under virtualization v3") Signed-off-by: NWang Xiayang <xywang.sjtu@sjtu.edu.cn> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Fuqian Huang 提交于
kzalloc has already zeroed the memory during the allocation. So memset is unneeded. Reviewed-by: NEmil Velikov <emil.velikov@collabora.com> Signed-off-by: NFuqian Huang <huangfq.daxian@gmail.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Arnd Bergmann 提交于
It is annoying to have #warnings that trigger in randconfig builds like drivers/gpu/drm/amd/amdgpu/soc15.c:653:3: error: "Enable CONFIG_DRM_AMD_DC for display support on SOC15." drivers/gpu/drm/amd/amdgpu/nv.c:400:3: error: "Enable CONFIG_DRM_AMD_DC for display support on navi." Remove these and rely on the users to turn these on. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
The perf counter for Vega20 is 108, instead of 104 which it was on all previous GPUs, so add a check to use the appropriate value. Signed-off-by: NKent Russell <kent.russell@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
The ability to select GFX GRBM me/pipe/queue/vmid was missing from the gfx10 driver. This patch adds it. Used by the debugfs register interface to select GFX resources when read/writing registers. Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Tom St Denis 提交于
Add 5 bits to the offset for SRBM selection to handle VMIDs. Also update the select_me_pipe_q() callback to also select VMID. Signed-off-by: NTom St Denis <tom.stdenis@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
v2: change function name to smu_clk_dpm_is_enabled. add this helper function to check dpm clk feature is enabled. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
the save dpm level should be save previous dpm profile level, should not modified by get dpm level function. eg: default auto 1. auto -> standard ==> dpm_level = standard, save_dpm = auto. 2. standard -> auto ==> dpm_level = auto, save_dpm = standard. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
the unforce_dpm_levels doesn't need to check feature enablement. because the smu_get_dpm_freq_range function has check feature logic. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
1. the standard dpm is not support before. 2. use auto profile to adapt standard profile. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kevin Wang 提交于
1.miss socclk profile support when bringup. 2.add feature check for socclk. Signed-off-by: NKevin Wang <kevin1.wang@amd.com> Reviewed-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Felix Kuehling 提交于
Apply the same setting to SH_MEM_CONFIG and VM_CONTEXT1_CNTL. This makes the noretry param no longer KFD-specific. On GFX10 I'm not changing SH_MEM_CONFIG in this commit because GFX10 has different retry behaviour in the SQ and I don't have a way to test it at the moment. Suggested-by: NChristian König <Christian.Koenig@amd.com> CC: Philip Yang <Philip.Yang@amd.com> Signed-off-by: NFelix Kuehling <Felix.Kuehling@amd.com> Reviewed-by : Shaoyun.liu < Shaoyun.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-