- 14 4月, 2020 2 次提交
-
-
由 Aurabindo Pillai 提交于
Let format prefixes take care of printing the module name through pr_fmt and dev_fmt definitions. Signed-off-by: NAurabindo Pillai <mail@aurabindo.in> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
Vram lost counter is wrongly increased by two during baco reset. V2: assumed vram lost for mode1 reset on all ASICs Signed-off-by: NEvan Quan <evan.quan@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 09 4月, 2020 9 次提交
-
-
由 Hawking Zhang 提交于
add indirect access support to registers outside of mmio bar. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
all the register access through kiq is redirected to amdgpu_kiq_rreg/amdgpu_kiq_wreg Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
those are not needed anymore Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Hawking Zhang 提交于
the workaround is not needed for soc15 ASICs except for vega10. it is even not needed with latest vega10 vbios. Signed-off-by: NHawking Zhang <Hawking.Zhang@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Prike Liang 提交于
The system will be hang up during S3 suspend because of SMU is pending for GC not respose the register CP_HQD_ACTIVE access request.This issue root cause of accessing the GC register under enter GFX CGGPG and can be fixed by disable GFX CGPG before perform suspend. v2: Use disable the GFX CGPG instead of RLC safe mode guard. Signed-off-by: NPrike Liang <Prike.Liang@amd.com> Tested-by: NMengbing Wang <Mengbing.Wang@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Zhang 提交于
[PATCH 2/2] kfd_pre_reset will free mem_objs allocated by kfd_gtt_sa_allocate Without this change, sriov tdr code path will never free those allocated memories and get memory leak. Signed-off-by: NJack Zhang <Jack.Zhang1@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Zhang 提交于
This reverts commit 5161bba4311f in order to split it into two different patches, and this will make it easier to understand. [PATCH 1/2] porting to gfx10 from commit 1b0bfcff ("drm/amdgpu: Avoid destroy hqd when GPU is on reset") Originally, MEC is touched without GPU initialized first. Signed-off-by: NJack Zhang <Jack.Zhang1@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Nirmoy Das 提交于
Generate HW IP's sched_list in amdgpu_ring_init() instead of amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(), ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary. This patch also stores sched_list for all HW IPs in one big array in struct amdgpu_device which makes amdgpu_ctx_init_entity() much more leaner. v2: fix a coding style issue do not use drm hw_ip const to populate amdgpu_ring_type enum v3: remove ctx reference and move sched array and num_sched to a struct use num_scheds to detect uninitialized scheduler list v4: use array_index_nospec for user space controlled variables fix possible checkpatch.pl warnings Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Reviewed-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Zhang 提交于
kfd_pre_reset will free mem_objs allocated by kfd_gtt_sa_allocate Without this change, sriov tdr code path will never free those allocated memories and get memory leak. v2:add a bugfix for kiq ring test fail Signed-off-by: NJack Zhang <Jack.Zhang1@amd.com> Reviewed-by: NMonk Liu <monk.liu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 02 4月, 2020 7 次提交
-
-
由 Jiawei 提交于
extend compute lockup timeout to 60000 for SR-IOV. Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NJiawei <Jiawei.Gu@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
if host support new handshake we only need to enter fullaccess_mode in ip_init() part, otherwise we need to do it before reading vbios (becuase host prepares vbios for VF only after received REQ_GPU_INIT event under legacy handshake) Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
what: 1)move timtout setting before ip_early_init to reduce exclusive mode cost for SRIOV 2)move ip_discovery_init() to inside of amdgpu_discovery_reg_base_init() it is a prepare for the later upcoming patches. why: in later upcoming patches we would use a new mailbox event -- "req_gpu_init_data", which is a callback hooked in adev->virt.ops and this callback send a new event "REQ_GPU_INIT_DAT" to host to notify host to do some preparation like "IP discovery/vbios on the VF FB" and this callback must be: A) invoked after set_ip_block() because virt.ops is configured during set_ip_block() B) invoked before ip_discovery_init() becausen ip_discovery_init() need host side prepares everything in VF FB first. current place of ip_discovery_init() is before we can invoke callback of adev->virt.ops, thus we must move ip_discovery_init() to a place after the adev->virt.ops all settle done, and the perfect place is in amdgpu_discovery_reg_base_init() Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
by this new handshake host side can prepare vbios/ip-discovery and pf&vf exchange data upon recieving this request without stopping world switch. this way the world switch is less impacted by VF's exclusive mode request Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 John Clements 提交于
added flag to ras context to indicate if ras query functionality is ready Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Monk Liu 提交于
we need to move virt detection much earlier because: 1) HW team confirms us that RCC_IOV_FUNC_IDENTIFIER will always be at DE5 (dw) mmio offset from vega10, this way there is no need to implement detect_hw_virt() routine in each nbio/chip file. for VI SRIOV chip (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at 0x1503 2) we need to acknowledged we are SRIOV VF before we do IP discovery because the IP discovery content will be updated by host everytime after it recieved a new coming "REQ_GPU_INIT_DATA" request from guest (there will be patches for this new handshake soon). Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Kent Russell 提交于
Allow for reading of information like manufacturer, product number and serial number from the FRU chip. Report the serial number as the new sysfs file serial_number. Note that this only works on server cards, as consumer cards do not feature the FRU chip, which contains this information. v2: Add documentation to amdgpu.rst, add helper functions, rename functions for consistency, fix bad starting offset v3: Remove testing definitions Signed-off-by: NKent Russell <kent.russell@amd.com> Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 20 3月, 2020 1 次提交
-
-
由 John Clements 提交于
MMHub EDC becomes dirty after BACO reset EDC registers should be cleared early on in reset phase Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NJohn Clements <john.clements@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 17 3月, 2020 1 次提交
-
-
由 Monk Liu 提交于
what changed: 1)provide new implementation interface for the rlcg access path 2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op function can access reg that need RLCG path help now even debugfs's reg_op can used to dump wave. tested-by: NMonk Liu <monk.liu@amd.com> tested-by: NZhou pengju <pengju.zhou@amd.com> Signed-off-by: NZhou pengju <pengju.zhou@amd.com> Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Reviewed-by: NEmily Deng <Emily.Deng@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 3月, 2020 2 次提交
-
-
由 Evan Quan 提交于
This can fix the baco reset failure seen on Navi10. And this should be a low risk fix as the same sequence is already used for system suspend/resume. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Evan Quan 提交于
This can fix the baco reset failure seen on Navi10. And this should be a low risk fix as the same sequence is already used for system suspend/resume. Signed-off-by: NEvan Quan <evan.quan@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 05 3月, 2020 1 次提交
-
-
由 Monk Liu 提交于
1)for gfx IB test we shouldn't insert DE meta data 2)we should make sure IB test finished before we send event 3 to hypervisor otherwise the IDLE from event 3 will preempt IB test, which is not designed as a compatible structure for MCBP Signed-off-by: NMonk Liu <Monk.Liu@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 29 2月, 2020 1 次提交
-
-
由 Yintian Tao 提交于
drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: NYintian Tao <yttao@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 27 2月, 2020 6 次提交
-
-
由 Alex Deucher 提交于
We've moved the debugfs handling into a centralized place so we can remove the legacy load an unload callbacks. Tested-by: NThomas Zimmermann <tzimmermann@suse.de> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for firmware. Tested-by: NThomas Zimmermann <tzimmermann@suse.de> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for register access files. Tested-by: NThomas Zimmermann <tzimmermann@suse.de> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for gem. Tested-by: NThomas Zimmermann <tzimmermann@suse.de> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
to amdgpu_debugfs_fini. It will be used for other things in the future. Tested-by: NThomas Zimmermann <tzimmermann@suse.de> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Yong Zhao 提交于
Since emulators are slower, sometime some operations like flushing tlb through FM need more than twice the regular timout of 100ms, so increase the timeout to 1s on emulators. Signed-off-by: NYong Zhao <Yong.Zhao@amd.com> Reviewed-by: NHawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 13 2月, 2020 1 次提交
-
-
由 Rajneesh Bhardwaj 提交于
So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: NOak Zeng <oak.zeng@amd.com> Reviewed-by: NFelix Kuehling <felix.kuehling@amd.com> Signed-off-by: NRajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 08 2月, 2020 3 次提交
-
-
由 Christian König 提交于
This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Acked-by: NJonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Christian König 提交于
Only write the _HI register when necessary. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NFelix Kuehling <Felix.Kuehling@amd.com> Acked-by: NJonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Jack Zhang 提交于
For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: NJack Zhang <Jack.Zhang1@amd.com> Acked-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NEvan Quan <evan.quan@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 28 1月, 2020 2 次提交
-
-
由 Alex Deucher 提交于
Everything is in place. Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 Alex Deucher 提交于
Has been working fine for a while. Reviewed-by: NAndrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
- 25 1月, 2020 1 次提交
-
-
由 Chris Wilson 提交于
Since drm_global_mutex is a true global mutex across devices, we don't want to acquire it unless absolutely necessary. For maintaining the device local open_count, we can use atomic operations on the counter itself, except when making the transition to/from 0. Here, we tackle the easy portion of delaying acquiring the drm_global_mutex for the final release by using atomic_dec_and_mutex_lock(), leaving the global serialisation across the device opens. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: NThomas Hellström <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk
-
- 23 1月, 2020 3 次提交
-
-
由 Nirmoy Das 提交于
Better clean that up before some automation starts to complain about it Signed-off-by: NNirmoy Das <nirmoy.das@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 chen gong 提交于
Reading some registers by mmio will result in hang when GPU is in "gfxoff" state.This problem can be solved by GPU in "ring command packages" way. Signed-off-by: Nchen gong <curry.gong@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-
由 chen gong 提交于
Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: Nchen gong <curry.gong@amd.com> Reviewed-by: NAlex Deucher <alexander.deucher@amd.com> Reviewed-by: NHuang Rui <ray.huang@amd.com> Acked-by: NChristian König <christian.koenig@amd.com> Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
-