Commits · f35751b87034f0c2d11e60cdfb0179c4f1a7e296 · openeuler / Kernel

16 Mar, 2018 11 commits

drm/amdkfd: Allocate CWSR trap handler memory for dGPUs · f35751b8

Committed by Felix Kuehling on Mar 15, 2018

Add helpers for allocating GPUVM memory in kernel mode and use them
to allocate memory for the CWSR trap handler.

v2: Use dev instead of pdd->dev in kfd_process_free_gpuvm
v3:
* Cleaned up and simplified kfd_process_alloc_gpuvm
* Moved allocation for dGPU to kfd_process_device_init_vm
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Add per-process IDR for buffer handles · 52b29d73

Committed by Felix Kuehling on Mar 15, 2018

Also used for cleaning up on process termination.

v2: Refactored cleanup on process termination
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Aperture setup for dGPUs · d01994c2

Committed by Felix Kuehling on Mar 15, 2018

Set up the GPUVM aperture for SVM (shared virtual memory) that allows
sharing a part of virtual address space between GPUs and CPUs.

Report the size of the GPUVM aperture that is supported by KGD accurately.

The low part of the GPUVM aperture is reserved for kernel use. This is
for kernel-allocated buffers that are only accessed on the GPU:
- CWSR trap handler
- IB for submitting commands in user-mode context from kernel mode
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Remove limit on number of GPUs · c7bcbfa4

Committed by Felix Kuehling on Mar 15, 2018

Currently the number of GPUs is limited by aperture placement options
available on GFX7 and GFX8 hardware. This limitation is not necessary.
Scratch and LDS represent per-work-item and per-work-group storage
respectively. Different work-items and work-groups use the same virtual
address to access their own data. Work running on different GPUs is by
definition in different work-groups (different dispatches, in fact).
That means the same virtual addresses can be used for these apertures
on different GPUs.

Add a new AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl that removes the
artificial limitation on the number of GPUs that can be supported. The
new ioctl allows user mode to query the number of GPUs to allocate
enough memory for all GPUs to be reported.
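
The query-then-allocate pattern described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct aperture_info`, `fake_get_apertures` and `get_all_apertures` are hypothetical names, the stand-in function replaces the real ioctl call, and the struct is not the KFD uapi layout.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-GPU record; NOT the real KFD uapi structure. */
struct aperture_info {
    uint32_t gpu_id;
    uint64_t lds_base;
    uint64_t scratch_base;
};

/* Hypothetical stand-in for the ioctl: pretend the system has 3 GPUs. */
enum { FAKE_GPU_COUNT = 3 };

/*
 * If *count is 0, only report how many entries are available.
 * Otherwise fill at most *count entries and report how many were written.
 */
static int fake_get_apertures(struct aperture_info *out, uint32_t *count)
{
    if (*count == 0) {
        *count = FAKE_GPU_COUNT;
        return 0;
    }
    if (out == NULL)
        return -1;

    uint32_t n = *count < FAKE_GPU_COUNT ? *count : FAKE_GPU_COUNT;
    for (uint32_t i = 0; i < n; i++) {
        out[i].gpu_id = 100 + i;
        /* Made-up addresses; the same values are reused on every GPU. */
        out[i].lds_base = 0x1000000000000ull;
        out[i].scratch_base = 0x2000000000000ull;
    }
    *count = n;
    return 0;
}

/* Query the count, allocate, then fetch. The caller frees the result. */
static struct aperture_info *get_all_apertures(uint32_t *count_out)
{
    uint32_t count = 0;

    if (fake_get_apertures(NULL, &count) != 0 || count == 0)
        return NULL;

    struct aperture_info *buf = calloc(count, sizeof(*buf));
    if (buf == NULL)
        return NULL;

    if (fake_get_apertures(buf, &count) != 0) {
        free(buf);
        return NULL;
    }
    *count_out = count;
    return buf;
}
```

Because the caller sizes the buffer from the first query, no fixed upper bound on the number of GPUs is baked into the interface.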

This deprecates AMDKFD_IOC_GET_PROCESS_APERTURES.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Populate DRM render device minor · 7c9b7171

Committed by Oak Zeng on Mar 15, 2018

Populate DRM render device minor in kfd topology
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: Create KFD VMs on demand · b84394e2

Committed by Felix Kuehling on Mar 15, 2018

Instead of creating all VMs on process creation, create them when
a process is bound to a device. This will later allow registering
an existing VM from a DRM render node FD at runtime, before the
process is bound to the device. This way the render node VM can be
used for KFD instead of creating our own redundant VM.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Add kfd2kgd interface to acquire an existing VM · ede0dd86

Committed by Felix Kuehling on Mar 15, 2018

This allows acquiring an existing VM from a render node FD to use it
for a compute process.

Such VMs get destroyed when the original file descriptor is released.
Added a callback from amdgpu_vm_fini to handle KFD VM destruction
correctly in this case.

v2:
* Removed vm->vm_context check in amdgpu_amdkfd_gpuvm_destroy_cb,
  check vm->process_info earlier instead
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Add helper to turn an existing VM into a compute VM · b236fa1d

Committed by Felix Kuehling on Mar 15, 2018

v2: Removed updating and checking of vm->vm_context
v3: Enable amdgpu_vm_clear_bo in amdgpu_vm_make_compute
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Fix initial validation of PD BO for KFD VMs · 3486625b

Committed by Felix Kuehling on Mar 15, 2018

Make sure the PD BO is valid and attach the eviction fence during VM
creation. This ensures that the pd_phys_address is actually valid
and an eviction that would invalidate it triggers a KFD process
eviction like it should.

v2: Use uninterruptible waiting in initial PD validation
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm · 5b21d3e5

Committed by Felix Kuehling on Mar 15, 2018

Remove struct amdkfd_vm and move the fields into struct amdgpu_vm.
This will allow turning a VM created by a DRM render node into a
KFD VM.

v2: Removed vm_context field
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

drm/amdkfd: fix uninitialized variable use · 48a44387

Committed by Arnd Bergmann on Mar 15, 2018

When CONFIG_ACPI is disabled, we never initialize the acpi_table
structure in kfd_create_crat_image_virtual:

drivers/gpu/drm/amd/amdkfd/kfd_crat.c: In function 'kfd_create_crat_image_virtual':
drivers/gpu/drm/amd/amdkfd/kfd_crat.c:888:40: error: 'acpi_table' may be used uninitialized in this function [-Werror=maybe-uninitialized]

The undefined behavior also happens for any other acpi_get_table()
failure, but then the compiler can't warn about it.

This adds an error check that prevents the structure from
being used in error, avoiding both the undefined behavior and
the warning about it.
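
The shape of such an error check can be sketched in userspace C. This is not the actual kfd_crat.c change: `lookup_table` is a hypothetical stand-in for acpi_get_table(), and `struct table_header` is invented for the example.

```c
#include <stddef.h>

/* Invented table header type for the illustration. */
struct table_header {
    unsigned int length;
};

/* Hypothetical lookup standing in for acpi_get_table(); returns 0 on success. */
static int lookup_table(int available, struct table_header **out)
{
    static struct table_header table = { .length = 64 };

    if (!available)
        return -1;      /* *out is deliberately left untouched on failure */
    *out = &table;
    return 0;
}

/* Returns the table length, or 0 if the table could not be obtained. */
static unsigned int table_length(int available)
{
    struct table_header *hdr;   /* has no defined value until the lookup succeeds */

    if (lookup_table(available, &hdr) != 0)
        return 0;               /* bail out before hdr is ever read */

    return hdr->length;
}
```

Checking the return value before the first read covers every failure path of the lookup, not just the configuration the compiler happened to warn about.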

Fixes: 520b8fb7 ("drm/amdkfd: Add topology support for CPUs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

15 Mar, 2018 1 commit

drm/amdkfd: add missing include of mm.h · 7420f482

Committed by Oded Gabbay on Mar 15, 2018

This patch fixes the kernel build with ARCH=frv.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

23 Mar, 2018 9 commits

drm/amd/pp: clean header file hwmgr.h · 09695ad7

Committed by Rex Zhu on Mar 22, 2018

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pp: use mlck_table.count for array loop index limit · 5b293355

Committed by Colin Ian King on Mar 21, 2018

v2: use temporaries to trivially reduce object size.

The for-loops process data in the mclk_table but use sclk_table.count
as the loop index limit.  I believe these are cut-n-paste errors from
the previous almost identical loops as indicated by static analysis.
Fix these.
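
A minimal sketch of this class of bug, assuming an invented `struct dpm_table` and a hypothetical `copy_tables` helper rather than the real smu7 code, looks like this:

```c
#include <stddef.h>

#define MAX_LEVELS 8

/* Invented table type for the illustration; not the powerplay structure. */
struct dpm_table {
    size_t count;
    unsigned int value[MAX_LEVELS];
};

/* Copy each table, using that table's own count as the loop limit. */
static void copy_tables(const struct dpm_table *sclk,
                        const struct dpm_table *mclk,
                        unsigned int *sclk_out, unsigned int *mclk_out)
{
    for (size_t i = 0; i < sclk->count; i++)
        sclk_out[i] = sclk->value[i];

    /* The copy-paste form of this loop would still test sclk->count here. */
    for (size_t i = 0; i < mclk->count; i++)
        mclk_out[i] = mclk->value[i];
}
```

When the two tables have different lengths, the wrong limit either drops trailing entries or reads past the valid ones, which is why the loop bound has to follow the table being indexed.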

Detected by CoverityScan, CID#1466001 ("Copy-paste error")

Fixes: 5d97cf39 ("drm/amd/pp: Add and initialize OD_dpm_table for CI/VI.")
Fixes: 5e4d4fbe ("drm/amd/pp: Implement edit_dpm_table on smu7")
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add an ATPX quirk for hybrid laptop · 13b40935

Committed by Alex Deucher on Mar 21, 2018

_PR3 doesn't seem to work properly, use ATPX instead.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104064
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: fix spelling mistake: "asssert" -> "assert" · 36b3f84a

Committed by Colin Ian King on Mar 22, 2018

Trivial fix to spelling mistake in pr_err error message text
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pp: Add new asic support in pp_psm.c · 8ebde09b

Committed by Rex Zhu on Mar 21, 2018

On new ASICs (Vega12) there is no power state management in the driver,
so there is no need to implement the related callback functions.
Also add some ps checks in pp_psm.c.

Revert "drm/amd/powerplay: add new pp_psm infrastructure for vega12 (v2)"
This reverts commit 7d1a63f3aa331b853e41f92d0e7890ed31de8c13.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pp: Clean up powerplay code on Vega12 · bbfcc8af

Committed by Rex Zhu on Mar 21, 2018

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pp: Add smu irq handlers for legacy asics · 031ec948

Committed by Rex Zhu on Mar 21, 2018

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pp: Fix set wrong temperature range on smu7 · 3c796843

Committed by Rex Zhu on Mar 21, 2018

Fix the issue where the thermal IRQ was always triggered
because the GPU was detected as being under the temperature range.

The low temperature in the default thermal policy
was set to -273, so an int type is needed for the low temperature.
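
Why the type matters can be shown with a small standalone comparison. This is an illustrative sketch, not the smu7 driver code; the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower bound stored as unsigned: -273 wraps around to 4294967023. */
static bool below_range_unsigned(int32_t temp, int32_t low)
{
    uint32_t low_u = (uint32_t)low;

    return (uint32_t)temp < low_u;
}

/* Same check with a signed lower bound. */
static bool below_range_signed(int32_t temp, int32_t low)
{
    return temp < low;
}
```

With a lower limit of -273, the unsigned variant reports an ordinary reading such as 45 as below range, which matches the symptom of an always-firing thermal interrupt; the signed variant does not.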
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Don't change preferred domian when fallback GTT v5 · cc15dfaa

Committed by Chunming Zhou on Mar 16, 2018

v2: add sanity checking
v3: make code open
v4: also handle visible to invisible fallback
v5: Since two fallback cases, re-use goto retry
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

22 Mar, 2018 19 commits

drm/vmwgfx: Bump version patchlevel and date · 43bfefed

Committed by Thomas Hellstrom on Mar 22, 2018

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: use monotonic event timestamps · 37efe80c

Committed by Arnd Bergmann on Jan 16, 2018

DRM_VMW_EVENT_FENCE_SIGNALED (struct drm_vmw_event_fence) and
DRM_EVENT_VBLANK (struct drm_event_vblank) pass timestamps in 32-bit
seconds/microseconds format.

As of commit c61eef72 ("drm: add support for monotonic vblank
timestamps"), other DRM drivers use monotonic times for drm_event_vblank,
but vmwgfx still uses CLOCK_REALTIME for both events, which suffers from
the y2038/y2106 overflow as well as time jumps.

For consistency, this changes vmwgfx to use ktime_get_ts64 as well,
which solves those problems and avoids the deprecated do_gettimeofday()
function.

This should be transparent to user space, as long as it doesn't
compare the time against the result of gettimeofday().
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Unpin the screen object backup buffer when not used · 20fb5a63

Committed by Thomas Hellstrom on Mar 22, 2018

We were relying on the pinned screen object backup buffer to be destroyed
when not used. But if we hold a copy of the atomic state, like when
hibernating, the backup buffer might not be destroyed since it's
refcounted by the atomic state. This causes us to hibernate with a
buffer pinned in VRAM.

Fix this by only having the buffer pinned when it is actually used by a
screen object.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>

drm/vmwgfx: Stricter count of legacy surface device resources · 89dc15b7

Committed by Thomas Hellstrom on Mar 22, 2018

Legacy surfaces were previously registered as device resources
when the driver resources were created. Since they are evictable, we instead
register them as device resources once they are created on the device,
just like for guest-backed surfaces. This has implications during
hibernation where we can't hibernate with device resources active.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>

drm/vmwgfx: Use kasprintf · 6073a092

Committed by Himanshu Jha on Mar 22, 2018

Use kasprintf instead of combination of kmalloc and sprintf. Also,
remove the local variables used for storing the string length as they
are not required now.
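
What an allocate-and-format helper does can be sketched in userspace C. This is an analog written for illustration, not the kernel's kasprintf() implementation; `alloc_sprintf` is a hypothetical name, and malloc() stands in for kernel allocation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace analog of kasprintf(): format into a freshly allocated buffer. */
static char *alloc_sprintf(const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf;
    int len;

    va_start(ap, fmt);
    va_copy(ap2, ap);

    /* First pass: measure the formatted length without writing anything. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    /* Second pass: allocate exactly enough and write the string. */
    buf = malloc((size_t)len + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    return buf;     /* caller frees; NULL on failure */
}
```

Since the helper measures the length itself, callers no longer need a local variable to precompute the string size before allocating.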
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Get rid of the device-private suspended member · 4e3e733b

Committed by Thomas Hellstrom on Mar 22, 2018

It was used to block fbdev dirty processing early. Replace it with an
unprotected check of the par->dirty.active field. While this might
race with the vmw_fb_off() function, we do a protected check later so
the race will at worst lead to grabbing and releasing a couple of locks.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>

drm/vmwgfx: Improve on hibernation · c3b9b165

Committed by Thomas Hellstrom on Mar 22, 2018

Make it possible to hibernate also with masters that don't switch VT at
hibernation time. We save and restore modesetting state unless fbdev is
active and enabled at hibernation time.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>

drm/vmwgfx: Avoid pinning fbdev framebuffers · bf833fd3

Committed by Thomas Hellstrom on Mar 22, 2018

fbdev framebuffers were previously pinned to be able to keep them mapped
across updates.

This commit introduces a mechanism that instead revalidates the map on
each update, keeping the map cached across updates. The cached map is torn
down if the underlying pages change, typically on buffer object moves and
swapouts.

This should be nicer to the system when we have resource contention.

Testing done: Basic fbdev functionality under Fedora 27.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>

drm/vmwgfx: Fix multiple command buffer context use · dc366364

Committed by Thomas Hellstrom on Mar 22, 2018

The start / stop and preempt commands don't honor the context argument
but rather act on all available contexts.

Also add detection for context 1 availability.

Note that currently there's no driver interface for submitting buffers
using the high-priority command queue (context 1).

Testing done:
Change the default context for command submission to 1 instead of 0,
verify basic desktop functionality including faulty command injection and
recovery.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>

drm/vmwgfx: Use the cpu blit utility for framebuffer to screen target blits · ef86cfee

Committed by Thomas Hellstrom on Jan 16, 2018

This blit was previously performed using two large vmaps, one of which
was torn down and remapped on each blit. Use the more resource-
conserving TTM cpu blit instead.

The blit is used in bounding-box computing mode which makes it possible
to minimize the bounding box used in host operations.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

drm/vmwgfx: Add a cpu blit utility that can be used for page-backed bos · 79273e1b

Committed by Thomas Hellstrom on Jan 16, 2018

The utility uses kmap_atomic() instead of vmapping the whole buffer
object. As a result there will be more book-keeping but on some
architectures this will help avoid exhausting vmalloc space and also
avoid expensive TLB flushes.

The blit utility also adds a provision to compute a bounding box of
changed content, which is very useful to optimize presentation speed
of ill-behaved applications that don't supply proper damage regions, and
for page-flips. The cost of computing the bounding box is not that
expensive when done in a cpu-blit utility like this.
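
The idea of computing a bounding box of changed content while copying can be sketched as follows. This is a simplified illustration, not the vmwgfx utility: it assumes tightly packed 32-bit pixels with identical dimensions for both buffers, and `struct bbox` and `blit_with_bbox` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open rectangle [x1, x2) x [y1, y2); empty when x1 >= x2. */
struct bbox {
    int x1, y1, x2, y2;
};

/*
 * Copy src into dst and return the bounding box of the pixels that
 * actually changed. Both buffers hold width * height packed pixels.
 */
static struct bbox blit_with_bbox(uint32_t *dst, const uint32_t *src,
                                  int width, int height)
{
    struct bbox box = { width, height, 0, 0 };  /* starts out empty */

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            size_t i = (size_t)y * (size_t)width + (size_t)x;

            if (dst[i] == src[i])
                continue;       /* unchanged pixel: box stays as it is */

            dst[i] = src[i];
            if (x < box.x1) box.x1 = x;
            if (y < box.y1) box.y1 = y;
            if (x + 1 > box.x2) box.x2 = x + 1;
            if (y + 1 > box.y2) box.y2 = y + 1;
        }
    }
    return box;
}
```

The comparison rides along with a copy that already touches every pixel, which is why the extra cost stays small; the resulting box can then limit the region passed to later host operations.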
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

drm/ttm: Export the ttm_k[un]map_atomic_prot API. · 9c11fcf1

Committed by Thomas Hellstrom on Jan 16, 2018

It will be used by vmwgfx cpu blit.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/ttm: Clean up kmap_atomic_prot selection code · 403c1826

Committed by Thomas Hellstrom on Jan 16, 2018

Use helpers to perform the kmap_atomic_prot() functionality to
a) Avoid in-function ifdefs that violate the kernel coding policy,
b) Facilitate exporting the functionality.

This commit should not change any functionality.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/etnaviv: bump HW job limit to 4 · 4ed75c3e

Committed by Lucas Stach on Mar 09, 2018

The current limit of 2 leads to some GPU idle times, as the usual
IRQ latency leads to up to 3 jobs getting signaled at once with some
standard workloads.

A larger HW job limit might lead to slightly worse QoS, but we accept
that to not sacrifice GPU throughput in the common case.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>

drm/vmwgfx: Cursor update fixes · 25db8754

Committed by Thomas Hellstrom on Jan 16, 2018

Use drm_plane_helper_check_update also for the cursor plane.
Some applications, like gdm on gnome shell, still use cursor front-buffer-like
rendering without notifying the kernel. We do need some kind of
notification, but work around this for now by updating the cursor image on
every cursor move.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

drm/vmwgfx: Send the correct nonblock option for atomic_commit · 904efd9e

Committed by Deepak Rawat on Jan 16, 2018

Page flips can be slow for vmwgfx in some cases, such as when a surface
copy to a different surface is needed or when waiting for IN_FENCE_FD.
Enable nonblocking commits for vmwgfx when userspace requests them.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Move the stdu vblank event to atomic function · ac3069e6

Committed by Deepak Rawat on Jan 16, 2018

The atomic ioctl can also send the same page flip flags as the legacy ioctl.
In those cases the vblank event also needs to be sent to userspace.

vmwgfx does not support flag DRM_MODE_PAGE_FLIP_ASYNC, so this flag is
never expected.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Move screen object page flip to atomic function · aa64b3f1

Committed by Deepak Rawat on Jan 16, 2018

In the screen object case, dmabuf_dirty/surface_dirty is moved to the
plane atomic update, so that page flips via the atomic ioctl also work.

vmwgfx does not support DRM_MODE_PAGE_FLIP_ASYNC, so this flag is never
expected.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

drm/vmwgfx: Remove drm_crtc_arm_vblank_event from atomic flush · 3cbe87fc

Committed by Deepak Rawat on Jan 16, 2018

The function drm_crtc_arm_vblank_event should be used by drivers
that have vblank interrupt support. vmwgfx does not have a
vblank interrupt.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>

openeuler / Kernel · last synced successfully more than 1 year ago
