1. 18 Aug 2017, 2 commits
  2. 16 Aug 2017, 8 commits
  3. 26 Jul 2017, 1 commit
  4. 14 Jul 2017, 15 commits
  5. 30 Jun 2017, 2 commits
  6. 20 Jun 2017, 1 commit
    • drm/amdgpu: Optimize mutex usage (v4) · 5ac55629
      Committed by Alex Xie
      In the original function amdgpu_bo_list_get, the wait
      for result->lock could be quite long while the mutex
      bo_list_lock was held. This could make other tasks
      wait on bo_list_lock for a long period.
      
      Secondly, this patch allows several tasks (readers of
      the idr) to proceed at the same time.
      
      v2: use rcu and kref (Dave Airlie and Christian König)
      v3: update v1 commit message (Michel Dänzer)
      v4: rebase on upstream (Alex Deucher)
      Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
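The locking change above can be sketched in userspace: take a reference on the object while the global lock is held briefly, drop the global lock, and only then wait on the object's own mutex. This is a minimal model of the kref half of the pattern; all names (table_lock, bo_list_get) are illustrative stand-ins for the driver's bo_list_lock, idr, and kref, not the kernel code itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct bo_list {
    atomic_int refcount;       /* stands in for a struct kref */
    pthread_mutex_t lock;      /* per-object lock, may be contended (result->lock) */
    int data;
};

#define TABLE_SIZE 16
static struct bo_list *table[TABLE_SIZE];                       /* stands in for the idr */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;  /* stands in for bo_list_lock */

/* Look up an object, holding table_lock only long enough to take a
 * reference; the potentially long wait on the per-object lock happens
 * after the global lock has been dropped. */
static struct bo_list *bo_list_get(int id)
{
    struct bo_list *list = NULL;

    pthread_mutex_lock(&table_lock);
    if (id >= 0 && id < TABLE_SIZE)
        list = table[id];
    if (list)
        atomic_fetch_add(&list->refcount, 1);   /* "kref_get" under the short lock */
    pthread_mutex_unlock(&table_lock);          /* global lock released early */

    if (list)
        pthread_mutex_lock(&list->lock);        /* long wait no longer blocks other lookups */
    return list;
}

static void bo_list_put(struct bo_list *list)
{
    pthread_mutex_unlock(&list->lock);
    if (atomic_fetch_sub(&list->refcount, 1) == 1)  /* "kref_put": last reference frees */
        free(list);
}
```

Other tasks calling bo_list_get on different ids now contend on table_lock only for the duration of the table lookup, which is the point of the patch.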
  7. 17 Jun 2017, 1 commit
    • amdgpu: use drm sync objects for shared semaphores (v6) · 660e8558
      Committed by Dave Airlie
      This creates a new command submission chunk for amdgpu
      to add in and out sync objects around the submission.
      
      Sync objects are managed via the drm syncobj ioctls.
      
      The command submission interface is enhanced with two new
      chunks, one for syncobj pre-submission dependencies and
      one for post-submission syncobj signalling; each chunk
      just takes a list of handles.
      
      This is based on work originally done by David Zhou at AMD,
      with input from Christian König on what things should look like.
      
      In theory VkFences could be backed with sync objects and
      just get passed into the cs as syncobj handles as well.
      
      NOTE: this interface addition needs a version bump to expose
      it to userspace.
      
      TODO: update to dep_sync when rebasing onto amdgpu master.
      (with this - r-b from Christian)
      
      v1.1: keep file reference on import.
      v2: move to using syncobjs
      v2.1: change some APIs to just use p pointer.
      v3: make more robust against CS failures, we now add the
      wait sems but only remove them once the CS job has been
      submitted.
      v4: rewrite names of API and base on new syncobj code.
      v5: move post deps earlier, rename some apis
      v6: lookup post deps earlier, and just replace fences
      in post deps stage (Christian)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
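The two-stage flow described above (look up and wait on the in-deps before submission, signal the out syncobjs only after the job is submitted) can be sketched as a small userspace model. The chunk id values and every function name here are illustrative, not the real uapi constants from amdgpu_drm.h, and a boolean array stands in for syncobj fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative chunk ids, not the uapi AMDGPU_CHUNK_ID_* values. */
enum chunk_id {
    CHUNK_ID_IB          = 0x01,
    CHUNK_ID_SYNCOBJ_IN  = 0x04,  /* pre-submission dependencies */
    CHUNK_ID_SYNCOBJ_OUT = 0x05,  /* post-submission signalling */
};

struct cs_chunk {
    enum chunk_id id;
    size_t count;
    const uint32_t *handles;  /* syncobj handles for IN/OUT chunks */
};

#define MAX_SYNCOBJ 8
static bool syncobj_signaled[MAX_SYNCOBJ];  /* stands in for syncobj fences */

/* Returns 0 on success, -1 if a pre-submission dependency is unmet.
 * All dependencies are checked before the job is "submitted", and the
 * out syncobjs are only signalled afterwards, mirroring the two stages
 * the commit message describes. */
static int cs_submit(const struct cs_chunk *chunks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (chunks[i].id != CHUNK_ID_SYNCOBJ_IN)
            continue;
        for (size_t j = 0; j < chunks[i].count; j++)
            if (!syncobj_signaled[chunks[i].handles[j]])
                return -1;  /* bail before touching any out syncobj */
    }

    /* ... the IB chunks would be handed to the hardware here ... */

    for (size_t i = 0; i < n; i++) {
        if (chunks[i].id != CHUNK_ID_SYNCOBJ_OUT)
            continue;
        for (size_t j = 0; j < chunks[i].count; j++)
            syncobj_signaled[chunks[i].handles[j]] = true;
    }
    return 0;
}
```

A submission whose in-handle is unsignalled fails without signalling its out-handles; once the dependency is signalled, resubmission succeeds and the out syncobjs become signalled for the next consumer.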
  8. 15 Jun 2017, 2 commits
  9. 09 Jun 2017, 2 commits
  10. 08 Jun 2017, 3 commits
  11. 07 Jun 2017, 1 commit
  12. 01 Jun 2017, 2 commits
    • drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4 · 795f2813
      Committed by Andres Rodriguez
      Use an LRU policy to map usermode rings to HW compute queues.
      
      Most compute clients use one queue, and usually the first queue
      available. This results in poor pipe/queue work distribution when
      multiple compute apps are running. In most cases pipe 0 queue 0 is
      the only queue that gets used.
      
      In order to better distribute work across multiple HW queues, we adopt
      a policy to map the usermode ring ids to the LRU HW queue.
      
      This fixes the case where the large majority of multi-app
      compute workloads shared the same HW queue, even though 7
      other queues were available.
      
      v2: use ring->funcs->type instead of ring->hw_ip
      v3: remove amdgpu_queue_mapper_funcs
      v4: change ring_lru_list_lock to spinlock, grab only once in lru_get()
      Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
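The LRU policy above can be sketched in userspace: the front of a list holds the least recently used HW queue, picking a queue rotates it to the back, and the first use of a user ring id binds it to the LRU queue permanently. The single lock taken once per lookup mirrors the v4 note about ring_lru_list_lock; all names and sizes here are illustrative, not the driver's.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define NUM_QUEUES 8   /* e.g. the compute pipe/queue slots available */
#define MAX_RINGS  16  /* usermode ring id space */

static int lru[NUM_QUEUES];  /* front = least recently used HW queue */
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static int ring_map[MAX_RINGS];  /* user ring id -> HW queue, -1 = unmapped */

static void lru_init(void)
{
    for (int i = 0; i < NUM_QUEUES; i++)
        lru[i] = i;
    memset(ring_map, -1, sizeof(ring_map));  /* all-ones bytes == -1 per entry */
}

/* Pop the least recently used queue and rotate it to the back;
 * the lock is grabbed only once per lookup. */
static int lru_get(void)
{
    pthread_mutex_lock(&lru_lock);
    int q = lru[0];
    memmove(lru, lru + 1, (NUM_QUEUES - 1) * sizeof(int));
    lru[NUM_QUEUES - 1] = q;
    pthread_mutex_unlock(&lru_lock);
    return q;
}

/* First use of a user ring id binds it to the LRU queue; the binding
 * is then permanent, so submissions on that ring stay FIFO on one queue. */
static int map_user_ring(int user_ring)
{
    if (ring_map[user_ring] < 0)
        ring_map[user_ring] = lru_get();
    return ring_map[user_ring];
}
```

With this policy, two clients that each grab "the first ring" land on different HW queues instead of both piling onto pipe 0 queue 0.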
    • drm/amdgpu: untie user ring ids from kernel ring ids v6 · effd924d
      Committed by Andres Rodriguez
      Add amdgpu_queue_mgr, a mechanism that decouples usermode
      ring ids from the kernel's ring ids.
      
      The queue manager maintains a per-file descriptor map of user ring ids
      to amdgpu_ring pointers. Once a map is created it is permanent (this is
      required to maintain FIFO execution guarantees for a context's ring).
      
      Different queue map policies can be configured for each HW IP.
      Currently all HW IPs use the identity mapper, i.e. kernel ring id is
      equal to the user ring id.
      
      The purpose of this mechanism is to distribute the load across multiple
      queues more effectively for HW IPs that support multiple rings.
      Userspace clients are unable to check whether a specific resource is in
      use by a different client. Therefore, it is up to the kernel driver to
      make the optimal choice.
      
      v2: remove amdgpu_queue_mapper_funcs
      v3: made amdgpu_queue_mgr per context instead of per-fd
      v4: add context_put on error paths
      v5: rebase and include new IPs UVD_ENC & VCN_*
      v6: drop unused amdgpu_ring_is_valid_index (Alex)
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
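The queue manager described above can be sketched as a per-file-descriptor map from (HW IP, user ring id) to a kernel ring, populated by a per-IP policy and never rebound once created, which is what preserves FIFO execution for a context's ring. This sketch implements only the identity mapper the commit says all HW IPs currently use; every name here is illustrative, not the driver's.

```c
#include <assert.h>
#include <stddef.h>

enum hw_ip { HW_IP_GFX, HW_IP_COMPUTE, HW_IP_NUM };  /* illustrative IP list */
#define MAX_RINGS 8

struct ring {
    enum hw_ip ip;
    int idx;  /* kernel ring id within this IP */
};

/* kernel-side ring tables, one per HW IP */
static struct ring rings[HW_IP_NUM][MAX_RINGS];

struct queue_mgr {
    struct ring *map[HW_IP_NUM][MAX_RINGS];  /* one manager per open fd */
};

/* Resolve a user ring id to a kernel ring. Returns 0 on success,
 * -1 for an out-of-range id. The first lookup creates the mapping
 * via the policy (identity: kernel ring id == user ring id); later
 * lookups always return the same ring. */
static int queue_mgr_map(struct queue_mgr *mgr, enum hw_ip ip,
                         int user_ring, struct ring **out)
{
    if (user_ring < 0 || user_ring >= MAX_RINGS)
        return -1;
    if (!mgr->map[ip][user_ring]) {
        rings[ip][user_ring].ip = ip;
        rings[ip][user_ring].idx = user_ring;   /* identity mapper */
        mgr->map[ip][user_ring] = &rings[ip][user_ring];
    }
    *out = mgr->map[ip][user_ring];  /* permanent once created */
    return 0;
}
```

Swapping the identity policy for the LRU policy of the previous commit only changes how the first lookup picks a ring; the permanence of the map, and hence the FIFO guarantee, is unchanged.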