Commits · 0f7607d484f57c31a3e0b5b4e75ff1366cc90b6b · openeuler / Kernel

27 Sep, 2017 · 8 commits

drm/amdgpu: Add gem_prime_mmap support · dfced2e4

Committed by Samuel Li on Aug 22, 2017

v2: drop hdp invalidate/flush.
v3: honor pgoff during prime mmap. Add a barrier after cpu access.
v4: drop begin/end_cpu_access() for now, revisit later.
Signed-off-by: Samuel Li <Samuel.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add copy_pte_num_dw member in amdgpu_vm_pte_funcs · e6d92197

Committed by Yong Zhao on Sep 19, 2017

Use it to replace the hard coded value in amdgpu_vm_bo_update_mapping().
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Fix a bug in amdgpu_fill_buffer() · 7bdc53f9

Committed by Yong Zhao on Sep 15, 2017

When max_bytes is not 8 bytes aligned and bo size is larger than
max_bytes, the last 8 bytes in a ttm node may be left unchanged.
For example, on pre SDMA 4.0, max_bytes = 0x1fffff, and the bo size
is 0x200000, the problem will happen.

In order to fix the problem, we separately store the max nums of
PTEs/PDEs a single operation can set in amdgpu_vm_pte_funcs
structure, rather than inferring it from bytes limit of SDMA
constant fill, i.e. fill_max_bytes.

Together with the fix, we replace the hard code value "10" in
amdgpu_vm_bo_update_mapping() with the corresponding values from
structure amdgpu_vm_pte_funcs.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
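
The arithmetic behind the amdgpu_fill_buffer() fix can be illustrated with a small model. This is a sketch under assumptions, not the driver code: it assumes a fill is split into operations of at most max_bytes each and that every operation writes only whole 8-byte entries. The function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative model only: split a fill of `size` bytes into operations
 * of at most `max_bytes` each, where every operation can write only
 * whole 8-byte entries. Returns the number of bytes actually written. */
static uint64_t bytes_filled(uint64_t size, uint64_t max_bytes)
{
	uint64_t filled = 0;

	while (size > 0) {
		uint64_t cur = size < max_bytes ? size : max_bytes;

		filled += (cur / 8) * 8; /* a partial entry is never written */
		size -= cur;             /* but the cursor still advances by cur */
	}
	return filled;
}

/* Same split, with the limit expressed as a count of 8-byte entries
 * per operation, so every chunk is 8-byte aligned by construction. */
static uint64_t bytes_filled_by_entries(uint64_t size, uint64_t max_entries)
{
	return bytes_filled(size, max_entries * 8);
}
```

With the values from the commit message, bytes_filled(0x200000, 0x1fffff) covers only 0x1ffff8 bytes in this model, leaving the last 8 bytes untouched, while a limit expressed in entries covers the whole buffer.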

drm/amdgpu/sriov:fix memory leak after gpu reset · d59c026b

Committed by Monk Liu on Sep 15, 2017

A GPU reset requires every hardware block to run hw_init again, so
ucode_init_bo is invoked again, which leads to a
memory leak.

Skip the fw_buf allocation during an SR-IOV GPU reset to avoid the
memory leak.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
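
The pattern described here, skipping a one-time allocation when the init path is re-entered during a reset, might look roughly like the following userspace sketch. The struct, its fields, and fw_init_buf() are hypothetical stand-ins rather than the driver's own definitions, and malloc() stands in for the buffer object allocation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the firmware state kept by a device. */
struct fw_state {
	void  *fw_buf;      /* firmware buffer, allocated once */
	size_t fw_buf_size;
	bool   in_reset;    /* set while a reset re-runs hardware init */
};

/* Called from hardware init, which also runs again after a reset.
 * Returns 0 on success, -1 if the allocation fails. */
static int fw_init_buf(struct fw_state *fw)
{
	if (!fw->in_reset) {
		/* First init: allocate the buffer. */
		fw->fw_buf = malloc(fw->fw_buf_size);
		if (!fw->fw_buf)
			return -1;
	}
	/* During a reset the existing buffer is reused, so the earlier
	 * allocation is not overwritten and leaked. Firmware contents
	 * would be (re)loaded into fw->fw_buf here. */
	return 0;
}
```

Without the in_reset check, the second call would replace fw_buf with a fresh allocation and the first one would never be freed.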

drm/amdgpu:make ctx_add_fence interruptible(v2) · eb01abc7

Committed by Monk Liu on Sep 15, 2017

Otherwise a GPU hang makes it impossible to kill the application
in timedout=0 mode.

v2:
Fix the job/job->s_fence memory leak
unlock mn
remove the ERROR msg when the wait is interrupted
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sriov:move in_reset to adev and rename · 3224a12b

Committed by Monk Liu on Sep 15, 2017

Currently in_reset is only used in SR-IOV GPU reset, but it
will later be used for other non-gfx hardware components such as
PSP, so move it from gfx to adev and rename it to in_sriov_reset,
which makes more sense.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix checkpatch.pl warning to amdgpu_drv.c · 0b693f0b

Committed by Rex Zhu on Sep 19, 2017

fix checkpatch.pl WARNING:
Prefer 'unsigned int' to bare use of 'unsigned'
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Add prescreening stage in IH processing (v2) · 00ecd8a2

Committed by Felix Kuehling on Aug 26, 2017

To filter out high-frequency interrupts that can be safely ignored.

v2: squash in trivial typo fix for si (Alex)
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

13 Sep, 2017 · 8 commits

drm/amdgpu: move MMU notifier related defines to amdgpu_mn.h · 9a189996

Committed by Christian König on Sep 12, 2017

Just some cleanup.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move amdgpu_ttm_tt_* declarations into amdgpu_ttm.h · 711becf0

Committed by Christian König on Sep 08, 2017

Just some cleanup.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: keep the MMU lock until the update ends v4 · 1ed3d256

Committed by Christian König on Sep 05, 2017

This is quite controversial because it adds another lock which is held during
page table updates, but I don't see much other option.

v2: allow multiple updates to be in flight at the same time
v3: simplify the patch, take the read side only once
v4: correctly fix rebase conflict
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move amdgpu_cs_sysvm_access_required into find_mapping · 9cca0b8e

Committed by Christian König on Sep 06, 2017

When we need to find the mapping we need sysvm access anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: stop reserving the BO in the MMU callback v3 · 3fe89771

Committed by Christian König on Sep 12, 2017

Instead take the callback lock during the final parts of CS.

This should solve the last remaining locking order problems with BO reservations.

v2: rebase, make dummy functions static inline
v3: add one more missing inline and comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move userptr BOs to CPU domain during CS v2 · 1b0c0f9d

Committed by Christian König on Sep 05, 2017

Instead of moving them in the MMU notifier move them during CS.

v2: still mark pages as accessed/dirty
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: stop using BO status for user pages · ca666a3c

Committed by Christian König on Sep 05, 2017

Instead use a counter to figure out if we need to set new pages or not.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix userptr put_page handling · a216ab09

Committed by Christian König on Sep 02, 2017

Move calling put_page into the unpopulate callback. Otherwise we mess up the pages
reference count when it is unbound multiple times.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

01 Sep, 2017 · 1 commit

drm/amdgpu: add IOCTL interface for per VM BOs v3 · e1eb899b

Committed by Christian König on Aug 25, 2017

Add the IOCTL interface so that applications can allocate per VM BOs.

Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.

v2: add some extra checks, remove the WIP tag
v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

30 Aug, 2017 · 1 commit

drm/amdgpu: add automatic per asic settings for gart_size · 83e74db6

Committed by Alex Deucher on Aug 21, 2017

We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode.  Change the default size based on
the asic type.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

18 Aug, 2017 · 4 commits

drm/amd/amdgpu: expose fragment size as module parameter (v2) · d07f14be

Committed by Roger He on Aug 15, 2017

Allow overrides on the command line.

v2: agd: squash in spelling fix and bogus default value warning
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cleanup static CSA handling · 0f4b3c68

Committed by Christian König on Jul 31, 2017

Move the CSA bo_va from the VM to the fpriv structure.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move vram usage tracking into the vram manager v2 · 3c848bb3

Committed by Christian König on Aug 07, 2017

Looks like a better place for this.

v2: use atomic64_t members instead
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move gtt usage tracking into the gtt manager v2 · 9255d77d

Committed by Christian König on Aug 07, 2017

It doesn't make much sense to count those numbers twice.

v2: use an atomic64_t instead
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

16 Aug, 2017 · 8 commits

drm/amdgpu: Fix stolen typo · 5af2c10d

Committed by Kent Russell on Aug 08, 2017

Change "stollen" to "stolen"
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use 256 bit buffers for all wb allocations (v2) · 97407b63

Committed by Alex Deucher on Jul 28, 2017

May waste a bit of memory, but simplifies the interface
significantly.

v2: convert internal accounting to use 256bit slots
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
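
Handing out fixed 256-bit write-back slots can be sketched as a bitmap allocator in which every slot spans 8 dwords. The pool size, struct, and function names below are hypothetical and chosen for illustration; this is a userspace sketch, not the driver's implementation.

```c
#include <stdint.h>

#define WB_SLOTS    64u /* hypothetical number of 256-bit slots */
#define DW_PER_SLOT 8u  /* 256 bits / 32 bits per dword */

struct wb_pool {
	uint64_t used; /* one bit per slot, 1 = in use */
};

/* Claims the first free slot and stores its dword offset in *dw_offset.
 * Returns 0 on success, -1 if every slot is taken. */
static int wb_get(struct wb_pool *pool, uint32_t *dw_offset)
{
	for (uint32_t slot = 0; slot < WB_SLOTS; slot++) {
		uint64_t bit = UINT64_C(1) << slot;

		if (!(pool->used & bit)) {
			pool->used |= bit;
			*dw_offset = slot * DW_PER_SLOT; /* slot index to dword offset */
			return 0;
		}
	}
	return -1;
}

/* Releases the slot that starts at the given dword offset. */
static void wb_free(struct wb_pool *pool, uint32_t dw_offset)
{
	uint32_t slot = dw_offset / DW_PER_SLOT;

	if (slot < WB_SLOTS)
		pool->used &= ~(UINT64_C(1) << slot);
}
```

Because every caller gets the same slot size, one get/free pair serves all users, at the cost of unused dwords for callers that need fewer than 8.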

drm/amdgpu/sdma4: drop allocation of poll_mem_offs · 34c3a82b

Committed by Alex Deucher on Jul 28, 2017

We already allocate this as part of the ring structure,
use that instead.

Cc: Frank Min <Frank.Min@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: make wb 256bit function names consistent · eacf3e14

Committed by Alex Deucher on Jul 27, 2017

Use a lower case b to be consistent with the other wb functions.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move some defines around · 9124a398

Committed by Christian König on Jul 21, 2017

Move amdgpu_bo and related structures into amdgpu_object.h.

Move amdgpu_bo_list structures to the amdgpu_bo_list functions.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cleanup kptr handling · f5e1c740

Committed by Christian König on Jul 20, 2017

Don't keep around the same pointer twice.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/sdma4: Enable sdma poll mem addr on vega10 for SRIOV · 51668b0b

Committed by Frank Min on Jun 28, 2017

While doing FLR on VFs, the doorbell write for SDMA may be
lost, so enable poll mem for SDMA; the SDMA firmware then
checks the poll mem holding the wptr.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: According hardware design revert vce and uvd doorbell assignment · 4ed11d79

Committed by Frank Min on Jun 12, 2017

Now uvd doorbell is from 0xf8-0xfb and vce doorbell is from 0xfc-0xff
Signed-off-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Xiangliang.Yu <Xiangliang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
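
The two doorbell ranges stated in this commit message can be written down as constants with a small lookup helper. The enum and function names below are hypothetical and used only for illustration; only the numeric ranges come from the commit message.

```c
#include <stdint.h>

/* Doorbell index ranges as stated in the commit message.
 * The identifier names are hypothetical. */
enum doorbell_range {
	DOORBELL_UVD_FIRST = 0xf8,
	DOORBELL_UVD_LAST  = 0xfb,
	DOORBELL_VCE_FIRST = 0xfc,
	DOORBELL_VCE_LAST  = 0xff,
};

enum doorbell_owner { OWNER_OTHER, OWNER_UVD, OWNER_VCE };

/* Maps a doorbell index to the engine whose range contains it. */
static enum doorbell_owner doorbell_owner_of(uint32_t index)
{
	if (index >= DOORBELL_UVD_FIRST && index <= DOORBELL_UVD_LAST)
		return OWNER_UVD;
	if (index >= DOORBELL_VCE_FIRST && index <= DOORBELL_VCE_LAST)
		return OWNER_VCE;
	return OWNER_OTHER;
}
```

Each range holds four consecutive indices, and the UVD range sits directly below the VCE range.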

26 Jul, 2017 · 1 commit

drm/amdgpu:fix gfx fence allocate size · 0915fdbc

Committed by Monk Liu on Jun 19, 2017

1. For SR-IOV, we need 8 dwords for the gfx fence due to CP
   behaviour.
2. Clean up wrong logic in wptr/rptr wb alloc and free.

Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

14 Jul, 2017 · 9 commits

drm/amdgpu: Make SDMA phase quantum configurable · a667386c

Committed by Felix Kuehling on Jul 15, 2016

Set a configurable SDMA phase quantum when enabling SDMA context
switching. The default value significantly reduces SDMA latency
in page table updates when user-mode SDMA queues have concurrent
activity, compared to the initial HW setting.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Andres Rodriguez <andres.rodriguez@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Throttle visible VRAM moves separately · 00f06b24

Committed by John Brooks on Jun 27, 2017

The BO move throttling code is designed to allow VRAM to fill quickly if it
is relatively empty. However, this does not take into account situations
where the visible VRAM is smaller than total VRAM, and total VRAM may not
be close to full but the visible VRAM segment is under pressure. In such
situations, visible VRAM would experience unrestricted swapping and
performance would drop.

Add a separate counter specifically for moves involving visible VRAM, and
check it before moving BOs there.

v2: Only perform calculations for separate counter if visible VRAM is
    smaller than total VRAM. (Michel Dänzer)
v3: [Michel Dänzer]
* Use BO's location rather than the AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED
  flag to determine whether to account a move for visible VRAM in most
  cases.
* Use a single

	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {

  block in amdgpu_cs_get_threshold_for_moves.

Fixes: 95844d20 (drm/amdgpu: throttle buffer migrations at CS using a fixed MBps limit (v2))
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
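
The idea of a second move budget that applies only to visible VRAM can be sketched as follows. This is a simplified userspace model under assumptions: the struct, its fields, and may_move() are hypothetical, and the real driver derives its thresholds and decides per BO in ways not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-submission move budget. */
struct move_budget {
	uint64_t threshold;     /* bytes that may be moved in total */
	uint64_t moved;         /* bytes moved so far */
	uint64_t vis_threshold; /* bytes that may be moved into visible VRAM */
	uint64_t vis_moved;     /* bytes moved into visible VRAM so far */
	bool     track_visible; /* true when visible VRAM < total VRAM */
};

/* Returns true and charges the budget if a move of `size` bytes is
 * still allowed; `to_visible` marks moves that land in visible VRAM. */
static bool may_move(struct move_budget *b, uint64_t size, bool to_visible)
{
	bool charge_visible = to_visible && b->track_visible;

	if (b->moved >= b->threshold)
		return false; /* overall budget exhausted */
	if (charge_visible && b->vis_moved >= b->vis_threshold)
		return false; /* visible VRAM budget exhausted */

	b->moved += size;
	if (charge_visible)
		b->vis_moved += size;
	return true;
}
```

In this model a full visible-VRAM budget blocks further moves into visible VRAM while moves elsewhere continue until the overall budget runs out, and the second counter is ignored entirely when all VRAM is visible.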

drm/amdgpu: Add vis_vramlimit module parameter · 218b5dcd

Committed by John Brooks on Jun 27, 2017

Allow specifying a limit on visible VRAM via a module parameter. This is
helpful for testing performance under visible VRAM pressure.

v2: Add cast to 64-bit (Christian König)
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change gartsize default to 256MB · f9321cc4

Committed by Christian König on Jul 07, 2017

Limit the default GART size and save a lot of VRAM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add new gttsize module parameter v2 · 36d38372

Committed by Christian König on Jul 07, 2017

This allows setting the gtt size independent of the gart size.

v2: fix copy and paste typo
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: consistent name all GART related parts · 6f02a696

Committed by Christian König on Jul 07, 2017

Rename symbols from gtt_ to gart_ as appropriate.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove gtt_base_align handling · ed21c047

Committed by Christian König on Jul 06, 2017

Not used any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move GART struct and function into amdgpu_gart.h v2 · 3490bdb5

Committed by Christian König on Jul 06, 2017

No functional change, just cleanup.

v2: rebased, keep gart name.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/powerplay: added new se_cac_idx r/w APIs v2 · 16abb5d2

Committed by Evan Quan on Jul 04, 2017

  - v2: added missing spinlock init
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

openeuler / Kernel · synced successfully more than 1 year ago