Commits · 3bc980bf19bb62007e923691fa2869ba113be895 · openeuler / raspberrypi-kernel

19 Jun, 2015 1 commit

drm/radeon: Add RADEON_INFO_VA_UNMAP_WORKING query · 3bc980bf

Authored by Michel Dänzer on Jun 16, 2015

This tells userspace that it's safe to use the RADEON_VA_UNMAP operation
of the DRM_RADEON_GEM_VA ioctl.

Cc: stable@vger.kernel.org
(NOTE: Backporting this commit requires at least backports of commits
26d4d129, 48afbd70 and c29c0876 as well, otherwise using
RADEON_VA_UNMAP runs into trouble)
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>

20 Mar, 2015 3 commits

drm/radeon: add support for read reg query from radeon info ioctl · 4535cb9c

Authored by Alex Deucher on Oct 01, 2014

This allows us to query certain registers from userspace
for profiling and harvest configuration.  E.g., it can
be used by the GALLIUM_HUD for profiling the status of
various gfx blocks.
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add INFO query for current sclk/mclk · 5c363a86

Authored by Alex Deucher on Sep 30, 2014

Allow the UMDs to query the current sclk/mclk
for profiling, etc.
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add INFO query for GPU temperature · d6d2a188

Authored by Alex Deucher on Sep 30, 2014

Useful for profiling.
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

10 Sep, 2014 2 commits

drm/radeon: add RADEON_GEM_NO_CPU_ACCESS BO creation flag (v4) · f266f04d

Authored by Alex Deucher on Aug 28, 2014

Allows pinning of buffers in the non-CPU-visible portion of
VRAM.

v2: incorporate Michel's comments.
v3: rebase on Michel's patch
v4: rebase on Michel's v2 patch
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

drm/radeon: Add RADEON_GEM_CPU_ACCESS BO creation flag · c8584039

Authored by Michel Dänzer on Aug 28, 2014

This flag is a hint that userspace expects the BO to be accessed by the
CPU. We can use that hint to prevent such BOs from ever being stored in
the CPU-inaccessible part of VRAM.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

19 Aug, 2014 1 commit

drm/radeon: properly document reloc priority mask · 701e1e78

Authored by Christian König on Aug 15, 2014

Instead of hard-coding the value, properly document
that this is a userspace interface.

No intended functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

11 Aug, 2014 4 commits

drm/radeon: add userptr flag to register MMU notifier v3 · 341cb9e4

Authored by Christian König on Aug 07, 2014

Whenever a userspace mapping related to our userptr changes,
we wait for it to become idle and unmap it from GTT.

v2: rebased, fix mutex unlock in error path
v3: improve commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add userptr flag to directly validate the BO to GTT · 2a84a447

Authored by Christian König on Aug 07, 2014

This way we test userptr availability at BO creation time instead of first use.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add userptr flag to limit it to anonymous memory v2 · ddd00e33

Authored by Christian König on Aug 07, 2014

Avoid problems with writeback by limiting userptr to anonymous memory.

v2: add commit and code comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add userptr support v8 · f72a113a

Authored by Christian König on Aug 07, 2014

This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.

It imposes several restrictions upon the memory being mapped:

1. It must be page aligned (both start/end addresses, i.e. ptr and size).

2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).

3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.

4. The BO is only mapped readonly for now, so no write support.

5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.

Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
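
Restriction 1 is easy to check in userspace before handing a pointer to the ioctl. The following is only a sketch of that check: the helper name `userptr_range_is_page_aligned` is hypothetical (it is not part of the driver or libdrm), and it assumes the page size is a nonzero power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of radeon or libdrm): returns true when
 * both the start address and the size of a range are multiples of
 * page_size, which also makes the end address page aligned.
 * Assumes page_size is a nonzero power of two (e.g. 4096).
 */
static bool userptr_range_is_page_aligned(uintptr_t addr, uint64_t size,
                                          uint64_t page_size)
{
    uint64_t mask = page_size - 1;

    if (page_size == 0 || (page_size & mask) != 0)
        return false; /* not a power of two: refuse to guess */
    if (size == 0)
        return false; /* an empty range cannot back a buffer object */
    return ((uint64_t)addr & mask) == 0 && (size & mask) == 0;
}
```

In a real program the page size would typically come from `sysconf(_SC_PAGESIZE)` rather than a hard-coded constant. The other restrictions (memory type, GTT limit, read-only mapping) can only be enforced by the kernel.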

v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
    pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
    flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about its availability in the API definition
v8: drop access_ok check, fix VM mapping bits
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v4)
Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

05 Aug, 2014 1 commit

drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly · 77497f27

Authored by Michel Dänzer on Jul 17, 2014

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

10 Jun, 2014 1 commit

drm/radeon: add query for number of active CUs · 65fcf668

Authored by Alex Deucher on Jun 02, 2014

Query to find out how many compute units are active on a GPU.
Useful for OpenCL usermode drivers.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

03 Mar, 2014 2 commits

drm/radeon: track memory statistics about VRAM and GTT usage and buffer moves v2 · 67e8e3f9

Authored by Marek Olšák on Mar 02, 2014

The statistics are:
- VRAM usage in bytes
- GTT usage in bytes
- number of bytes moved by TTM

The last one is actually a counter, so you need to sample it before and after
command submission and take the difference.

This is useful for finding performance bottlenecks. Userspace queries are
also added.
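
Because the bytes-moved statistic is a running counter, a profiler is interested in the difference between two samples. A minimal sketch of that step, assuming the two samples have already been obtained (how they are read is not shown here); `counter_delta` is a hypothetical name:

```c
#include <stdint.h>

/*
 * Hypothetical helper: given two samples of a monotonically increasing
 * 64-bit counter, return how far it advanced between them.
 * Unsigned subtraction is modular, so the result is still correct if
 * the counter wrapped around once between the samples.
 */
static uint64_t counter_delta(uint64_t before, uint64_t after)
{
    return after - before;
}
```

Sampling once before and once after a command submission and passing both values to this helper gives the number of bytes TTM moved for that submission, provided no other client triggered moves in between.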

v2: use atomic64_t
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

drm/radeon: add a way to get and set initial buffer domains v2 · bda72d58

Authored by Marek Olšák on Mar 02, 2014

When passing buffers between processes, the receiving process needs to know
the original buffer domain, so that it doesn't accidentally move the buffer.

v2: reserve the buffer
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

18 Feb, 2014 2 commits

drm/radeon: add VCE version parsing and checking · 98ccc291

Authored by Christian König on Jan 23, 2014

Also make the result available to userspace.
Signed-off-by: Christian König <christian.koenig@amd.com>

drm/radeon: initial VCE support v4 · d93f7937

Authored by Christian König on May 23, 2013

Only VCE 2.0 support so far.

v2: squashing multiple patches into this one
v3: add IRQ support for CIK, major cleanups,
    basic code documentation
v4: remove HAINAN from chipset list
Signed-off-by: Christian König <christian.koenig@amd.com>

21 Jan, 2014 1 commit

drm/radeon: add query to fetch the max engine clock (v2) · f5f1f897

Authored by Alex Deucher on Jan 20, 2014

This is needed for reporting the max GPU engine clock
in OpenCL.  This just reports the max possible engine
clock; it does not take into account current conditions
that may limit that clock.

v2: fix query number for merge with 3.13
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

23 Dec, 2013 1 commit

drm/radeon: expose render backend mask to the userspace · 439a1cff

Authored by Marek Olšák on Dec 22, 2013

This will allow userspace to correctly program the PA_SC_RASTER_CONFIG
register, so it can be considered a fix.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

18 Nov, 2013 1 commit

drm/radeon/cik: Add macrotile mode array query · 32f79a8a

Authored by Michel Dänzer on Nov 18, 2013

This is required to properly calculate the tiling parameters
in userspace.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21 Sep, 2013 1 commit

drm/radeon/cik: Add tiling mode index for 1D tiled depth/stencil surfaces · 42baf21d

Authored by Michel Dänzer on Sep 18, 2013

CIK uses a different index for 1D DST surfaces compared to SI.  Expose
the new index so libdrm_radeon can use it properly for userspace
drivers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

31 Aug, 2013 1 commit

drm/radeon/si: Add support for CP DMA to CS checker for compute v2 · e5b9e750

Authored by Tom Stellard on Aug 16, 2013

Also add a new RADEON_INFO query to check that CP DMA packets are
supported on the compute ring.

CP DMA has been supported since the 3.8 kernel, but due to an oversight
we forgot to teach the CS checker that the CP DMA packet was legal for
the compute ring on Southern Islands GPUs.

This patch fixes a bug where the radeon driver will incorrectly reject a legal
CP DMA packet from user space.  I would like to have the patch
backported to stable so that we don't have to require Mesa users to use a
bleeding edge kernel in order to take advantage of this feature which
is already present in the stable kernels (3.8 and newer).

v2:
  - Don't bump kms version, so this patch can be backported to stable
    kernels.

Cc: stable@vger.kernel.org
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11 Apr, 2013 2 commits

drm/radeon: add si tile mode array query v3 · 64d7b8be

Authored by Jerome Glisse on Apr 09, 2013

Allow userspace to query for the tile mode array so userspace can properly
compute surface pitch and alignment requirements depending on tiling.

v2: Make strict aliasing safer by casting to char when copying
v3: merge fix from Christian
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add ring working query · 902aaef6

Authored by Christian König on Apr 09, 2013

Add new ioctl option and bump minor version number.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

09 Apr, 2013 2 commits

drm/radeon: UVD bringup v8 · f2ba57b5

Authored by Christian König on Apr 08, 2013

Just everything needed to decode videos using UVD.

v6: just all the bugfixes and support for R7xx-SI merged in one patch
v7: UVD_CGC_GATE is a write only register, lockup detection fix
v8: split out VRAM fallback changes, remove support for RV770,
    add support for HEMLOCK, add buffer sizes checks
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: Use direct mapping for fast fb access on RS690 · a0a53aa8

Authored by Samuel Li on Apr 08, 2013

This patch allows the CPU to map the stolen vram segment
directly rather than going through the PCI BAR.  This
significantly improves performance for certain workloads with
a properly patched ddx.

Use radeon.fastfb=1 to enable it (disabled by default).
Currently only supported on RS690, but support for RS780/880
and newer APUs may be added eventually.
Signed-off-by: Samuel Li <samuel.li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

14 Dec, 2012 1 commit

drm/radeon: enable the async DMA rings in the CS ioctl · 278a334c

Authored by Alex Deucher on Dec 13, 2012

This enables the functionality added in the previous
patches.  Userspace acceleration drivers can use the
CS ioctl to submit command buffers to the async DMA
rings.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

08 Dec, 2012 2 commits

drm/radeon: add new INFO ioctl requests · 2e1a7674

Authored by Alex Deucher on Dec 04, 2012

Add requests to get the number of shader engines (SE) and
the number of SH per SE.  These are needed for geometry
and tessellation shaders in the 3D driver as well as setting
up PA_SC_RASTER_CONFIG on SI asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: add a CS flag END_OF_FRAME · 57f57083

Authored by Marek Olšák on Dec 02, 2012

No version bump is required because setting the flag on older DRM has
no effect.

This only reserves the bit and doesn't use it. I assume we will use it
for buffer eviction heuristics.
Signed-off-by: Marek Olšák <maraeo@gmail.com>

05 Oct, 2012 1 commit

UAPI: (Scripted) Disintegrate include/drm · 718dcedd

Authored by David Howells on Oct 04, 2012

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>

03 Oct, 2012 1 commit

UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers · a1ce3928

Authored by David Howells on Oct 02, 2012

Convert #include "..." to #include <path/...> in kernel system headers.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>

13 Aug, 2012 1 commit

drm/radeon/kms: implement timestamp userspace query (v2) · 6759a0a7

Authored by Marek Olšák on Aug 09, 2012

Returns a snapshot of the GPU clock counter.  Needed
for certain OpenGL extensions.

v2: agd5f
- address Jerome's comments
- add function documentation
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

10 May, 2012 1 commit

drm/radeon: fix possible lack of synchronization btw ttm and other ring · 133f4cb3

Authored by Jerome Glisse on May 09, 2012

We need to sync with the GFX ring, as ttm might have scheduled a bo move
on it, and new commands scheduled for other rings need to wait for the bo
data to be in place.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

21 Mar, 2012 2 commits

drm/radeon/kms: add support for the CONST IB to the CS ioctl · dfcf5f36

Authored by Alex Deucher on Mar 20, 2012

This adds a new chunk id to the CS ioctl to support the
INDIRECT_BUFFER_CONST packet.

On SI, the CP adds a new engine called the CE (Constant Engine)
which runs simultaneously with the DE (Drawing Engine, formerly
called the ME).  This allows the CP to process two related IBs
simultaneously.  The CE is tasked with loading the constant data
(constant buffers, resource descriptors, samplers, etc.) while
the DE loads context register state and issues drawing commands.
It's up to the userspace application to synchronize the CE and the
DE using special synchronization packets.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: add info query for max pipes · 609c1e15

Authored by Tom Stellard on Mar 20, 2012

The maximum number of pipes is needed by the user space compute
driver to calculate the number of wavefronts per thread group.
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

13 Feb, 2012 1 commit

drm/radeon: add support for evergreen/ni tiling informations v11 · 285484e2

Authored by Jerome Glisse on Dec 16, 2011

Evergreen and Northern Islands GPUs need more information for 2D tiling
than previous r6xx/r7xx. Add fields to the tiling ioctl to allow userspace
to provide it.

The v8 cs checking change to track color view on r6xx/r7xx doesn't
affect old userspace, as old userspace always emitted 0 for this register.

v2 fix r6xx/r7xx 2D tiling computation
v3 fix r6xx/r7xx height align for untiled surface & add support for
   tile split on evergreen and newer
v4 improve tiling debugging output
v5 fix tile split code for evergreen and newer
v6 set proper tile split for crtc register
v7 fix tile split limit value
v8 add COLOR_VIEW checking to r6xx/r7xx checker, add evergreen cs
   checking, update safe reg for r600, evergreen and cayman.
   Evergreen checking need some work around for stencil alignment
   issues
v9 fix tile split value range, fix compressed texture handling and
   mipmap calculation, allow evergreen check to be silent in
   front of current broken userspace (depth/stencil alignment issue)
v10 fix eg 3d texture and compressed texture, fix r600 depth array,
    fix r600 color view computation, add support for evergreen stencil
    split
v11 more verbose debugging in some cases
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

09 Jan, 2012 1 commit

drm/radeon/kms: remove pointless CS flags priority struct · f0afb5d4

Authored by Alex Deucher on Jan 06, 2012

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>

06 Jan, 2012 2 commits

drm/radeon/kms: Add support for multi-ring sync in CS ioctl (v2) · 93504fce

Authored by Christian König on Jan 05, 2012

Use semaphores to sync buffers across rings in the CS
ioctl.  Add a reloc flag to allow userspace to skip
sync for buffers.

agd5f: port to latest CS ioctl changes.

v2: add ring lock/unlock to make sure changes hit the ring.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon: GPU virtual memory support v22 · 721604a1

Authored by Jerome Glisse on Jan 05, 2012

Virtual address spaces are per drm client (opener of /dev/drm).
Clients are in charge of their virtual address space; they need to
map bos into it by calling the DRM_RADEON_GEM_VA ioctl.

First 16M of virtual address space is reserved by the kernel.
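
A client choosing addresses for its mappings has to stay out of that reserved range. The sketch below hard-codes the 16M figure from this message purely for illustration; the v8 notes further down mention a query for the kernel reserved virtual space, which a real client should use instead. All names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 16 MiB, taken from the commit message; assumed here, not queried. */
#define EX_KERNEL_RESERVED_VA (16ull << 20)

/*
 * Hypothetical check: true when [va, va + size) is a non-empty range
 * that does not wrap around and starts at or above the reserved area.
 */
static bool ex_va_range_is_client_usable(uint64_t va, uint64_t size)
{
    if (size == 0 || va + size < va)
        return false; /* empty or wrapping range */
    return va >= EX_KERNEL_RESERVED_VA;
}
```

The upper limit of the address space is deliberately not checked here, since the message does not state it.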

Once using 2 level page table we should be able to have a small
vram memory footprint for each pt (there would be one pt for all
gart, one for all vram and then one first level for each virtual
address space).

Plans include using the sub allocator for a common vm page table
area and using memcpy to copy vm page tables in & out. Or use
a gart object and copy things in & out using dma.

v2: agd5f fixes:
- Add vram base offset for vram pages.  The GPU physical address of a
vram page is FB_OFFSET + page offset.  FB_OFFSET is 0 on discrete
cards and the physical bus address of the stolen memory on
integrated chips.
- VM_CONTEXT1_PROTECTION_FAULT_DEFAULT_ADDR covers all vmid's >= 1
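
The address rule from the first v2 item can be written down directly. This is only an illustration of the arithmetic with a hypothetical helper name; the example values in the note below are made up.

```c
#include <stdint.h>

/*
 * Hypothetical helper: GPU physical address of a VRAM page, following
 * the rule "FB_OFFSET + page offset" from the v2 notes. fb_offset is 0
 * on discrete cards and the bus address of the stolen memory on
 * integrated chips.
 */
static uint64_t ex_vram_page_gpu_addr(uint64_t fb_offset, uint64_t page_offset)
{
    return fb_offset + page_offset;
}
```

On a discrete card the result is simply the page offset; on an integrated chip with stolen memory at, say, 0xC0000000, a page at offset 0x3000 would be addressed at 0xC0003000.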

v3: agd5f:
- integrate with the semaphore/multi-ring stuff

v4:
- rebase on top ttm dma & multi-ring stuff
- userspace is now in charge of the address space
- no more specific cs vm ioctl, instead cs ioctl has a new
  chunk

v5:
- properly handle mem == NULL case from move_notify callback
- fix the vm cleanup path

v6:
- fix update of page table to only happen on valid mem placement

v7:
- add tlb flush for each vm context
- add flags to define mapping property (readable, writeable, snooped)
- make ring id implicit from ib->fence->ring, up to each asic callback
  to then do ring specific scheduling if vm ib scheduling function

v8:
- add query for ib limit and kernel reserved virtual space
- rename vm->size to max_pfn (maximum number of page)
- update gem_va ioctl to also allow unmap operation
- bump kernel version to allow userspace to query for vm support

v9:
- rebuild page table only when bind and incrementally depending
  on bo referenced by cs and that have been moved
- allow virtual address space to grow
- use sa allocator for vram page table
- return invalid when querying vm limit on non cayman GPU
- dump vm fault register on lockup

v10: agd5f:
- Move the vm schedule_ib callback to a standalone function, remove
  the callback and use the existing ib_execute callback for VM IBs.

v11:
- rebase on top of latest Linus

v12: agd5f:
- remove spurious backslash
- set IB vm_id to 0 in radeon_ib_get()

v13: agd5f:
- fix handling of RADEON_CHUNK_ID_FLAGS

v14:
- fix va destruction
- fix suspend resume
- forbid bo to have several different va in same vm

v15:
- rebase

v16:
- cleanup left over of vm init/fini

v17: agd5f:
- cs checker

v18: agd5f:
- reworks the CS ioctl to better support multiple rings and
VM.  Rather than adding a new chunk id for VM, just re-use the
IB chunk id and add a new flag for VM mode.  Also define additional
dwords for the flags chunk id to define which ring we want to use
(gfx, compute, uvd, etc.) and the priority.
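
The flags chunk described in v18 can be pictured as a short array of dwords: flags first, then the ring, then the priority. The sketch below assumes exactly that ordering; the constants are defined locally for illustration only, and the authoritative names and values are in the kernel's radeon_drm.h.

```c
#include <stdint.h>

/* Illustrative constants, defined locally; the real definitions live
 * in radeon_drm.h and may differ from these. */
enum { EX_CS_USE_VM = 0x02 };
enum { EX_RING_GFX = 0, EX_RING_COMPUTE = 1 };

/* Hypothetical view of the flags chunk payload, per the v18 note:
 * dword 0 = flags, dword 1 = ring index, dword 2 = priority. */
struct ex_cs_flags_chunk {
    uint32_t dw[3];
};

static struct ex_cs_flags_chunk ex_make_flags_chunk(uint32_t flags,
                                                    uint32_t ring,
                                                    int32_t priority)
{
    struct ex_cs_flags_chunk c;

    c.dw[0] = flags;              /* e.g. VM mode */
    c.dw[1] = ring;               /* which ring to submit to */
    c.dw[2] = (uint32_t)priority; /* priority, stored as a raw dword */
    return c;
}
```

A VM submission to the compute ring would then be described by `ex_make_flags_chunk(EX_CS_USE_VM, EX_RING_COMPUTE, 0)`. Older userspace that sends only the first dword keeps working under this scheme, which is presumably why the ring and priority were added as extra dwords rather than a new chunk id.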

v19:
- fix cs fini in weird case of no ib
- semi working flush fix for ni
- rebase on top of sa allocator changes

v20: agd5f:
- further CS ioctl cleanups from Christian's comments

v21: agd5f:
- integrate CS checker improvements

v22: agd5f:
- final cleanups for release, only allow VM CS on cayman
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

20 Nov, 2011 1 commit

drm/radeon/kms: add a CS ioctl flag not to rewrite tiling flags in the CS · e70f224c

Authored by Marek Olšák on Oct 25, 2011

This adds a new optional chunk to the CS ioctl that specifies optional flags
to the CS parser. Why this is useful is explained below. Note that some regs
no longer need the NOP relocation packet if this feature is enabled.
Tested on r300g and r600g with this flag disabled and enabled.

Assume there are two contexts sharing the same mipmapped tiled texture.
One context wants to render into the first mipmap and the other one
wants to render into the last mipmap. As you probably know, the hardware
has a MACRO_SWITCH feature, which turns off macro tiling for small mipmaps,
but that only applies to samplers.
(at least on r300-r500, though later hardware likely behaves the same)

So we want to just re-set the tiling flags before rendering (writing
packets), right? ... No. The contexts run in parallel, so they may
set the tiling flags simultaneously and then fire their command streams
also simultaneously. The last one setting the flags wins, the other one
loses.

Another problem is when one context wants to render into the first and
the last mipmap in one CS. Impossible. It must flush before changing
tiling flags and do the rendering into the smaller mipmaps in another CS.

Yet another problem is that writing copy_blit in userspace would be a mess
involving re-setting tiling flags to please the kernel, and causing races
with other contexts at the same time.

The only way out of this is to send tiling flags with each CS, ideally
with each relocation. But we already do that through the registers.
So let's just use what we have in the registers.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>