提交 · d0164adc89f6bb374d304ffcc375c6d2652fe67d · openeuler / raspberrypi-kernel

07 11月, 2015 1 次提交

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc

由 Mel Gorman 提交于 11月 06, 2015

mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold
spinlocks or are in interrupts.  They are expected to be high priority and
have access one of two watermarks lower than "min" which can be referred
to as the "atomic reserve".  __GFP_HIGH users get access to the first
lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options
were available.  Some have abused __GFP_WAIT leading to a situation where
an optimisitic allocation with a fallback option can access atomic
reserves.

This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
cannot sleep and have no alternative.  High priority users continue to use
__GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
redefined as a caller that is willing to enter direct reclaim and wake
kswapd for background reclaim.

This patch then converts a number of sites

o __GFP_ATOMIC is used by callers that are high priority and have memory
  pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear
  __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
  into this category where kswapd will still be woken but atomic reserves
  are not used as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the
  helper gfpflags_allow_blocking() where possible. This is because
  checking for __GFP_WAIT as was done historically now can trigger false
  positives. Some exceptions like dm-crypt.c exist where the code intent
  is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
  flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL
  and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT
and was depending on access to atomic reserves for inconspicuous reasons.
In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of
GFP flags instead of starting with something like GFP_KERNEL.  They may
now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
if it's missed in most cases as other activity will wake kswapd.
Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
Acked-by: NVlastimil Babka <vbabka@suse.cz>
Acked-by: NMichal Hocko <mhocko@suse.com>
Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

d0164adc

28 7月, 2015 1 次提交

drm/i915: Keep the mm.bound_list in rough LRU order · 6c246959

由 Chris Wilson 提交于 7月 27, 2015

When we shrink our working sets, we want to avoid stealing pages from
objects that likely to be reused in the near future. We first look at
inactive objects before processing active objects - but what about a
recently active object that is about to be used again. That object's
position in the bound_list is ordered by the time of binding, not the
time of last use, so the most recently used inactive object could well
be at the head of the shrink list. To compensate, give the object a bump
to MRU when it becomes inactive (thus transitioning to the end of the
first pass in shrink lists). Conversely, bumping on inactive makes
bumping on active useless, since when we do have to reap from the active
working set, everything is going to become inactive very quickly and the
order pretty much random - just hope for the best at that point, as once
we start stalling on active objects, we can hope that the rebinding
neatly orders vital objects.
Suggested-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: Resolve merge conflict.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6c246959

27 7月, 2015 2 次提交

atomic: Replace atomic_{set,clear}_mask() usage · 805de8f4

由 Peter Zijlstra 提交于 4月 24, 2015

Replace the deprecated atomic_{set,clear}_mask() usage with the now
ubiquous atomic_{or,andnot}() functions.
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: NThomas Gleixner <tglx@linutronix.de>

805de8f4

drm/i915: Extract i915_gem_fence.c · 41a36b73

由 Daniel Vetter 提交于 7月 24, 2015

No code changes, just moving all the fence related code into a
separate file (and avoiding a bunch of forward declarations while at
it).
Acked-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>

41a36b73

21 7月, 2015 1 次提交

drm/i915: Add i915_gem_object_create_from_data() · ea70299d

由 Dave Gordon 提交于 7月 09, 2015

i915_gem_object_create_from_data() is a generic function to save data
from a plain linear buffer in a new pageable gem object that can later
be accessed by the CPU and/or GPU.

We will need this for the microcontroller firmware loading support code.

Derived from i915_gem_object_write(), originally by Alex Dai

v2:
    Change of function: now allocates & fills a new object, rather than
        writing to an existing object
    New name courtesy of Chris Wilson
    Explicit domain-setting and other improvements per review comments
        by Chris Wilson & Daniel Vetter

v4:
    Rebased

Issue: VIZ-4884
Signed-off-by: NAlex Dai <yu.dai@intel.com>
Signed-off-by: NDave Gordon <david.s.gordon@intel.com>
Reviewed-by: NTom O'Rourke <Tom.O'Rourke@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ea70299d

14 7月, 2015 3 次提交

drm/i915: remove unused has_dma_mapping flag · 5ec5b516

由 Imre Deak 提交于 7月 08, 2015

After the previous patch this flag will check always clear, as it's
never set for shmem backed and userptr objects, so we can remove it.
Signed-off-by: NImre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: Yeah this isn't really fixes but it's a nice cleanup to
clarify the code but not really worth the hassle of backmerging. So
just add to -fixes, we're still early in -rc.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5ec5b516

drm/i915: avoid leaking DMA mappings · e2273302

由 Imre Deak 提交于 7月 09, 2015

We have 3 types of DMA mappings for GEM objects:
1. physically contiguous for stolen and for objects needing contiguous
   memory
2. DMA-buf mappings imported via a DMA-buf attach operation
3. SG DMA mappings for shmem backed and userptr objects

For 1. and 2. the lifetime of the DMA mapping matches the lifetime of the
corresponding backing pages and so in practice we create/release the
mapping in the object's get_pages/put_pages callback.

For 3. the lifetime of the mapping matches that of any existing GPU binding
of the object, so we'll create the mapping when the object is bound to
the first vma and release the mapping when the object is unbound from its
last vma.

Since the object can be bound to multiple vmas, we can end up creating a
new DMA mapping in the 3. case even if the object already had one. This
is not allowed by the DMA API and can lead to leaked mapping data and
IOMMU memory space starvation in certain cases. For example HW IOMMU
drivers (intel_iommu) allocate a new range from their memory space
whenever a mapping is created, silently overriding a pre-existing
mapping.

Fix this by moving the creation/removal of DMA mappings to the object's
get_pages/put_pages callbacks. These callbacks already check for and do
an early return in case of any nested calls. This way objects of the 3.
case also become more like the other object types.

I noticed this issue by enabling DMA debugging, which got disabled after
a while due to its internal mapping tables getting full. It also reported
errors in connection to random other drivers that did a DMA mapping for
an address that was previously mapped by i915 but was never released.
Besides these diagnostic messages and the memory space starvation
problem for IOMMUs, I'm not aware of this causing a real issue.

The fix is based on a patch from Chris.

v2:
- move the DMA mapping create/remove calls to the get_pages/put_pages
  callbacks instead of adding new callbacks for these (Chris)
v3:
- also fix the get_page cache logic on the userptr async path (Chris)
Signed-off-by: NImre Deak <imre.deak@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

e2273302

drm/i915: Snapshot seqno of most recently submitted request. · 94f7bbe1

由 Tomas Elf 提交于 7月 09, 2015

The hang checker needs to inspect whether or not the ring request list is empty
as well as if the given engine has reached or passed the most recently
submitted request. The problem with this is that the hang checker cannot grab
the struct_mutex, which is required in order to safely inspect requests since
requests might be deallocated during inspection. In the past we've had kernel
panics due to this very unsynchronized access in the hang checker.

One solution to this problem is to not inspect the requests directly since
we're only interested in the seqno of the most recently submitted request - not
the request itself. Instead the seqno of the most recently submitted request is
stored separately, which the hang checker then inspects, circumventing the
issue of synchronization from the hang checker entirely.

This fixes a regression introduced in

commit 44cdd6d2
Author: John Harrison <John.C.Harrison@Intel.com>
Date:   Mon Nov 24 18:49:40 2014 +0000

    drm/i915: Convert 'ring_idle()' to use requests not seqnos

v2 (Chris Wilson):
- Pass current engine seqno to ring_idle() from i915_hangcheck_elapsed() rather
than compute it over again.
- Remove extra whitespace.

Issue: VIZ-5998
Signed-off-by: NTomas Elf <tomas.elf@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
[danvet: Add regressing commit citation provided by Chris.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

94f7bbe1

08 7月, 2015 1 次提交

drm/i915: Add origin to frontbuffer tracking flush · de152b62

由 Rodrigo Vivi 提交于 7月 07, 2015

This will be useful to PSR and FBC once we start making
dirty fb calls to also flush frontbuffer.

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

de152b62

06 7月, 2015 2 次提交

drm/i915: Convert intel_lr_context_pin() for requests · 8ba319da

由 Mika Kuoppala 提交于 7月 03, 2015

Pass around requests to carry context deeper in callchain.
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

8ba319da

drm/i915: Also perform gpu reset under execlist mode. · a647828a

由 Niu,Bing 提交于 7月 04, 2015

It is found that i915 will not reset gpu under execlist mode when
unload module. that will lead to some issues when unload/load module
with different submission mode. e.g. from execlist mode to ring
buffer mode via loading/unloading i915. Because HW is not in a reset
state and registers are not clean under such condition.
Signed-off-by: NNiu,Bing <bing.niu@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a647828a

03 7月, 2015 1 次提交

drm/i915: Report correct GGTT space usage · ca1543be

由 Tvrtko Ursulin 提交于 7月 01, 2015

Currently only normal views were accounted which under-accounts
the usage as reported in debugfs.

Introduce new helper, i915_gem_obj_total_ggtt_size, and use it
from call sites which want to know how much GGTT space are
objects using.

v2: Single loop in i915_gem_get_aperture_ioctl. (Chris Wilson)

v3: Walk GGTT active/inactive lists in i915_gem_get_aperture_ioctl
    for better efficiency. (Chris Wilson, Daniel Vetter)

v4: Make i915_gem_obj_total_ggtt_size private to debugfs. (Chris Wilson)

v5: Change unsigned long to u64. (Chris Wilson)
Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ca1543be

29 6月, 2015 1 次提交

drm/i915: Unconditionally do fb tracking invalidate in set_domain · 031b698a

由 Daniel Vetter 提交于 6月 26, 2015

We can't elide the fb tracking invalidate if the buffer is already in
the right domain since that would lead to missed screen updates. I'm
pretty sure I've written this already before but must have gotten lost
unfortunately :(

v2: Chris observed that all internal set_domain users already
correctly do the fb invalidate on their own, hence we can move this
just into the set_domain ioctl instead.

v3: I screwed up setting the invalidate ORIGIN_* correctly (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk>
Tested-by: NPaulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

031b698a

26 6月, 2015 1 次提交

drm/i915/gtt: Allow >= 4GB sizes for vm. · c44ef60e

由 Mika Kuoppala 提交于 6月 25, 2015

We can have exactly 4GB sized ppgtt with 32bit system.
size_t is inadequate for this.

v2: Convert a lot more places (Daniel)
Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: NMichel Thierry <michel.thierry@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c44ef60e

23 6月, 2015 24 次提交

drm/i915: Remove the now obsolete 'i915_gem_check_olr()' · a5ac0f90

由 John Harrison 提交于 5月 29, 2015

As there is no OLR to check, the check_olr() function is now a no-op and can be
removed.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

a5ac0f90

drm/i915: Move the request/file and request/pid association to creation time · fcfa423c

由 John Harrison 提交于 5月 29, 2015

In _i915_add_request(), the request is associated with a userland client.
Specifically it is linked to the 'file' structure and the current user process
is recorded. One problem here is that the current user process is not
necessarily the same as when the request was submitted to the driver. This is
especially true when the GPU scheduler arrives and decouples driver submission
from hardware submission. Note also that it is only in the case where the add
request comes from an execbuff call that there is a client to associate. Any
other add request call is kernel only so does not need to do it.

This patch moves the client association into a separate function. This is then
called from the execbuffer code path itself at a sensible time. It also removes
the now redundant 'file' pointer from the add request parameter list.

An extra cleanup of the client association is also added to the request clean up
code for the eventuality where the request is killed after association but
before being submitted (e.g. due to out of memory error somewhere). Once the
submission has happened, the request is on the request list and the regular
request list removal will clear the association. Note that this still needs to
happen at this point in time because the request might be kept floating around
much longer (due to someone holding a reference count) and the client should not
be worrying about this request after it has been retired.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

fcfa423c

drm/i915: Remove the now obsolete 'outstanding_lazy_request' · bccca494

由 John Harrison 提交于 5月 29, 2015

The outstanding_lazy_request is no longer used anywhere in the driver.
Everything that was looking at it now has a request explicitly passed in from on
high. Everything that was relying upon it behind the scenes is now explicitly
creating/passing/submitting its own private request. Thus the OLR can be
removed.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bccca494

drm/i915: Add *_ring_begin() to request allocation · ccd98fe4

由 John Harrison 提交于 5月 29, 2015

Now that the *_ring_begin() functions no longer call the request allocation
code, it is finally safe for the request allocation code to call *_ring_begin().
This is important to guarantee that the space reserved for the subsequent
i915_add_request() call does actually get reserved.

v2: Renamed functions according to review feedback (Tomas Elf).

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ccd98fe4

drm/i915: Update intel_ring_begin() to take a request structure · 5fb9de1a

由 John Harrison 提交于 5月 29, 2015

Now that everything above has been converted to use requests, intel_ring_begin()
can be updated to take a request instead of a ring. This also means that it no
longer needs to lazily allocate a request if no-one happens to have done it
earlier.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5fb9de1a

drm/i915: Update ring->sync_to() to take a request structure · 599d924c

由 John Harrison 提交于 5月 29, 2015

Updated the ring->sync_to() implementations to take a request instead of a ring.
Also updated the tracer to include the request id.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
[danvet: Rebase since I didn't merge the patch which added ->uniq.]
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

599d924c

drm/i915: Update ring->emit_request() to take a request structure · c4e76638

由 John Harrison 提交于 5月 29, 2015

Updated the ring->emit_request() implementation to take a request instead of a
ringbuf/request pair. Also removed its use of the OLR for obtaining the
request's seqno.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

c4e76638

drm/i915: Update ring->add_request() to take a request structure · ee044a88

由 John Harrison 提交于 5月 29, 2015

Updated the various ring->add_request() implementations to take a request
instead of a ring. This removes their reliance on the OLR to obtain the seqno
value that the request should be tagged with.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ee044a88

drm/i915: Update flush_all_caches() to take request structures · 4866d729

由 John Harrison 提交于 5月 29, 2015

Updated the *_ring_flush_all_caches() functions to take requests instead of
rings or ringbuf/context pairs.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4866d729

drm/i915: Update l3_remap to take a request structure · 6909a666

由 John Harrison 提交于 5月 29, 2015

Converted i915_gem_l3_remap() to take a request structure instead of a ring.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

6909a666

drm/i915: Update [vma|object]_move_to_active() to take request structures · b2af0376

由 John Harrison 提交于 5月 29, 2015

Now that everything above has been converted to use request structures, it is
possible to update the lower level move_to_active() functions to be request
based as well.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b2af0376

drm/i915: Update add_request() to take a request structure · 75289874

由 John Harrison 提交于 5月 29, 2015

Now that all callers of i915_add_request() have a request pointer to hand, it is
possible to update the add request function to take a request pointer rather
than pulling it out of the OLR.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

75289874

drm/i915: Update i915_gem_object_sync() to take a request structure · 91af127f

由 John Harrison 提交于 6月 18, 2015

The plan is to pass requests around as the basic submission tracking structure
rather than rings and contexts. This patch updates the i915_gem_object_sync()
code path.

v2: Much more complex patch to share a single request between the sync and the
page flip. The _sync() function now supports lazy allocation of the request
structure. That is, if one is passed in then that will be used. If one is not,
then a request will be allocated and passed back out. Note that the _sync() code
does not necessarily require a request. Thus one will only be created until
certain situations. The reason the lazy allocation must be done within the
_sync() code itself is because the decision to need one or not is not really
something that code above can second guess (except in the case where one is
definitely not required because no ring is passed in).

The call chains above _sync() now support passing a request through which most
callers passing in NULL and assuming that no request will be required (because
they also pass in NULL for the ring and therefore can't be generating any ring
code).

The exeception is intel_crtc_page_flip() which now supports having a request
returned from _sync(). If one is, then that request is shared by the page flip
(if the page flip is of a type to need a request). If _sync() does not generate
a request but the page flip does need one, then the page flip path will create
its own request.

v3: Updated comment description to be clearer about 'to_req' parameter (Tomas
Elf review request). Rebased onto newer tree that significantly changed the
synchronisation code.

v4: Updated comments from review feedback (Tomas Elf)

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

91af127f

drm/i915: Update i915_switch_context() to take a request structure · ba01cc93

由 John Harrison 提交于 5月 29, 2015

Now that the request is guaranteed to specify the context, it is possible to
update the context switch code to use requests rather than ring and context
pairs. This patch updates i915_switch_context() accordingly.

Also removed the warning that the request's context must match the last context
switch's context. As the context switch now gets the context object from the
request structure, there is no longer any scope for the two to become out of
step.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

ba01cc93

drm/i915: Update ppgtt_init_ring() & context_enable() to take requests · b3dd6b96

由 John Harrison 提交于 5月 29, 2015

The final step in removing the OLR from i915_gem_init_hw() is to pass the newly
allocated request structure in to each step rather than passing a ring
structure. This patch updates both i915_ppgtt_init_ring() and
i915_gem_context_enable() to take request pointers.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

b3dd6b96

drm/i915: Add explicit request management to i915_gem_init_hw() · dc4be607

由 John Harrison 提交于 5月 29, 2015

Now that a single per ring loop is being done for all the different
intialisation steps in i915_gem_init_hw(), it is possible to add proper request
management as well. The last remaining issue is that the context enable call
eventually ends up within *_render_state_init() and this does its own private
_i915_add_request() call.

This patch adds explicit request creation and submission to the top level loop
and removes the add_request() from deep within the sub-functions.

v2: Updated for removal of batch_obj from add_request call in previous patch.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

dc4be607

drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable() · 90638cc1

由 John Harrison 提交于 5月 29, 2015

The start of day context initialisation code in i915_gem_context_enable() loops
over each ring and calls the legacy switch context or the execlist init context
code as appropriate.

This patch moves the ring looping out of that function in to the top level
caller i915_gem_init_hw(). This means the a single pass can be made over all
rings doing the PPGTT, L3 remap and context initialisation of each ring
altogether.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

90638cc1

drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring · 4ad2fd88

由 John Harrison 提交于 6月 18, 2015

The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions. Multiple of which have generic sections and per ring sections. This
means multiple passes are done over the rings. Each pass writes data to the ring
which floats around in that ring's OLR until some random point in the future
when an add_request() is done by some random other piece of code.

This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().

v2: Fix dumb loop variable re-use.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com> (v1)
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

4ad2fd88

drm/i915: Update i915_gpu_idle() to manage its own request · 73cfa865

由 John Harrison 提交于 5月 29, 2015

Added explicit request creation and submission to the GPU idle code path.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

73cfa865

drm/i915: Add flag to i915_add_request() to skip the cache flush · 5b4a60c2

由 John Harrison 提交于 5月 29, 2015

In order to explcitly track all GPU work (and completely remove the outstanding
lazy request), it is necessary to add extra i915_add_request() calls to various
places. Some of these do not need the implicit cache flush done as part of the
standard batch buffer submission process.

This patch adds a flag to _add_request() to specify whether the flush is
required or not.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

5b4a60c2

drm/i915: Update alloc_request to return the allocated request · 217e46b5

由 John Harrison 提交于 5月 29, 2015

The alloc_request() function does not actually return the newly allocated
request. Instead, it must be pulled from ring->outstanding_lazy_request. This
patch fixes this so that code can create a request and start using it knowing
exactly which request it actually owns.

v2: Updated for new i915_gem_request_alloc() scheme.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

217e46b5

drm/i915: Set context in request from creation even in legacy mode · 40e895ce

由 John Harrison 提交于 5月 29, 2015

In execlist mode, the context object pointer is written in to the request
structure (and reference counted) at the point of request creation. In legacy
mode, this only happens inside i915_add_request().

This patch updates the legacy code path to match the execlist version. This
allows all the intermediate code between request creation and request submission
to get at the context object given only a request structure. Thus negating the
need to pass context pointers here, there and everywhere.

v2: Moved the context reference so it does not need to be undone if the
get_seqno() fails.

v3: Fixed execlist mode always hitting a warning about invalid last_contexts
(which don't exist in execlist mode).

v4: Updated for new i915_gem_request_alloc() scheme.

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

40e895ce

drm/i915: i915_add_request must not fail · bf7dc5b7

由 John Harrison 提交于 5月 29, 2015

The i915_add_request() function is called to keep track of work that has been
written to the ring buffer. It adds epilogue commands to track progress (seqno
updates and such), moves the request structure onto the right list and other
such house keeping tasks. However, the work itself has already been written to
the ring and will get executed whether or not the add request call succeeds. So
no matter what goes wrong, there isn't a whole lot of point in failing the call.

At the moment, this is fine(ish). If the add request does bail early on and not
do the housekeeping, the request will still float around in the
ring->outstanding_lazy_request field and be picked up next time. It means
multiple pieces of work will be tagged as the same request and driver can't
actually wait for the first piece of work until something else has been
submitted. But it all sort of hangs together.

This patch series is all about removing the OLR and guaranteeing that each piece
of work gets its own personal request. That means that there is no more
'hoovering up of forgotten requests'. If the request does not get tracked then
it will be leaked. Thus the add request call _must_ not fail. The previous patch
should have already ensured that it _will_ not fail by removing the potential
for running out of ring space. This patch enforces the rule by actually removing
the early exit paths and the return code.

Note that if something does manage to fail and the epilogue commands don't get
written to the ring, the driver will still hang together. The request will be
added to the tracking lists. And as in the old case, any subsequent work will
generate a new seqno which will suffice for marking the old one as complete.

v2: Improved WARNings (Tomas Elf review request).

For: VIZ-5115
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Reviewed-by: NTomas Elf <tomas.elf@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

bf7dc5b7

drm/i915: Reserve ring buffer space for i915_add_request() commands · 29b1b415

由 John Harrison 提交于 6月 18, 2015

It is a bad idea for i915_add_request() to fail. The work will already have been
send to the ring and will be processed, but there will not be any tracking or
management of that work.

The only way the add request call can fail is if it can't write its epilogue
commands to the ring (cache flushing, seqno updates, interrupt signalling). The
reasons for that are mostly down to running out of ring buffer space and the
problems associated with trying to get some more. This patch prevents that
situation from happening in the first place.

When a request is created, it marks sufficient space as reserved for the
epilogue commands. Thus guaranteeing that by the time the epilogue is written,
there will be plenty of space for it. Note that a ring_begin() call is required
to actually reserve the space (and do any potential waiting). However, that is
not currently done at request creation time. This is because the ring_begin()
code can allocate a request. Hence calling begin() from the request allocation
code would lead to infinite recursion! Later patches in this series remove the
need for begin() to do the allocate. At that point, it becomes safe for the
allocate to call begin() and really reserve the space.

Until then, there is a potential for insufficient space to be available at the
point of calling i915_add_request(). However, that would only be in the case
where the request was created and immediately submitted without ever calling
ring_begin() and adding any work to that request. Which should never happen. And
even if it does, and if that request happens to fall down the tiny window of
opportunity for failing due to being out of ring space then does it really
matter because the request wasn't doing anything in the first place?

v2: Updated the 'reserved space too small' warning to include the offending
sizes. Added a 'cancel' operation to clean up when a request is abandoned. Added
re-initialisation of tracking state after a buffer wrap to keep the sanity
checks accurate.

v3: Incremented the reserved size to accommodate Ironlake (after finally
managing to run on an ILK system). Also fixed missing wrap code in LRC mode.

v4: Added extra comment and removed duplicate WARN (feedback from Tomas).

For: VIZ-5115
CC: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: NJohn Harrison <John.C.Harrison@Intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

29b1b415

22 6月, 2015 1 次提交

drm/i915: Remove unused ring argument from frontbuffer invalidate and busy functions. · 77a0d1ca

由 Rodrigo Vivi 提交于 6月 18, 2015

This patch doesn't have any functional change, but organize fruntbuffer
invalidate and busy by removing unecesarry signature argument for ring.

It was unsed on mark_fb_busy and only used on fb_obj_invalidate for the
same ORIGIN_CS usage. So let's clean it a bit

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>

77a0d1ca

15 6月, 2015 1 次提交

Revert "drm/i915: Don't skip request retirement if the active list is empty" · 245ec9d8

由 Jani Nikula 提交于 6月 15, 2015

This reverts commit 0aedb162.

I messed things up while applying [1] to drm-intel-fixes. Rectify.

[1] http://mid.gmane.org/1432827156-9605-1-git-send-email-ville.syrjala@linux.intel.com

Fixes: 0aedb162 ("drm/i915: Don't skip request retirement if the active list is empty")
Cc: stable@vger.kernel.org
Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

245ec9d8