提交 · 6f6e0b217a93011f8e11b9a2d5521a08fcf36990 · openeuler / Kernel

11 7月, 2017 1 次提交

drm/rockchip: fix NULL check on devm_kzalloc() return value · 6f6e0b21

由 Gustavo A. R. Silva 提交于 7月 06, 2017

The right variable to check here is port, not dp.

This issue was detected using Coccinelle and the following semantic patch:

@@
expression x;
identifier fld;
@@

* x = devm_kzalloc(...);
  ... when != x == NULL
  x->fld
Signed-off-by: NGustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: NMark Yao <mark.yao@rock-chips.com>
Signed-off-by: NSean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170706215833.GA25411@embeddedgus

6f6e0b21

03 7月, 2017 1 次提交

drm/atomic: Add missing drm_atomic_state_clear to atomic_remove_fb · 4086d90c

由 Maarten Lankhorst 提交于 6月 29, 2017

All atomic state should be cleared when drm_modeset_backoff() is
called, because it drops all locks and the state becomes invalid.

The call to drm_atomic_state_clear was missing in atomic_remove_fb,
so add the missing call there.
Signed-off-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629115954.26029-1-maarten.lankhorst@linux.intel.comReviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Fixes: db8f6403 ("drm: Convert drm_framebuffer_remove to atomic, v4.")
Cc: stable@vger.kernel.org # v4.12-rc1+

4086d90c

29 6月, 2017 1 次提交

drm: vblank: Fix vblank timestamp update · 138b87fa

由 Laurent Pinchart 提交于 6月 29, 2017

Commit 3fcdcb27 ("drm/vblank: Switch to bool in_vblank_irq in
get_vblank_timestamp") inverted a condition by mistake that resulted in
vblank timestamps always being 0 on hardware without a vblank counter.
Fix it.

Fixes: 3fcdcb27 ("drm/vblank: Switch to bool in_vblank_irq in get_vblank_timestamp")
Suggested-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: NLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629123720.27173-1-laurent.pinchart+renesas@ideasonboard.com

138b87fa

21 6月, 2017 10 次提交

bridge: Fix panel-bridge error return on !panel. · e6f0acb2

由 Eric Anholt 提交于 6月 15, 2017

ERR_PTR() needs a negative errno argument.
Signed-off-by: NEric Anholt <eric@anholt.net>
Signed-off-by: NArchit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615175423.17954-1-eric@anholt.net

e6f0acb2

drm/arm: hdlcd: remove unused variables · fee4964f

由 Arnd Bergmann 提交于 6月 20, 2017

The last rework left behind two unused variables:

drm/arm/hdlcd_crtc.c: In function 'hdlcd_plane_atomic_update':
drm/arm/hdlcd_crtc.c:264:13: warning: unused variable 'src_y' [-Wunused-variable]
drm/arm/hdlcd_crtc.c:264:6: warning: unused variable 'src_x' [-Wunused-variable]

This removes them.

Fixes: b2ae06ae9834 ("drm/arm: hdlcd: Use CMA helper for plane buffer address calculation")
Signed-off-by: NArnd Bergmann <arnd@arndb.de>
Acked-by: NLiviu Dudau <liviu.dudau@arm.com>
Signed-off-by: NLiviu Dudau <liviu.dudau@arm.com>

fee4964f

drm/arm: hdlcd: Use CMA helper for plane buffer address calculation · 49a58f26

由 Liviu Dudau 提交于 6月 13, 2017

CMA has gained a recent helper function for calculating the start
of the plane buffer's physical address. Use that instead of the
hand rolled version.
Signed-off-by: NLiviu Dudau <liviu.dudau@arm.com>

49a58f26

drm/arm: hdlcd: Set the CRTC's port before binding the encoder. · de5cc815

由 Liviu Dudau 提交于 6月 06, 2017

The component-based encoder(s) used by HDLCD expect the CRTC port
to be set before binding in order to find the right endpoint.
Without this patch, the TDA19988 encoder driver prints a warning
"Falling back to first CRTC".
Signed-off-by: NLiviu Dudau <Liviu.Dudau@arm.com>

de5cc815

drm: Fix GETCONNECTOR regression · e94ac351

由 Daniel Vetter 提交于 6月 20, 2017

In

commit 91eefc05
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Dec 14 00:08:10 2016 +0100

    drm: Tighten locking in drm_mode_getconnector

I reordered the logic a bit in that IOCTL, but that broke userspace
since it'll get the new mode list, but not the new property values.
Fix that again.

v2: Fix up the error path handling when copy_to_user for the modes
failes (Dhinakaran).

Fixes: 91eefc05 ("drm: Tighten locking in drm_mode_getconnector")
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Reported-by: N"H.J. Lu" <hjl.tools@gmail.com>
Tested-by: N"H.J. Lu" <hjl.tools@gmail.com>
Cc: <stable@vger.kernel.org> # v4.11+
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100576
Cc: "H.J. Lu" <hjl.tools@gmail.com>
Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com>
Reviewed-by: NSean Paul <seanpaul@chromium.org>
Reviewed-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620202837.1701-1-daniel.vetter@ffwll.ch

e94ac351

D
drm/i915: remove rate_to_index, messed up merge. · 296923e1
由 Dave Airlie 提交于 6月 21, 2017
```
This was from a merge I did incorrectly.
Signed-off-by: NDave Airlie <airlied@redhat.com>
```
296923e1

drm/radeon: add a quirk for Toshiba Satellite L20-183 · acfd6ee4

由 Alex Deucher 提交于 6月 19, 2017

Fixes resume from suspend.

bug: https://bugzilla.kernel.org/show_bug.cgi?id=196121Reported-by: NPrzemek <soprwa@gmail.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

acfd6ee4

drm/radeon: add a PX quirk for another K53TK variant · 4eb59793

由 Alex Deucher 提交于 6月 19, 2017

Disable PX on these systems.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=101491Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

4eb59793

drm/amdgpu: adjust default display clock · 52b482b0

由 Alex Deucher 提交于 6月 15, 2017

Increase the default display clock on newer asics to
accomodate some high res modes with really high refresh
rates.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826Acked-by: NChunming Zhou <david1.zhou@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

52b482b0

drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating · 05b4017b

由 Alex Deucher 提交于 6月 15, 2017

We were using the wrong structure which lead to an overflow
on some boards.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387Acked-by: NChunming Zhou <david1.zhou@amd.com>
Acked-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

05b4017b

20 6月, 2017 5 次提交

drm/msm: Fix potential buffer overflow issue · 4a630fad

由 Kasin Li 提交于 6月 19, 2017

In function submit_create, if nr_cmds or nr_bos is assigned with
negative value, the allocated buffer may be small than intended.
Using this buffer will lead to buffer overflow issue.
Signed-off-by: NKasin Li <donglil@codeaurora.org>
Signed-off-by: NJordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: NRob Clark <robdclark@gmail.com>

4a630fad

drm/amdgpu: Optimize mutex usage (v4) · 5ac55629

由 Alex Xie 提交于 6月 16, 2017

In original function amdgpu_bo_list_get, the waiting
for result->lock can be quite long while mutex
bo_list_lock was holding. It can make other tasks
waiting for bo_list_lock for long period.

Secondly, this patch allows several tasks(readers of idr)
to proceed at the same time.

v2: use rcu and kref (Dave Airlie and Christian König)
v3: update v1 commit message (Michel Dänzer)
v4: rebase on upstream (Alex Deucher)
Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>

5ac55629

drm/amdgpu: Optimization of AMDGPU_BO_LIST_OP_CREATE (v2) · 99eea4df

由 Alex Xie 提交于 6月 16, 2017

v2: Remove duplication of zeroing of bo list (Christian König)
    Move idr_alloc function to end of ioctl (Christian König)
    Call kfree bo_list when amdgpu_bo_list_set return error.
    Combine the previous two patches into this patch.
    Add amdgpu_bo_list_set function prototype.
Signed-off-by: NAlex Xie <AlexBin.Xie@amd.com>
Reviewed-by: NChristian König <christian.koenig@amd.com>

99eea4df

drm/amdgpu: add Polaris12 DID · 6e88491c

由 Junshan Fang 提交于 6月 15, 2017

Signed-off-by: NJunshan Fang <Junshan.Fang@amd.com>
Reviewed-by: NRoger.He <Hongbo.He@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

6e88491c

drm/i915: Don't enable backlight at setup time. · a8ae0a77

由 Dhinakaran Pandiyan 提交于 6月 19, 2017

Maarten and Ville noticed that we are enabling backlight via DP aux very
early in the modeset_init path via the intel_dp_aux_setup_backlight()
function, since commit e7156c83 ("drm/i915: Add Backlight Control using
DPCD for eDP connectors (v9)"). Looks like all we need to do during
_setup_backlight() is read the current brightness state instead of
modifying it.

v2: Rewrote commit message.

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Signed-off-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: NJani Nikula <jani.nikula@intel.com>
Tested-by: NPuthikorn Voravootivat <puthik@chromium.org>
Fixes: e7156c83 ("drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)")
Link: http://patchwork.freedesktop.org/patch/msgid/1497384239-2965-1-git-send-email-dhinakaran.pandiyan@intel.comSigned-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit f6262bda)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497895708-19422-1-git-send-email-dhinakaran.pandiyan@intel.com

a8ae0a77

19 6月, 2017 6 次提交

drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic() · b7f5dd36

由 Ville Syrjälä 提交于 6月 01, 2017

If intel_crtc_disable_noatomic() were to ever get called during resume
we'd end up deadlocking since resume has its own acqcuire_ctx but
intel_crtc_disable_noatomic() still tries to use the
mode_config.acquire_ctx. Pass down the correct acquire ctx from the top.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b870 ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-3-ville.syrjala@linux.intel.comReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit da1d0e26)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

b7f5dd36

drm/i915: Fix deadlock witha the pipe A quirk during resume · 17b206c2

由 Ville Syrjälä 提交于 6月 01, 2017

Pass down the correct acquire context to the pipe A quirk load detect
hack during display resume. Avoids deadlocking the entire thing.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b870 ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-2-ville.syrjala@linux.intel.comReviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com>
(cherry picked from commit aecd36b8)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

17b206c2

drm/i915: Remove __GFP_NORETRY from our buffer allocator · ce2c5872

由 Chris Wilson 提交于 6月 09, 2017

I tried __GFP_NORETRY in the belief that __GFP_RECLAIM was effective. It
struggles with handling reclaim of our dirty buffers and relies on
reclaim via kswapd. As a result, a single pass of direct reclaim is
unreliable when i915 occupies the majority of available memory, and the
only means of effectively waiting on kswapd to amke progress is by not
setting the __GFP_NORETRY flag and lopping. That leaves us with the
dilemma of invoking the oomkiller instead of propagating the allocation
failure back to userspace where it can be handled more gracefully (one
hopes). In the future we may have __GFP_MAYFAIL to allow repeats up until
we genuinely run out of memory and the oomkiller would have been invoked.
Until then, let the oomkiller wreck havoc.

v2: Stop playing with side-effects of gfp flags and await __GFP_MAYFAIL
v3: Update comments that direct reclaim only appears to be ignoring our
dirty buffers!

Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Testcase: igt/gem_tiled_swapping
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michal Hocko <mhocko@suse.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-2-chris@chris-wilson.co.ukReviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit eaf41801)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

ce2c5872

drm/i915: Encourage our shrinker more when our shmemfs allocations fails · b8d5a9cc

由 Chris Wilson 提交于 6月 09, 2017

Commit 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than
incur the oom for gfx allocations") made the bold decision to try and
avoid the oomkiller by reporting -ENOMEM to userspace if our allocation
failed after attempting to free enough buffer objects. In short, it
appears we were giving up too easily (even before we start wondering if
one pass of reclaim is as strong as we would like). Part of the problem
is that if we only shrink just enough pages for our expected allocation,
the likelihood of those pages becoming available to us is less than 100%
To counter-act that we ask for twice the number of pages to be made
available. Furthermore, we allow the shrinker to pull pages from the
active list in later passes.

v2: Be a little more cautious in paging out gfx buffers, and leave that
to a more balanced approach from shrink_slab(). Important when combined
with "drm/i915: Start writeback from the shrinker" as anything shrunk is
immediately swapped out and so should be more conservative.

Fixes: 24f8e00a ("drm/i915: Prefer to report ENOMEM rather than incur the oom for gfx allocations")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-1-chris@chris-wilson.co.uk
(cherry picked from commit 4846bf0c)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>

b8d5a9cc

drm/i915: Differentiate between sw write location into ring and last hw read · a21ef715

由 Chris Wilson 提交于 6月 15, 2017

We need to keep track of the last location we ask the hw to read up to
(RING_TAIL) separately from our last write location into the ring, so
that in the event of a GPU reset we do not tell the HW to proceed into
a partially written request (which can happen if that request is waiting
for an external signal before being executed).

v2: Refactor intel_ring_reset() (Mika)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100144
Testcase: igt/gem_exec_fence/await-hang
Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests")
Fixes: d55ac5bf ("drm/i915: Defer transfer onto execution timeline to actual hw submission")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170425130049.26147-1-chris@chris-wilson.co.ukReviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
(cherry picked from commit e6ba9992)
Signed-off-by: NJani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615131129.3061-1-chris@chris-wilson.co.uk

a21ef715

D
drm/i915: Update DRIVER_DATE to 20170619 · 9ddb8e17
由 Daniel Vetter 提交于 6月 19, 2017
```
Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
```
9ddb8e17

17 6月, 2017 7 次提交

drm/msm: Separate locking of buffer resources from struct_mutex · 0e08270a

由 Sushmita Susheelendra 提交于 6月 13, 2017

Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, and also
between buffer object creation and GPU command submission.
Signed-off-by: NSushmita Susheelendra <ssusheel@codeaurora.org>
[robclark: squash in handling new locking for shrinker]
Signed-off-by: NRob Clark <robdclark@gmail.com>

0e08270a

B
drm/nouveau/disp/nv50-: avoid creating ORs that aren't present on HW · 7df1bb87
由 Ben Skeggs 提交于 6月 17, 2017
```
Signed-off-by: NBen Skeggs <bskeggs@redhat.com>
```
7df1bb87

drm/i915/cfl: Introduce Coffee Lake workarounds. · 46c26662

由 Rodrigo Vivi 提交于 6月 16, 2017

Coffee Lake inherit most of Kabylake production
workarounds.

v2: Fix typo on commit message and remove
    WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC,
    since as Mika pointed out they shouldn't be here for cfl
    according to BSpec.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: NMika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497653398-15722-1-git-send-email-rodrigo.vivi@intel.com

46c26662

amdgpu: use drm sync objects for shared semaphores (v6) · 660e8558

由 Dave Airlie 提交于 3月 13, 2017

This creates a new command submission chunk for amdgpu
to add in and out sync objects around the submission.

Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new
chunks, one for syncobj pre submission dependencies,
and one for post submission sync obj signalling,
and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD,
with input from Christian Konig on what things should look like.

In theory VkFences could be backed with sync objects and
just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose
it to userspace.

TODO: update to dep_sync when rebasing onto amdgpu master.
(with this - r-b from Christian)

v1.1: keep file reference on import.
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer.
v3: make more robust against CS failures, we now add the
wait sems but only remove them once the CS job has been
submitted.
v4: rewrite names of API and base on new syncobj code.
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences
in post deps stage (Christian)
Reviewed-by: NChristian König <christian.koenig@amd.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

660e8558

amdgpu/cs: split out fence dependency checking (v2) · 6f0308eb

由 Dave Airlie 提交于 3月 09, 2017

This just splits out the fence depenency checking into it's
own function to make it easier to add semaphore dependencies.

v2: rebase onto other changes.

v1-Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

6f0308eb

drm/amdgpu: don't check the default value for vm size · 64dab074

由 Alex Deucher 提交于 6月 15, 2017

Avoids printing spurious messages like this:
[    3.102059] amdgpu 0000:01:00.0: VM size (-1) must be a power of 2
Reviewed-by: NChristian König <christian.koenig@amd.com>
Reviewed-by: NMichel Dänzer <michel.daenzer@amd.com>
Signed-off-by: NAlex Deucher <alexander.deucher@amd.com>

64dab074

drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH · 28e0f4ee

由 Dhinakaran Pandiyan 提交于 6月 16, 2017

Although we use 9 bits of Device ID for identifying PCH, only 8 bits are
stored in dev_priv->pch_id. This makes HAS_PCH_CNP_LP() and
HAS_PCH_SPT_LP() incorrect. Fix this by storing all the 9 bits for the
platforms with LP PCH.

v2: Drop PCH_LPT_LP change (Imre)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Fixes: commit ec7e0bb3 ("drm/i915/cnp: Add PCI ID for Cannonpoint LP PCH")
Reported-by: NImre Deak <imre.deak@intel.com>
Reviewed-by: NImre Deak <imre.deak@intel.com>
Signed-off-by: NDhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: NImre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497641774-29104-1-git-send-email-dhinakaran.pandiyan@intel.com

28e0f4ee

16 6月, 2017 9 次提交

drm/i915: Stash a pointer to the obj's resv in the vma · 95ff7c7d

由 Chris Wilson 提交于 6月 16, 2017

During execbuf, a mandatory step is that we add this request (this
fence) to each object's reservation_object. Inside execbuf, we track the
vma, and to add the fence to the reservation_object then means having to
first chase the obj, incurring another cache miss. We can reduce the
number of cache misses by stashing a pointer to the reservation_object
in the vma itself.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk

95ff7c7d

drm/i915: Async GPU relocation processing · 7dd4f672

由 Chris Wilson 提交于 6月 16, 2017

If the user requires patching of their batch or auxiliary buffers, we
currently make the alterations on the cpu. If they are active on the GPU
at the time, we wait under the struct_mutex for them to finish executing
before we rewrite the contents. This happens if shared relocation trees
are used between different contexts with separate address space (and the
buffers then have different addresses in each), the 3D state will need
to be adjusted between execution on each context. However, we don't need
to use the CPU to do the relocation patching, as we could queue commands
to the GPU to perform it and use fences to serialise the operation with
the current activity and future - so the operation on the GPU appears
just as atomic as performing it immediately. Performing the relocation
rewrites on the GPU is not free, in terms of pure throughput, the number
of relocations/s is about halved - but more importantly so is the time
under the struct_mutex.

v2: Break out the request/batch allocation for clearer error flow.
v3: A few asserts to ensure rq ordering is maintained
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

7dd4f672

drm/i915: Allow execbuffer to use the first object as the batch · 1a71cf2f

由 Chris Wilson 提交于 6月 16, 2017

Currently, the last object in the execlist is the always the batch.
However, when building the batch buffer we often know the batch object
first and if we can use the first slot in the execlist we can emit
relocation instructions relative to it immediately and avoid a separate
pass to adjust the relocations to point to the last execlist slot.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

1a71cf2f

drm/i915: Wait upon userptr get-user-pages within execbuffer · 8a2421bd

由 Chris Wilson 提交于 6月 16, 2017

This simply hides the EAGAIN caused by userptr when userspace causes
resource contention. However, it is quite beneficial with highly
contended userptr users as we avoid repeating the setup costs and
kernel-user context switches.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NMichał Winiarski <michal.winiarski@intel.com>

8a2421bd

drm/i915: First try the previous execbuffer location · 616d9cee

由 Chris Wilson 提交于 6月 16, 2017

When choosing a slot for an execbuffer, we ideally want to use the same
address as last time (so that we don't have to rebind it) and the same
address as expected by the user (so that we don't have to fixup any
relocations pointing to it). If we first try to bind the incoming
execbuffer->offset from the user, or the currently bound offset that
should hopefully achieve the goal of avoiding the rebind cost and the
relocation penalty. However, if the object is not currently bound there
we don't want to arbitrarily unbind an object in our chosen position and
so choose to rebind/relocate the incoming object instead. After we
report the new position back to the user, on the next pass the
relocations should have settled down.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtien@linux.intel.com>

616d9cee

drm/i915: Store a persistent reference for an object in the execbuffer cache · dade2a61

由 Chris Wilson 提交于 6月 16, 2017

If we take a reference to the object/vma when it is first used in an
execbuf, we can keep that reference until the object's file-local handle
is closed. Thereby saving a frequent ref/unref pair.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

dade2a61

drm/i915: Eliminate lots of iterations over the execobjects array · 2889caa9

由 Chris Wilson 提交于 6月 16, 2017

The major scaling bottleneck in execbuffer is the processing of the
execobjects. Creating an auxiliary list is inefficient when compared to
using the execobject array we already have allocated.

Reservation is then split into phases. As we lookup up the VMA, we
try and bind it back into active location. Only if that fails, do we add
it to the unbound list for phase 2. In phase 2, we try and add all those
objects that could not fit into their previous location, with fallback
to retrying all objects and evicting the VM in case of severe
fragmentation. (This is the same as before, except that phase 1 is now
done inline with looking up the VMA to avoid an iteration over the
execobject array. In the ideal case, we eliminate the separate reservation
phase). During the reservation phase, we only evict from the VM between
passes (rather than currently as we try to fit every new VMA). In
testing with Unreal Engine's Atlantis demo which stresses the eviction
logic on gen7 class hardware, this speed up the framerate by a factor of
2.

The second loop amalgamation is between move_to_gpu and move_to_active.
As we always submit the request, even if incomplete, we can use the
current request to track active VMA as we perform the flushes and
synchronisation required.

The next big advancement is to avoid copying back to the user any
execobjects and relocations that are not changed.

v2: Add a Theory of Operation spiel.
v3: Fall back to slow relocations in preparation for flushing userptrs.
v4: Document struct members, factor out eb_validate_vma(), add a few
more comments to explain some magic and hide other magic behind macros.
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

2889caa9

drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations · 071750e5

由 Chris Wilson 提交于 6月 16, 2017

If we write a relocation into the buffer, we require our own implicit
synchronisation added after the start of the execbuf, outside of the
user's control. As we may end up clflushing, or doing the patch itself
on the GPU, asynchronously we need to look at the implicit serialisation
on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this
object.

If the user does trigger a stall for relocations, we make sure the stall
is complete enough so that the batch is not submitted before we complete
those relocations.

Fixes: 77ae9957 ("drm/i915: Enable userspace to opt-out of implicit fencing")
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

071750e5

drm/i915: Pass vma to relocate entry · 507d977f

由 Chris Wilson 提交于 6月 16, 2017

We can simplify our tracking of pending writes in an execbuf to the
single bit in the vma->exec_entry->flags, but that requires the
relocation function knowing the object's vma. Pass it along.

Note we have only been using a single bit to track flushing since

commit cc889e0f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jun 13 20:45:19 2012 +0200

    drm/i915: disable flushing_list/gpu_write_list

unconditionally flushed all render caches before the breadcrumb and

commit 6ac42f41
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Jul 21 12:25:01 2012 +0200

    drm/i915: Replace the complex flushing logic with simple invalidate/flush all

did away with the explicit GPU domain tracking. This was then codified
into the ABI with NO_RELOC in

commit ed5982e6
Author: Daniel Vetter <daniel.vetter@ffwll.ch> # Oi! Patch stealer!
Date:   Thu Jan 17 22:23:36 2013 +0100

    drm/i915: Allow userspace to hint that the relocations were known
Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com>

507d977f

openeuler / Kernel 1 年多 前同步成功

openeuler / Kernel
1 年多前同步成功