- 06 8月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
-
- 31 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
My plan for the future is to have kernel contexts not to have a GEM context backpointer (as they will not belong to any GEM context). In a few places, we use ce->gem_context to simply obtain the i915 backpointer, for which we can use ce->engine->i915 instead. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190730163441.16477-1-chris@chris-wilson.co.uk
-
- 26 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Smatch warning that the loop may be empty causing us to check err before it had been set. Ensure that it is initialised to 0, just in case. v2: Refactor the inner loop for better scooping and clarity Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly") Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190726131458.8310-1-chris@chris-wilson.co.uk
-
- 19 7月, 2019 1 次提交
-
-
由 Matteo Croce 提交于
In the sysctl code the proc_dointvec_minmax() function is often used to validate the user supplied value between an allowed range. This function uses the extra1 and extra2 members from struct ctl_table as minimum and maximum allowed value. On sysctl handler declaration, in every source file there are some readonly variables containing just an integer which address is assigned to the extra1 and extra2 members, so the sysctl range is enforced. The special values 0, 1 and INT_MAX are very often used as range boundary, leading duplication of variables like zero=0, one=1, int_max=INT_MAX in different source files: $ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l 248 Add a const int array containing the most commonly used values, some macros to refer more easily to the correct array member, and use them instead of creating a local one for every object file. This is the bloat-o-meter output comparing the old and new binary compiled with the default Fedora config: # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164) Data old new delta sysctl_vals - 12 +12 __kstrtab_sysctl_vals - 12 +12 max 14 10 -4 int_max 16 - -16 one 68 - -68 zero 128 28 -100 Total: Before=20583249, After=20583085, chg -0.00% [mcroce@redhat.com: tipc: remove two unused variables] Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com [akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c] [arnd@arndb.de: proc/sysctl: make firmware loader table conditional] Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de [akpm@linux-foundation.org: fix fs/eventpoll.c] Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.comSigned-off-by: NMatteo Croce <mcroce@redhat.com> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NKees Cook <keescook@chromium.org> Reviewed-by: NAaron Tomlin <atomlin@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: NAndrew Morton <akpm@linux-foundation.org> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 17 7月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
Avoid a global idle barrier by reconfiguring each context by rewriting them with MI_STORE_DWORD from the kernel context. v2: We only need to determine the desired register values once, they are the same for all contexts. v3: Don't remove the kernel context from the list of known GEM contexts; the world is not ready for that yet. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190716213443.9874-1-chris@chris-wilson.co.uk
-
- 10 7月, 2019 2 次提交
-
-
由 Lionel Landwerlin 提交于
This was dropped from the original patch series, we weren't sure whether it was needed at the time. More recent tests show it's definitely needed to have acurate performance data. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 19f81df2 ("drm/i915/perf: Add OA unit support for Gen 8+") Acked-by: NChris Wilson <chris@chris-wilson.co.uk> [ickle: combine duplicate code and comments] Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190710105524.23017-1-chris@chris-wilson.co.uk
-
由 Lionel Landwerlin 提交于
The i915 perf stream has its own file descriptor and is tied to reference of the driver. We haven't taken care of keep the driver alive. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: NChris Wilson <chris@chris-wilson.co.uk> Fixes: eec688e1 ("drm/i915: Add i915 perf infrastructure") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190709123351.5645-2-lionel.g.landwerlin@intel.com
-
- 27 6月, 2019 1 次提交
-
-
由 Michal Wajdeczko 提交于
OA files look to be auto-generated so we can keep them all in dedicated subdirectory. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Reviewed-by: NUmesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190626123826.39760-1-michal.wajdeczko@intel.com
-
- 25 6月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
We got the wrong offsets (could they have changed?). New values were computed off an error state by looking up the register offset in the context image as written by the HW. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: 1de401c0 ("drm/i915/perf: enable perf support on ICL") Acked-by: NKenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190610081914.25428-1-lionel.g.landwerlin@intel.com
-
- 14 6月, 2019 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NImre Deak <imre.deak@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
-
- 12 6月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
Gen10 added an additional NOA_WRITE register (high bits) and we forgot to whitelist it for userspace. Fixes: 95690a02 ("drm/i915/perf: enable perf support on CNL") Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NKenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190601225845.12600-1-lionel.g.landwerlin@intel.com (cherry picked from commit bf210f6c) Signed-off-by: NJani Nikula <jani.nikula@intel.com>
-
- 10 6月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
Gen10 added an additional NOA_WRITE register (high bits) and we forgot to whitelist it for userspace. Fixes: 95690a02 ("drm/i915/perf: enable perf support on CNL") Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NKenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190601225845.12600-1-lionel.g.landwerlin@intel.com
-
- 28 5月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Continuing the theme of separating out the GEM clutter. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Split the plain old shmem object into its own file to start decluttering i915_gem.c v2: Lose the confusing, hysterical raisins, suffix of _gtt. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-4-chris@chris-wilson.co.uk
-
- 27 4月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
We switched to a tree of per-engine HW context to accommodate the introduction of virtual engines. However, we plan to also support multiple instances of the same engine within the GEM context, defeating our use of the engine as a key to looking up the HW context. Just allocate a logical per-engine instance and always use an index into the ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user specified map. v2: Add for_each_gem_engine() helper to iterator within the engines lock v3: intel_context_create_request() helper v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough. v5: Push iterator locking to caller Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
We want to pass in a intel_context into intel_context_pin() and that requires us to first be able to lookup the intel_context! Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk
-
- 25 4月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Start acquiring the logical intel_context and using that as our primary means for request allocation. This is the initial step to allow us to avoid requiring struct_mutex for request allocation along the perma-pinned kernel context, but it also provides a foundation for breaking up the complex request allocation to handle different scenarios inside execbuf. For the purpose of emitting a request from inside retirement (see the next patch for engine power management), we also need to lift control over the timeline mutex to the caller. v2: Note that the request carries the active reference upon construction. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Acked-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
-
- 24 4月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
As we push for better compartmentalisation, it is more convenient to copy the default sseu configuration from the engine into the derived logical context, than it is to dig it out from i915->runtime_info. v2: Use intel_sseu_from_device_info() to describe the converter Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424095134.30249-1-chris@chris-wilson.co.uk
-
- 27 3月, 2019 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
The intel_uncore structure is the owner of register access, so subclass the function to it. While at it, use a local uncore var and switch to the new read/write functions where it makes sense. Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190325214940.23632-9-daniele.ceraolospurio@intel.com
-
- 22 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
When we return pages to the system, we ensure that they are marked as being in the CPU domain since any external access is uncontrolled and we must assume the worst. This means that we need to always flush the pages on acquisition if we need to use them on the GPU, and from the beginning have used set-domain. Set-domain is overkill for the purpose as it is a general synchronisation barrier, but our intent is to only flush the pages being swapped in. If we move that flush into the pages acquisition phase, we know then that when we have obj->mm.pages, they are coherent with the GPU and need only maintain that status without resorting to heavy handed use of set-domain. The principle knock-on effect for userspace is through mmap-gtt pagefaulting. Our uAPI has always implied that the GTT mmap was async (especially as when any pagefault occurs is unpredicatable to userspace) and so userspace had to apply explicit domain control itself (set-domain). However, swapping is transparent to the kernel, and so on first fault we need to acquire the pages and make them coherent for access through the GTT. Our use of set-domain here leaks into the uABI that the first pagefault was synchronous. This is unintentional and baring a few igt should be unoticed, nevertheless we bump the uABI version for mmap-gtt to reflect the change in behaviour. Another implication of the change is that gem_create() is presumed to create an object that is coherent with the CPU and is in the CPU write domain, so a set-domain(CPU) following a gem_create() would be a minor operation that merely checked whether we could allocate all pages for the object. On applying this change, a set-domain(CPU) causes a clflush as we acquire the pages. This will have a small impact on mesa as we move the clflush here on !llc from execbuf time to create, but that should have minimal performance impact as the same clflush exists but is now done early and because of the clflush issue, userspace recycles bo and so should resist allocating fresh objects. Internally, the presumption that objects are created in the CPU write-domain and remain so through writes to obj->mm.mapping is more prevalent than I expected; but easy enough to catch and apply a manual flush. For the future, we should push the page flush from the central set_pages() into the callers so that we can more finely control when it is applied, but for now doing it one location is easier to validate, at the cost of sometimes flushing when there is no need. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.william.auld@gmail.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Antonio Argenziano <antonio.argenziano@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NMatthew Auld <matthew.william.auld@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
-
- 21 3月, 2019 1 次提交
-
-
由 Daniele Ceraolo Spurio 提交于
Now that the internal code all works on intel_uncore, flip the external-facing interface. v2: fix GVT. Signed-off-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: NPaulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190319183543.13679-4-daniele.ceraolospurio@intel.com
-
- 14 3月, 2019 1 次提交
-
-
由 Rodrigo Vivi 提交于
This exactly same approach was already used from gen9 to gen10 and from gen10 to gen11. Let's also use it for gen11+. Let's first assume that we inherit a similar platform and than we apply the differences on top. Different from the previous attempts this will be done this time with coccinelle. We obviously need to exclude some case that is really exclusive for gen11 like PCH, Firmware, and few others. Luckly this was easy to filter by selecting the files we are touching with coccinelle as exposed below: spatch -sp_file gen11\+.cocci --in-place i915_perf.c \ intel_bios.c intel_cdclk.c intel_ddi.c \ intel_device_info.c intel_display.c intel_dpll_mgr.c \ intel_dsi_vbt.c intel_hdmi.c intel_mocs.c intel_color.c @noticelake@ expression e; @@ -!IS_ICELAKE(e) +INTEL_GEN(e) < 11 @notgen11@ expression e; @@ -!IS_GEN(e, 11) +INTEL_GEN(e) < 11 @icelake@ expression e; @@ -IS_ICELAKE(e) +INTEL_GEN(e) >= 11 @gen11@ expression e; @@ -IS_GEN(e, 11) +INTEL_GEN(e) >= 11 No functional change. v2: Remove intel_lrc.c per Tvrtko request since those were w/a for ICL hw issuea and media related configuration. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190308214300.25057-1-rodrigo.vivi@intel.com
-
- 08 3月, 2019 1 次提交
-
-
由 Chris Wilson 提交于
In preparation for an ever growing number of engines and so ever increasing static array of HW contexts within the GEM context, move the array over to an rbtree, allocated upon first use. Unfortunately, this imposes an rbtree lookup at a few frequent callsites, but we should be able to mitigate those by moving over to using the HW context as our primary type and so only incur the lookup on the boundary with the user GEM context and engines. v2: Check for no HW context in guc_stage_desc_init Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk
-
- 06 3月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Instead of passing the gem_context and engine to find the instance of the intel_context to use, pass around the intel_context instead. This is useful for the next few patches, where the intel_context is no longer a direct lookup. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190306084704.15755-1-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
In the next patch, we are introducing a broad virtual engine to encompass multiple physical engines, losing the 1:1 nature of BIT(engine->id). To reflect the broader set of engines implied by the virtual instance, lets store the full bitmask. v2: Use intel_engine_mask_t (s/ring_mask/engine_mask/) v3: Tvrtko voted for moah churn so teach everyone to not mention ring and use $class$instance throughout. v4: Comment upon the disparity in bspec for using VCS1,VCS2 in gen8 and VCS[0-4] in later gen. We opt to keep the code consistent and use 0-index naming throughout. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190305180332.30900-1-chris@chris-wilson.co.uk
-
- 05 3月, 2019 1 次提交
-
-
由 Rodrigo Vivi 提交于
No functional change. Just a reorg to match the preferred behavior. When rebasing internal branch on top of latest sort I noticed few more cases that needs to get reordered. Let's do in a bundle this time and hoping there's no other missing places. v2: Check for HSW/BDW ULT before generic IS_HASWELL or IS_BROADWELL or it doesn't work as pointed by Ville. But also ULT came afterwards anyway. v3: Accepting suggestions from Lucas: Sort CNL/CFL, KBL/SKL, and use <= 8 removing chv and bdw. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: NLucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190301172703.12139-1-rodrigo.vivi@intel.com
-
- 05 2月, 2019 1 次提交
-
-
由 Lionel Landwerlin 提交于
If some of the contexts submitting workloads to the GPU have been configured to shutdown slices/subslices, we might loose the NOA configurations written in the NOA muxes. One possible solution to this problem is to reprogram the NOA muxes when we switch to a new context. We initially tried this in the workaround batchbuffer but some concerns where raised about the cost of reprogramming at every context switch. This solution is also not without consequences from the userspace point of view. Reprogramming of the muxes can only happen once the powergating configuration has changed (which happens after context switch). This means for a window of time during the recording, counters recorded by the OA unit might be invalid. This requires userspace dealing with OA reports to discard the invalid values. Minimizing the reprogramming could be implemented by tracking of the last programmed configuration somewhere in GGTT and use MI_PREDICATE to discard some of the programming commands, but the command streamer would still have to parse all the MI_LRI instructions in the workaround batchbuffer. Another solution, which this change implements, is to simply disregard the user requested configuration for the period of time when i915/perf is active. On most platforms there are no issues with this apart from a performance penality for some media workloads that benefit from running on a partially powergated GPU. We already prevent RC6 from affecting the programming so it doesn't sound completely unreasonable to hold on powergating for the same reason. On Icelake however there would a functional problem if the slices not- containing the VME block were left enabled with a running media workload which explicitly disabled them. To avoid a GPU hang in this case, on Icelake we lock the enablement to only slices which contain VME blocks. Downside is that it means degraded GPU performance when OA is active but there is no known alternative solution for this. v2: Leave RPCS programming in intel_lrc.c (Lionel) v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel) More to_intel_context() (Tvrtko) s/dev_priv/i915/ (Tvrtko) Tvrtko Ursulin: v4: * Rebase for make_rpcs changes. v5: * Apply OA restriction from make_rpcs directly. v6: * Rebase for context image setup changes. v7: * Move stream assignment before metric enable. v8-9: * Rebase. v10: * Squashed with ICL support patch. Bspec: 21140 Co-developed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v9 Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190205095032.22673-2-tvrtko.ursulin@linux.intel.com
-
- 17 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
Mixed C99 and kernel types use is getting ugly. Prefer kernel types. sed -i 's/\buint\(8\|16\|32\|64\)_t\b/u\1/g' Minor checkpatch fixes sprinkled on top of the changed lines. Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NJosé Roberto de Souza <jose.souza@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/14ed72e7f04c9340a057855c5950b54811f8a477.1547629303.git.jani.nikula@intel.com
-
- 15 1月, 2019 2 次提交
-
-
由 Chris Wilson 提交于
Keep track of our wakeref used to keep the device awake so we can catch any leak. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-7-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
The majority of runtime-pm operations are bounded and scoped within a function; these are easy to verify that the wakeref are handled correctly. We can employ the compiler to help us, and reduce the number of wakerefs tracked when debugging, by passing around cookies provided by the various rpm_get functions to their rpm_put counterpart. This makes the pairing explicit, and given the required wakeref cookie the compiler can verify that we pass an initialised value to the rpm_put (quite handy for double checking error paths). For regular builds, the compiler should be able to eliminate the unused local variables and the program growth should be minimal. Fwiw, it came out as a net improvement as gcc was able to refactor rpm_get and rpm_get_if_in_use together, v2: Just s/rpm_put/rpm_put_unchecked/ everywhere, leaving the manual mark up for smaller more targeted patches. v3: Mention the cookie in Returns Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: NMika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-2-chris@chris-wilson.co.uk
-
- 04 1月, 2019 1 次提交
-
-
由 Linus Torvalds 提交于
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 02 1月, 2019 1 次提交
-
-
由 Jani Nikula 提交于
First move the low hanging fruit, the fields that are only initialized runtime. Use RUNTIME_INFO() exclusively to access the fields. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c24fe7a4b0492a888690c46814c0ff21ce2f12b1.1546267488.git.jani.nikula@intel.com
-
- 13 12月, 2018 3 次提交
-
-
由 Lucas De Marchi 提交于
Instead of using IS_GEN() for consecutive gen checks, let's pass the range to IS_GEN_RANGE(). By code inspection these were the ranges deemed necessary for spatch: @@ expression e; @@ ( - IS_GEN(e, 3) || IS_GEN(e, 2) + IS_GEN_RANGE(e, 2, 3) | - IS_GEN(e, 3) || IS_GEN(e, 4) + IS_GEN_RANGE(e, 3, 4) | - IS_GEN(e, 5) || IS_GEN(e, 6) + IS_GEN_RANGE(e, 5, 6) | - IS_GEN(e, 6) || IS_GEN(e, 7) + IS_GEN_RANGE(e, 6, 7) | - IS_GEN(e, 7) || IS_GEN(e, 8) + IS_GEN_RANGE(e, 7, 8) | - IS_GEN(e, 8) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 8, 9) | - IS_GEN(e, 10) || IS_GEN(e, 9) + IS_GEN_RANGE(e, 9, 10) | - IS_GEN(e, 9) || IS_GEN(e, 10) + IS_GEN_RANGE(e, 9, 10) ) After conversion, checking we don't have any missing IS_GEN_RANGE() || IS_GEN() was also done. Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-3-lucas.demarchi@intel.com
-
由 Lucas De Marchi 提交于
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of gen_mask to do the comparison. Now callers can pass then gen as a parameter, so we don't require one macro for each gen. The following spatch was used to convert the users of these macros: @@ expression e; @@ ( - IS_GEN2(e) + IS_GEN(e, 2) | - IS_GEN3(e) + IS_GEN(e, 3) | - IS_GEN4(e) + IS_GEN(e, 4) | - IS_GEN5(e) + IS_GEN(e, 5) | - IS_GEN6(e) + IS_GEN(e, 6) | - IS_GEN7(e) + IS_GEN(e, 7) | - IS_GEN8(e) + IS_GEN(e, 8) | - IS_GEN9(e) + IS_GEN(e, 9) | - IS_GEN10(e) + IS_GEN(e, 10) | - IS_GEN11(e) + IS_GEN(e, 11) ) v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than using the bitmask Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
-
由 Lucas De Marchi 提交于
RANGE makes it longer, but clearer. We are also going to add a macro to check an individual gen, so add the _RANGE prefix here. Diff generated with: sed 's/IS_GEN(/IS_GEN_RANGE(/g' drivers/gpu/drm/i915/{*/,}*.{c,h} -i v2: use IS_GEN rather than GT_GEN Signed-off-by: NLucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-1-lucas.demarchi@intel.com
-
- 19 11月, 2018 2 次提交
-
-
由 Joonas Lahtinen 提交于
Userspace portion is still missing. This reverts commit 9fa6e2f7. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181116135510.13807-2-joonas.lahtinen@linux.intel.com
-
由 Joonas Lahtinen 提交于
Userspace portion is still missing. This reverts commit cd956bfc. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181116135510.13807-1-joonas.lahtinen@linux.intel.com
-
- 24 10月, 2018 1 次提交
-
-
由 Lionel Landwerlin 提交于
Forgot to add the description of this option in a previous commit. Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: cd956bfc ("drm/i915/perf: add a parameter to control the size of OA buffer") Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20181024105158.4732-1-lionel.g.landwerlin@intel.com
-
- 23 10月, 2018 1 次提交
-
-
由 Lionel Landwerlin 提交于
The way our hardware is designed doesn't seem to let us use the MI_RECORD_PERF_COUNT command without setting up a circular buffer. In the case where the user didn't request OA reports to be available through the i915 perf stream, we can set the OA buffer to the minimum size to avoid consuming memory which won't be used by the driver. v2: Simplify oa buffer size exponent selection (Chris) Reuse vma size field (Lionel) v3: Restrict size opening parameter to values supported by HW (Chris) v4: Drop out of date comment (Matt) Add debug message when buffer size is rejected (Matt) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181023100707.31738-5-lionel.g.landwerlin@intel.com
-