- 26 5月, 2017 1 次提交
-
-
由 Michal Wajdeczko 提交于
Buffer based command transport can replace MMIO based mechanism. It may be used to perform host-2-guc and guc-to-host communication. Portions of this patch are based on work by: Michel Thierry <michel.thierry@intel.com> Robert Beckett <robert.beckett@intel.com> Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> v2: use gem_object_pin_map (Chris) don't use DEBUG_RATELIMITED (Chris) don't track action stats (Chris) simplify next fence (Chris) use READ_ONCE (Chris) move blob allocation to new function (Chris) v3: use static owner id (Daniele) v4: but keep channel initialization generic (Daniele) and introduce owner_sub_id (Daniele) Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NDaniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170526111326.87280-3-michal.wajdeczko@intel.com
-
- 03 5月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Track the latest fence waited upon on each context, and only add a new asynchronous wait if the new fence is more recent than the recorded fence for that context. This requires us to filter out unordered timelines, which are noted by DMA_FENCE_NO_CONTEXT. However, in the absence of a universal identifier, we have to use our own i915->mm.unordered_timeline token. v2: Throw around the debug crutches v3: Inline the likely case of the pre-allocation cache being full. v4: Drop the pre-allocation support, we can lose the most recent fence in case of allocation failure -- it just means we may emit more awaits than strictly necessary but will not break. v5: Trim allocation size for leaf nodes, they only need an array of u32 not pointers. v6: Create mock_timeline to tidy selftest writing v7: s/intel_timeline_sync_get/intel_timeline_sync_is_later/ (Tvrtko) v8: Prune the stale sync points when we idle. v9: Include a small benchmark in the kselftests v10: Separate the idr implementation into its own compartment. (Tvrkto) v11: Refactor igt_sync kselftests to avoid deep nesting (Tvrkto) v12: __sync_leaf_idx() to assert that p->height is 0 when checking leaves v13: kselftests to investigate struct i915_syncmap itself (Tvrtko) v14: Foray into ascii art graphs v15: Take into account that the random lookup/insert does 2 prng calls, not 1, when benchmarking, and use for_each_set_bit() (Tvrtko) v16: Improved ascii art Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170503093924.5320-4-chris@chris-wilson.co.uk
-
- 07 3月, 2017 1 次提交
-
-
由 Jani Nikula 提交于
Emphasize that the VBT file is nowadays more about initializing and running stuff based on the VBT contents, not so much about being a "panel driver". No functional changes. Reviewed-by: NHans de Goede <hdegoede@redhat.com> Reviewed-by: NBob Paauwe <bob.j.paauwe@intel.com> Cc: Madhav Chauhan <madhav.chauhan@intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/b13cb012a555ff5eb56b5e4bb2b0205c3e025a99.1488810382.git.jani.nikula@intel.com
-
- 22 2月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Flushing the cachelines for an object is slow, can be as much as 100ms for a large framebuffer. We currently do this under the struct_mutex BKL on execution or on pageflip. But now with the ability to add fences to obj->resv for both flips and execbuf (and we naturally wait on the fence before CPU access), we can move the clflush operation to a workqueue and signal a fence for completion, thereby doing the work asynchronously and not blocking the driver or its clients. v2: Introduce i915_gem_clflush.h and use a new name, split out some extras into separate patches. Suggested-by: NAkash Goel <akash.goel@intel.com> Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170222114049.28456-5-chris@chris-wilson.co.uk
-
- 14 2月, 2017 1 次提交
-
-
由 Chris Wilson 提交于
Some pieces of code are independent of hardware but are very tricky to exercise through the normal userspace ABI or via debugfs hooks. Being able to create mock unit tests and execute them through CI is vital. Start by adding a central point where we can execute unit tests and a parameter to enable them. This is disabled by default as the expectation is that these tests will occasionally explode. To facilitate integration with igt, any parameter beginning with i915.igt__ is interpreted as a subtest executable independently via igt/drv_selftest. Two classes of selftests are recognised: mock unit tests and integration tests. Mock unit tests are run as soon as the module is loaded, before the device is probed. At that point there is no driver instantiated and all hw interactions must be "mocked". This is very useful for writing universal tests to exercise code not typically run on a broad range of architectures. Alternatively, you can hook into the live selftests and run when the device has been instantiated - hw interactions are real. v2: Add a macro for compiling conditional code for mock objects inside real objects. v3: Differentiate between mock unit tests and late integration test. v4: List the tests in natural order, use igt to sort after modparam. v5: s/late/live/ v6: s/unsigned long/unsigned int/ v7: Use igt_ prefixes for long helpers. v8: Deobfuscate macros overriding functions, stop using -I$(src) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170213171558.20942-1-chris@chris-wilson.co.uk
-
- 09 2月, 2017 1 次提交
-
-
由 Ville Syrjälä 提交于
Let's try to shrink intel_display.c a bit by moving the cdclk/rawclk stuff to a new file. It's all reasonably self contained so we don't even have to add that many non-static symbols. We'll also take the opportunity to shuffle around the functions a bit to get things in a more consistent order based on the platform. v2: Add kernel-docs (Ander) v3: Deal with IS_GEN9_BC() v4: Deal with i945gm_get_cdclk() Signed-off-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: NAnder Conselvan de Oliveira <conselvan2@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170207183305.19656-1-ville.syrjala@linux.intel.com
-
- 25 1月, 2017 1 次提交
-
-
由 Jerome Anand 提交于
Enable support for HDMI LPE audio mode on Baytrail and Cherrytrail when HDaudio controller is not detected Setup minimum required resources during i915_driver_load: 1. Create a platform device to share MMIO/IRQ resources 2. Make the platform device child of i915 device for runtime PM. 3. Create IRQ chip to forward HDMI LPE audio irqs. HDMI LPE audio driver (a standalone sound driver) probes the LPE audio device and creates a new sound card. Signed-off-by: NPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: NJerome Anand <jerome.anand@intel.com> Acked-by: NJani Nikula <jani.nikula@intel.com> Signed-off-by: NTakashi Iwai <tiwai@suse.de>
-
- 19 1月, 2017 1 次提交
-
-
由 Anusha Srivatsa 提交于
The HuC loading process is similar to GuC. The intel_uc_fw_fetch() is used for both cases. HuC loading needs to be before GuC loading. The WOPCM setting must be done early before loading any of them. v2: rebased on-top of drm-intel-nightly. removed if(HAS_GUC()) before the guc call. (D.Gordon) update huc_version number of format. v3: rebased to drm-intel-nightly, changed the file name format to match the one in the huc package. Changed dev->dev_private to to_i915() v4: moved function back to where it was. change wait_for_atomic to wait_for. v5: rebased. Changed the year in the copyright message to reflect the right year.Correct the comments,remove the unwanted WARN message, replace drm_gem_object_unreference() with i915_gem_object_put().Make the prototypes in intel_huc.h non-extern. v6: rebased. Update the file construction done by HuC. It is similar to GuC.Adopted the approach used in- https://patchwork.freedesktop.org/patch/104355/ <Tvrtko Ursulin> v7: Change dev to dev_priv in macro definition. Corrected comments. v8: rebased on top of drm-tip. Updated functions intel_huc_load(), intel_huc_init() and intel_uc_fw_fetch() to accept dev_priv instead of dev. Moved contents of intel_huc.h to intel_uc.h. v9: change SKL_FW_ to SKL_HUC_FW_. Add intel_ prefix to guc_wopcm_size(). Remove unwanted checks in intel_uc.h. Rename huc_fw in struct intel_huc to simply fw to avoid redundency. v10: rebased. Correct comments. Make intel_huc_fini() accept dev_priv instead of dev like intel_huc_init() and intel_huc_load().Move definition to i915_guc_reg.h from intel_uc.h. Clean DMA_CTRL bits after HuC DMA transfer in huc_ucode_xfer() instead of guc_ucode_xfer(). Add suitable WARNs to give extra info. v11: rebased. Add proper bias for HuC and make sure there are asserts on failure by using guc_ggtt_offset_vma(). Introduce intel_huc.c and remove intel_huc_loader.c since it has functions that do more than just loading.Correct year in copyright. v12: remove invalidates that are not required anymore. Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: NXiang Haihao <haihao.xiang@intel.com> Signed-off-by: NAnusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: NAlex Dai <yu.dai@intel.com> Signed-off-by: NPeter Antoine <peter.antoine@intel.com> Reviewed-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1484755558-1234-1-git-send-email-anusha.srivatsa@intel.com
-
- 18 1月, 2017 1 次提交
-
-
由 Michal Wajdeczko 提交于
Functions supporting GuC logging capabilities were spread across many files, with unnecessary exposures and mixed with unrelated code. Dedicate file will make maintenance of all GuC functions easier as more functions are coming to support GuC submissions. Signed-off-by: NMichal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NArkadiusz Hiler <arkadiusz.hiler@intel.com> Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170113174157.104492-1-michal.wajdeczko@intel.com
-
- 13 12月, 2016 1 次提交
-
-
由 Tomeu Vizoso 提交于
In preparation to using a generic API in the DRM core for continuous CRC generation, move the related code out of i915_debugfs.c into a new file. Eventually, only the Intel-specific code will remain in this new file. v2: Rebased. v6: Rebased. v7: Fix whitespace issue. v9: Have intel_display_crc_init accept a drm_i915_private instead. v12: Rebased. Signed-off-by: NTomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: NEmil Velikov <emil.velikov@collabora.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1481545788-18194-1-git-send-email-tomeu.vizoso@collabora.com
-
- 26 11月, 2016 1 次提交
-
-
由 Arkadiusz Hiler 提交于
guc_send(), guc_recv() and related functions were introduced in the i915_guc_submission.c and their scope was limited only to that file. Those are not submission specific though. This patch moves moves them to intel_uc.c with intel_ prefix added. v2: rename intel_guc_log_* functions and clean up intel_guc_send usages Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: NArkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1480096777-12573-4-git-send-email-arkadiusz.hiler@intel.comSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 22 11月, 2016 2 次提交
-
-
由 Robert Bragg 提交于
Adds a static OA unit, MUX + B Counter configuration for basic render metrics on Haswell. This is auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml SYSFS=0 WHITELIST=RenderBasic Signed-off-by: NRobert Bragg <robert@sixbynine.org> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NSourab Gupta <sourab.gupta@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-6-robert@sixbynine.org
-
由 Robert Bragg 提交于
Adds base i915 perf infrastructure for Gen performance metrics. This adds a DRM_IOCTL_I915_PERF_OPEN ioctl that takes an array of uint64 properties to configure a stream of metrics and returns a new fd usable with standard VFS system calls including read() to read typed and sized records; ioctl() to enable or disable capture and poll() to wait for data. A stream is opened something like: uint64_t properties[] = { /* Single context sampling */ DRM_I915_PERF_PROP_CTX_HANDLE, ctx_handle, /* Include OA reports in samples */ DRM_I915_PERF_PROP_SAMPLE_OA, true, /* OA unit configuration */ DRM_I915_PERF_PROP_OA_METRICS_SET, metrics_set_id, DRM_I915_PERF_PROP_OA_FORMAT, report_format, DRM_I915_PERF_PROP_OA_EXPONENT, period_exponent, }; struct drm_i915_perf_open_param parm = { .flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | I915_PERF_FLAG_DISABLED, .properties_ptr = (uint64_t)properties, .num_properties = sizeof(properties) / 16, }; int fd = drmIoctl(drm_fd, DRM_IOCTL_I915_PERF_OPEN, ¶m); Records read all start with a common { type, size } header with DRM_I915_PERF_RECORD_SAMPLE being of most interest. Sample records contain an extensible number of fields and it's the DRM_I915_PERF_PROP_SAMPLE_xyz properties given when opening that determine what's included in every sample. No specific streams are supported yet so any attempt to open a stream will return an error. v2: use i915_gem_context_get() - Chris Wilson v3: update read() interface to avoid passing state struct - Chris Wilson fix some rebase fallout, with i915-perf init/deinit v4: s/DRM_IORW/DRM_IOW/ - Emil Velikov Signed-off-by: NRobert Bragg <robert@sixbynine.org> Reviewed-by: NMatthew Auld <matthew.auld@intel.com> Reviewed-by: NSourab Gupta <sourab.gupta@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161107194957.3385-2-robert@sixbynine.org
-
- 11 11月, 2016 1 次提交
-
-
由 Joonas Lahtinen 提交于
As a side product, had to split two other files; - i915_gem_fence_reg.h - i915_gem_object.h (only parts that needed immediate untanglement) I tried to move code in as big chunks as possible, to make review easier. i915_vma_compare was moved to a header temporarily. v2: - Use i915_gem_fence_reg.{c,h} v3: - Rebased v4: - Fix building when DEBUG_GEM is enabled by reordering a bit. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
-
- 02 11月, 2016 1 次提交
-
-
由 Mika Kuoppala 提交于
Create new file for hangcheck specific code, intel_hangcheck.c, and move all related code in it. v2: s/intel_engine_hangcheck/intel_engine (Chris) No functional changes. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NChris Wilson <chris@chris-wilson.co.uk> Signed-off-by: NMika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1478018583-5816-1-git-send-email-mika.kuoppala@intel.com
-
- 29 10月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
Our timelines are more than just a seqno. They also provide an ordered list of requests to be executed. Due to the restriction of handling individual address spaces, we are limited to a timeline per address space but we use a fence context per engine within. Our first step to introducing independent timelines per context (i.e. to allow each context to have a queue of requests to execute that have a defined set of dependencies on other requests) is to provide a timeline abstraction for the global execution queue. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
Quite a few of our objects used for internal hardware programming do not benefit from being swappable or from being zero initialised. As such they do not benefit from using a shmemfs backing storage and since they are internal and never directly exposed to the user, we do not need to worry about providing a filp. For these we can use an drm_i915_gem_object wrapper around a sg_table of plain struct page. They are not swap backed and not automatically pinned. If they are reaped by the shrinker, the pages are released and the contents discarded. For the internal use case, this is fine as for example, ringbuffers are pinned from being written by a request to be read by the hardware. Once they are idle, they can be discarded entirely. As such they are a good match for execlist ringbuffers and a small variety of other internal objects. In the first iteration, this is limited to the scratch batch buffers we use (for command parsing and state initialisation). v2: Allocate physically contiguous pages, where possible. v3: Reduce maximum order on subsequent requests following an allocation failure. v4: Fix up mismatch between swiotlb segment size and page count (it counts in 2k units, not 4k pages) Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-7-chris@chris-wilson.co.uk
-
- 18 10月, 2016 1 次提交
-
-
由 Shashank Sharma 提交于
This patch adds a new file, to accommodate lspcon support for I915 driver. These functions probe, detect, initialize and configure an on-board lspcon device during the driver init time. Also, this patch adds a small structure for lspcon device, which will provide the runtime status of the device. V2: addressed ville's review comments - Clean the leftover macros from previous patch set V3: Rebase V4: addressed ville's review comments - make internal functions static - remove lspcon_detect_identifier, make it inline with lspcon_probe - remove is_lspcon_active function - remove force check while setting a lspcon mode V5: Rebase V6: Pass dev_priv to IS_GEN9 check Signed-off-by: NShashank Sharma <shashank.sharma@intel.com> Signed-off-by: NAkashdeep Sharma <akashdeep.sharma@intel.com> Reviewed-by: NRodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1476455212-27893-3-git-send-email-shashank.sharma@intel.com
-
- 12 10月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
We currently capture the GPU state after we detect a hang. This is vital for us to both triage and debug hangs in the wild (post-mortem debugging). However, it comes at the cost of running some potentially dangerous code (since it has to make very few assumption about the state of the driver) that is quite resource intensive. This patch introduces both a method to disable error capture at runtime (for users who hit bugs at runtime and need a workaround) and to disable error capture at compiletime (for realtime users who want to minimise any possible latency, and never require error capture, saving ~30k of code). The cost is that we now have to be wary of (and test!) a kconfig flag and a module parameter. The effect of the module parameter is easy to verify through code inspection and runtime testing, but a kconfig flag needs regular compile checking. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: NJani Nikula <jani.nikula@linux.intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-2-chris@chris-wilson.co.uk
-
- 09 9月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
This is really a core kernel struct in disguise until we can finally place it in kernel/. There is an immediate need for a fence collection mechanism that is more flexible than fence-array, in particular being able to easily drive request submission via events (and not just interrupt driven). The same mechanism would be useful for handling nonblocking and asynchronous atomic modesets, parallel execution and more, but for the time being just create a local sw fence for execbuf. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-1-chris@chris-wilson.co.uk
-
- 20 8月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Very old numbers indicate this is a 66% improvement when remapping the entire object for fence contention - due to the elimination of track_pfn_insert and its strcmp. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Testcase: igt/gem_fence_upload/performance Testcase: igt/gem_mmap_gtt Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-6-chris@chris-wilson.co.uk
-
- 12 8月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
This patch provides the infrastructure for performing a 16-byte aligned read from WC memory using non-temporal instructions introduced with sse4.1. Using movntdqa we can bypass the CPU caches and read directly from memory and ignoring the page attributes set on the CPU PTE i.e. negating the impact of an otherwise UC access. Copying using movntdqa from WC is almost as fast as reading from WB memory, modulo the possibility of both hitting the CPU cache or leaving the data in the CPU cache for the next consumer. (The CPU cache itself my be flushed for the region of the movntdqa and on later access the movntdqa reads from a separate internal buffer for the cacheline.) The write back to the memory is however cached. This will be used in later patches to accelerate accessing WC memory. v2: Report whether the accelerated copy is successful/possible. v3: Function alignment override was only necessary when using the function target("sse4.1") - which is not necessary for emitting movntdqa from __asm__. v4: Improve notes on CPU cache behaviour vs non-temporal stores. v5: Fix byte offsets for unrolled moves. v6: Find all remaining typos of "movntqda", use kernel_fpu_begin. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Akash Goel <akash.goel@intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk
-
- 04 8月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
With the introduction of requests, we amplified the number of atomic refcounted objects we use and update every execbuffer; from none to several references, and a set of references that need to be changed. We also introduced interesting side-effects in the order of retiring requests and objects. Instead of independently tracking the last request for an object, track the active objects for each request. The object will reside in the buffer list of its most recent active request and so we reduce the kref interchange to a list_move. Now retirements are entirely driven by the request, dramatically simplifying activity tracking on the object themselves, and removing the ambiguity between retiring objects and retiring requests. Furthermore with the consolidation of managing the activity tracking centrally, we can look forward to using RCU to enable lockless lookup of the current active requests for an object. In the future, we will be able to query the status or wait upon rendering to an object without even touching the struct_mutex BKL. All told, less code, simpler and faster, and more extensible. v2: Add a typedef for the function pointer for convenience later. v3: Make the noop retirement callback explicit. Allow passing NULL to the init_request_active() which is expanded to a common noop function. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-16-git-send-email-chris@chris-wilson.co.uk
-
- 20 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Migrate the request operations out of the main body of i915_gem.c and into their own C file for easier expansion. v2: Move __i915_add_request() across as well Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Acked-by: NMika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1469002875-2335-1-git-send-email-chris@chris-wilson.co.uk
-
- 14 7月, 2016 1 次提交
-
-
由 Tvrtko Ursulin 提交于
Common code deserves to be put in a separate file from legacy and execlists implementation for clarity and ease of maintenance. Signed-off-by: NTvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris-wilson.co.uk>
-
- 05 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Let's reclaim a few hundred lines from i915_drv.c by splitting out the runtime configuration of the "constant" dev_priv->info. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-1-git-send-email-chris@chris-wilson.co.ukReviewed-by: NMatthew Auld <matthew.auld@intel.com>
-
- 02 7月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
One particularly stressful scenario consists of many independent tasks all competing for GPU time and waiting upon the results (e.g. realtime transcoding of many, many streams). One bottleneck in particular is that each client waits on its own results, but every client is woken up after every batchbuffer - hence the thunder of hooves as then every client must do its heavyweight dance to read a coherent seqno to see if it is the lucky one. Ideally, we only want one client to wake up after the interrupt and check its request for completion. Since the requests must retire in order, we can select the first client on the oldest request to be woken. Once that client has completed his wait, we can then wake up the next client and so on. However, all clients then incur latency as every process in the chain may be delayed for scheduling - this may also then cause some priority inversion. To reduce the latency, when a client is added or removed from the list, we scan the tree for completed seqno and wake up all the completed waiters in parallel. Using igt/benchmarks/gem_latency, we can demonstrate this effect. The benchmark measures the number of GPU cycles between completion of a batch and the client waking up from a call to wait-ioctl. With many concurrent waiters, with each on a different request, we observe that the wakeup latency before the patch scales nearly linearly with the number of waiters (before external factors kick in making the scaling much worse). After applying the patch, we can see that only the single waiter for the request is being woken up, providing a constant wakeup latency for every operation. However, the situation is not quite as rosy for many waiters on the same request, though to the best of my knowledge this is much less likely in practice. Here, we can observe that the concurrent waiters incur extra latency from being woken up by the solitary bottom-half, rather than directly by the interrupt. This appears to be scheduler induced (having discounted adverse effects from having a rbtree walk/erase in the wakeup path), each additional wake_up_process() costs approximately 1us on big core. Another effect of performing the secondary wakeups from the first bottom-half is the incurred delay this imposes on high priority threads - rather than immediately returning to userspace and leaving the interrupt handler to wake the others. To offset the delay incurred with additional waiters on a request, we could use a hybrid scheme that did a quick read in the interrupt handler and dequeued all the completed waiters (incurring the overhead in the interrupt handler, not the best plan either as we then incur GPU submission latency) but we would still have to wake up the bottom-half every time to do the heavyweight slow read. Or we could only kick the waiters on the seqno with the same priority as the current task (i.e. in the realtime waiter scenario, only it is woken up immediately by the interrupt and simply queues the next waiter before returning to userspace, minimising its delay at the expense of the chain, and also reducing contention on its scheduler runqueue). This is effective at avoid long pauses in the interrupt handler and at avoiding the extra latency in realtime/high-priority waiters. v2: Convert from a kworker per engine into a dedicated kthread for the bottom-half. v3: Rename request members and tweak comments. v4: Use a per-engine spinlock in the breadcrumbs bottom-half. v5: Fix race in locklessly checking waiter status and kicking the task on adding a new waiter. v6: Fix deciding when to force the timer to hide missing interrupts. v7: Move the bottom-half from the kthread to the first client process. v8: Reword a few comments v9: Break the busy loop when the interrupt is unmasked or has fired. v10: Comments, unnecessary churn, better debugging from Tvrtko v11: Wake all completed waiters on removing the current bottom-half to reduce the latency of waking up a herd of clients all waiting on the same request. v12: Rearrange missed-interrupt fault injection so that it works with igt/drv_missed_irq_hang v13: Rename intel_breadcrumb and friends to intel_wait in preparation for signal handling. v14: RCU commentary, assert_spin_locked v15: Hide BUG_ON behind the compiler; report on gem_latency findings. v16: Sort seqno-groups by priority so that first-waiter has the highest task priority (and so avoid priority inversion). v17: Add waiters to post-mortem GPU hang state. v18: Return early for a completed wait after acquiring the spinlock. Avoids adding ourselves to the tree if the is already complete, and skips the awkward question of why we don't do completion wakeups for waits earlier than or equal to ourselves. v19: Prepare for init_breadcrumbs to fail. Later patches may want to allocate during init, so be prepared to propagate back the error code. Testcase: igt/gem_concurrent_blit Testcase: igt/benchmarks/gem_latency Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: "Goel, Akash" <akash.goel@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18 Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
-
- 24 6月, 2016 2 次提交
-
-
由 Chris Wilson 提交于
To reclaim a bit of space from i915_drv.c, we can move the routines that just hook us into the PCI device tree into i915_pci.c Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-14-git-send-email-chris@chris-wilson.co.uk
-
由 Chris Wilson 提交于
i915_dma.c used to contain the DRI1/UMS horror show, but now all that remains are the out-of-place driver level interfaces (such as allocating, initialising and registering the driver). These should be in i915_drv.c alongside similar routines for suspend/resume. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466773227-7994-10-git-send-email-chris@chris-wilson.co.uk
-
- 18 6月, 2016 1 次提交
-
-
由 Zhi Wang 提交于
This patch introduces the very basic framework of GVT-g device model, includes basic prototypes, definitions, initialization. v12: - Call intel_gvt_init() in driver early initialization stage. (Chris) v8: - Remove the GVT idr and mutex in intel_gvt_host. (Joonas) v7: - Refine the URL link in Kconfig. (Joonas) - Refine the introduction of GVT-g host support in Kconfig. (Joonas) - Remove the macro GVT_ALIGN(), use round_down() instead. (Joonas) - Make "struct intel_gvt" a data member in struct drm_i915_private.(Joonas) - Remove {alloc, free}_gvt_device() - Rename intel_gvt_{create, destroy}_gvt_device() - Expost intel_gvt_init_host() - Remove the dummy "struct intel_gvt" declaration in intel_gvt.h (Joonas) v6: - Refine introduction in Kconfig. (Chris) - The exposed API functions will take struct intel_gvt * instead of void *. (Chris/Tvrtko) - Remove most memebers of strct intel_gvt_device_info. Will add them in the device model patches.(Chris) - Remove gvt_info() and gvt_err() in debug.h. (Chris) - Move GVT kernel parameter into i915_params. (Chris) - Remove include/drm/i915_gvt.h, as GVT-g will be built within i915. - Remove the redundant struct i915_gvt *, as the functions in i915 will directly take struct intel_gvt *. - Add more comments for reviewer. v5: Take Tvrtko's comments: - Fix the misspelled words in Kconfig - Let functions take drm_i915_private * instead of struct drm_device * - Remove redundant prints/local varible initialization v3: Take Joonas' comments: - Change file name i915_gvt.* to intel_gvt.* - Move GVT kernel parameter into intel_gvt.c - Remove redundant debug macros - Change error handling style - Add introductions for some stub functions - Introduce drm/i915_gvt.h. Take Kevin's comments: - Move GVT-g host/guest check into intel_vgt_balloon in i915_gem_gtt.c v2: - Introduce i915_gvt.c. It's necessary to introduce the stubs between i915 driver and GVT-g host, as GVT-g components is configurable in kernel config. When disabled, the stubs here do nothing. Take Joonas' comments: - Replace boolean return value with int. - Replace customized info/warn/debug macros with DRM macros. - Document all non-static functions like i915. - Remove empty and unused functions. - Replace magic number with marcos. - Set GVT-g in kernel config to "n" by default. Reviewed-by: NJoonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: NZhi Wang <zhi.a.wang@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-5-git-send-email-zhi.a.wang@intel.comSigned-off-by: NChris Wilson <chris@chris-wilson.co.uk>
-
- 17 5月, 2016 1 次提交
-
-
由 Jani Nikula 提交于
If the source of the backlight PWM is from the panel then the PWM can be controlled by DCS command, this patch adds the support to enable/disbale panel PWM, control backlight level etc... v2: Moving the CABC bkl functions to new file.(Jani) v3: Rebase v4: Rebase v5: Use mipi_dsi_dcs_write() instead of mipi_dsi_dcs_write_buffer() (Jani) Move DCS macro`s to include/video/mipi_display.h (Jani) v6: Rename the file to intel_dsi_panel_pwm.c Removing the CABC operations v7 by Jani: renames, rebases, etc. v8 by Jani: s/INTEL_BACKLIGHT_CABC/INTEL_BACKLIGHT_DSI_DCS/ v9 by Jani: rename init function to intel_dsi_dcs_init_backlight_funcs Cc: Jani Nikula <jani.nikula@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Yetunde Adebisi <yetundex.adebisi@intel.com> Signed-off-by: NDeepak M <m.deepak@intel.com> Reviewed-by: NYetunde Adebisi <yetundex.adebisi@intel.com> Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/71238a4b14b8c3a6c04070c789f09f1b4bc00a15.1461676337.git.jani.nikula@intel.com
-
- 29 4月, 2016 1 次提交
-
-
The code for programming voltage swing and emphasis was duplicated between DP and HDMI code. Move that to a new file, intel_dpio_phy.c. v2: Keep the "Use 800mV-0dB" comment in the HDMI code. (Ville) Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: NJim Bride <jim.bride@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-3-git-send-email-ander.conselvan.de.oliveira@intel.com
-
- 26 4月, 2016 1 次提交
-
-
由 Yetunde Adebisi 提交于
This patch adds support for eDP backlight control using DPCD registers to backlight hooks in intel_panel. It checks for backlight control over AUX channel capability and sets up function pointers to get and set the backlight brightness level if supported. v2: Moved backlight functions from intel_dp.c into a new file intel_dp_aux_backlight.c. Also moved reading of eDP display control registers to intel_dp_get_dpcd v3: Correct some formatting mistakes v4: Updated to use AUX backlight control if PWM control is not possible (Jani) v5: Moved call to initialize backlight registers to dp_aux_setup_backlight v6: Check DP_EDP_BACKLIGHT_PIN_ENABLE_CAP is disabled before setting up AUX backlight control. To fix BLM_PWM_ENABLE igt test warnings on bdw_ultra v7: Add enable_dpcd_backlight module parameter. v8: Rebase onto latest drm-intel-nightly branch v9: Remove changes to intel_dp_dpcd_read_wake Split addition edp_dpcd variable into a separate patch Cc: Bob Paauwe <bob.j.paauwe@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: NYetunde Adebisi <yetundex.adebisi@intel.com> [Jani: whitepace changes to appease checkpatch] Signed-off-by: NJani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459865452-9138-4-git-send-email-yetundex.adebisi@intel.com
-
- 14 4月, 2016 1 次提交
-
-
由 Chris Wilson 提交于
Our driver compiles clean (nowadays thanks to 0day) but for me, at least, it would be beneficial if the compiler threw an error rather than a warning when it found a piece of suspect code. (I use this to compile-check patch series and want to break on the first compiler error in order to fix the patch.) v2: Kick off a new "Debugging" submenu for i915.ko At this point, we applied it to the kernel and promptly kicked it out again as it broke buildbots (due to a compiler warning on 32bits): commit 908d759b Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue May 26 07:46:21 2015 +0200 Revert "drm/i915: Force clean compilation with -Werror" v3: Avoid enabling -Werror for allyesconfig/allmodconfig builds, using COMPILE_TEST as a suitable proxy suggested by Andrew Morton. (Damien) Only make the option available for EXPERT to reinforce that the option should not be casually enabled. Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
-
- 22 3月, 2016 1 次提交
-
-
由 Lionel Landwerlin 提交于
The moves a couple of functions programming the gamma LUT and CSC units into their own file. On generations prior to Haswell there is only a gamma LUT. From haswell on there is also a new enhanced color correction unit that isn't used yet. This is why we need to set the GAMMA_MODE register, either we're using the legacy 8bits LUT or enhanced LUTs (of 10 or 12bits). The CSC unit is only available from Haswell on. We also need to make a special case for CherryView which is recognized as a gen 8 but doesn't have the same enhanced color correction unit from Haswell on. v2: Fix access to GAMMA_MODE register on older generations than Haswell (from Matt Roper's comments) Signed-off-by: NLionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: NMatt Roper <matthew.d.roper@intel.com> Signed-off-by: NMatt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1458125837-2576-2-git-send-email-lionel.g.landwerlin@intel.com
-
- 09 3月, 2016 1 次提交
-
-
Create the new file intel_dpll_mgr.c and move the shared dpll code to it. Follow up patches that reorganize pll handling will move more code there and tweak the interface. No functional changes. Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1457451987-17466-2-git-send-email-ander.conselvan.de.oliveira@intel.com
-
- 05 11月, 2015 1 次提交
-
-
No functional changes, just moving code around. v2: Rebase Signed-off-by: NAnder Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: NSivakumar Thulasimani <sivakumar.thulasimani@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1445594525-7174-6-git-send-email-ander.conselvan.de.oliveira@intel.com
-
- 15 8月, 2015 2 次提交
-
-
由 Alex Dai 提交于
This adds the first of the data structures used to communicate with the GuC (the pool of guc_context structures). We create a GuC-specific wrapper round the GEM object allocator as all GEM objects shared with the GuC must be pinned into GGTT space at an address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT addresses is not accessible to the GuC (from the GuC's point of view, it's permanently reserved for other objects such as the BootROM & SRAM). Later, we will need to allocate additional GuC-sharable objects for the submission client(s) and the GuC's debug log. v2: Remove redundant initialisation [Chris Wilson] Defer adding struct members until needed [Chris Wilson] Local functions should pass dev_priv rather than dev [Chris Wilson] v5: Invalidate GuC TLB after allocating and pinning a new object v6: Rebased Issue: VIZ-4884 Signed-off-by: NAlex Dai <yu.dai@intel.com> Signed-off-by: NDave Gordon <david.s.gordon@intel.com> Reviewed-by: NTom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
由 Alex Dai 提交于
This fetches the required firmware image from the filesystem, then loads it into the GuC's memory via a dedicated DMA engine. This patch is derived from GuC loading work originally done by Vinit Azad and Ben Widawsky. v2: Various improvements per review comments by Chris Wilson v3: Removed 'wait' parameter to intel_guc_ucode_load() as firmware prefetch is no longer supported in the common firmware loader, per Daniel Vetter's request. Firmware checker callback fn now returns errno rather than bool. v4: Squash uC-independent code into GuC-specifc loader [Daniel Vetter] Don't keep the driver working (by falling back to execlist mode) if GuC firmware loading fails [Daniel Vetter] v5: Clarify WOPCM-related #defines [Tom O'Rourke] Delete obsolete code no longer required with current h/w & f/w [Tom O'Rourke] Move the call to intel_guc_ucode_init() later, so that it can allocate GEM objects, and have it fetch the firmware; then intel_guc_ucode_load() doesn't need to fetch it later. [Daniel Vetter]. v6: Update comment describing intel_guc_ucode_load() [Tom O'Rourke] Issue: VIZ-4884 Signed-off-by: NAlex Dai <yu.dai@intel.com> Signed-off-by: NDave Gordon <david.s.gordon@intel.com> Reviewed-by: NTom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-
- 11 8月, 2015 1 次提交
-
-
由 Daniel Vetter 提交于
Instead of our own duplicated one. This fixes a bug in the driver unload code if DRM_FBDEV_EMULATION=n but DRM_I915_FBDEV=y because we try to unregister the nonexistent fbdev drm_framebuffer. Cc: Archit Taneja <architt@codeaurora.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
-