- 3 July 2017, 1 commit
-
-
Committed by Gustavo Padovan

In some cases, like cursor updates, it is desirable to update the plane asynchronously to avoid big delays. The currently queued update could still be waiting for a fence to signal, blocking any subsequent update until its scanout. In cases like this, updating the cursor synchronously through the atomic API causes significant delays that are noticeable to the end user.

This patch creates a fast path that jumps ahead of the currently queued state and performs single-plane updates without going through all the atomic steps in drm_atomic_helper_commit(). We take this path for legacy cursor updates. For now only single-plane updates are supported, but we plan to support multi-plane updates and async page flips through this interface as well in the near future.

v6: move check code to drm_atomic_helper.c (Daniel Vetter)
v5: improve comments (Eric Anholt)
v4: fix state->crtc NULL check (Archit Taneja)
v3: fix iteration on the wrong crtc state; put back code to forbid updates if there is a queued update for the same plane (Ville Syrjälä); move size checks back to drivers (Ville Syrjälä); move ASYNC_UPDATE flag addition to its own patch (Ville Syrjälä)
v2: allow updates even if there is a queued update for the same plane; fixes to the documentation (Emil Velikov); unconditionally call ->atomic_async_update (Emil Velikov); check for ->atomic_async_update earlier (Daniel Vetter); make ->atomic_async_check() the last step (Daniel Vetter); add ASYNC_UPDATE flag (Eric Anholt); update state in core after ->atomic_async_update (Eric Anholt); update docs (Eric Anholt)

Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org> (v5)
Acked-by: Eric Anholt <eric@anholt.net> (v5)
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630180322.29007-2-gustavo@padovan.org
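A minimal sketch of the fast-path gating described above, modeled on the drm_atomic helpers (names and exact conditions here are assumptions, not the verbatim upstream check): the update must touch exactly one plane, keep it on its current CRTC, and the driver must provide the async hooks.

```c
#include <drm/drm_atomic.h>
#include <drm/drm_modeset_helper_vtables.h>

/* Return true if 'state' may take the async (cursor-style) fast path. */
static bool can_take_async_path(struct drm_atomic_state *state)
{
	struct drm_plane *plane;
	struct drm_plane_state *new_state;
	const struct drm_plane_helper_funcs *funcs;
	int i, n_planes = 0;

	for_each_new_plane_in_state(state, plane, new_state, i)
		n_planes++;

	if (n_planes != 1)
		return false;		/* single-plane updates only, for now */

	if (!new_state->crtc || new_state->crtc != plane->state->crtc)
		return false;		/* no enable/disable or CRTC change */

	funcs = plane->helper_private;
	if (!funcs || !funcs->atomic_async_update)
		return false;		/* driver doesn't support async updates */

	return true;			/* ->atomic_async_check() runs last */
}
```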
-
- 30 June 2017, 2 commits
-
-
Committed by Laurent Pinchart

The old state is useful for drivers that need to perform operations at enable time that depend on the transition between the old and new states. While at it, rename the operation to .atomic_enable() to be consistent with .atomic_disable(), as the .enable() operation is used by the atomic helpers only.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> # for sun4i
Acked-by: Philipp Zabel <p.zabel@pengutronix.de> # for imx-drm and mediatek
Acked-by: Alexey Brodkin <abrodkin@synopsys.com> # for arcpgu
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> # for atmel-hlcdc
Acked-by: Liviu Dudau <Liviu.Dudau@arm.com> # for hdlcd and mali-dp
Acked-by: Stefan Agner <stefan@agner.ch> # for fsl-dcu
Tested-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Philippe Cornu <philippe.cornu@st.com> # for stm
Acked-by: Vincent Abriou <vincent.abriou@st.com> # for sti
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> # for vmwgfx
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170630093646.7928-2-laurent.pinchart+renesas@ideasonboard.com
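A sketch of the resulting hook pair as the commit describes it (other members of drm_crtc_helper_funcs elided; treat the exact placement as an assumption):

```c
/* .atomic_enable() now receives the old CRTC state, mirroring
 * .atomic_disable(); drivers can inspect the old->new transition. */
struct drm_crtc_helper_funcs_sketch {
	void (*atomic_enable)(struct drm_crtc *crtc,
			      struct drm_crtc_state *old_crtc_state);
	void (*atomic_disable)(struct drm_crtc *crtc,
			       struct drm_crtc_state *old_crtc_state);
};
```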
-
Committed by Chris Wilson

We often have the task of comparing two seqnos known to be on the same context, so provide a common __dma_fence_is_later().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170629125930.821-1-chris@chris-wilson.co.uk
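Once both seqnos are known to share a timeline, "later" reduces to a wrap-safe signed comparison. A sketch of the helper's likely shape (dma-fence seqnos were 32-bit at the time):

```c
/* True if f1 is a later seqno than f2 on the same fence context.
 * The signed cast keeps the comparison correct across u32 wraparound. */
static inline bool __dma_fence_is_later(u32 f1, u32 f2)
{
	return (int)(f1 - f2) > 0;
}
```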
-
- 28 June 2017, 1 commit
-
-
Committed by Daniel Vetter

There's no reason for drivers to call this, and all the ones I've removed looked very fishy:

- Proper quiescing of the vblank machinery should be done by calling drm_crtc_vblank_off(), which is best done by shutting down the entire display engine with drm_atomic_helper_shutdown.
- Releasing of allocated memory is done by the core already; it calls drm_vblank_cleanup as a fallback.
- drm_vblank_cleanup also has checks for drivers which forget to clean up vblank interrupts.

This essentially reverts

  commit e77cef9c
  Author: Jerome Glisse <jglisse@redhat.com>
  Date:   Thu Jan 7 15:39:13 2010 +0100

      drm: Avoid calling vblank function is vblank wasn't initialized

which was done to fix a bug in radeon code with msi interrupts:

  commit 003e69f9
  Author: Jerome Glisse <jglisse@redhat.com>
  Date:   Thu Jan 7 15:39:14 2010 +0100

      drm/radeon/kms: Don't try to enable IRQ if we have no handler installed

Afaict from digging around in old code, this was needed to avoid blowing up in the ums fallback, and has stopped serving its purpose long ago: if irq init fails, the driver fails to load, and there's really no way to blow up anymore.

Long story short, this was most likely a small ums compat/fallback hack that became a thing of its own and got cargo-cult duplicated all over the drm codebase for essentially no gain at all.

v2: Mention that for drivers with a ->release callback cleanup is handled by drm_dev_fini() (Thierry).

Cc: Thierry Reding <treding@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170626161949.25629-2-daniel.vetter@ffwll.ch
-
- 24 June 2017, 1 commit
-
-
Committed by Tejun Heo

Commit bf5eb3de ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") made slub sysfs file removals synchronous to kmem_cache shutdown. Unfortunately, this created a possible ABBA deadlock between slab_mutex and the sysfs draining mechanism, triggering the following lockdep warning.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  4.10.0-test+ #48 Not tainted
  -------------------------------------------------------
  rmmod/1211 is trying to acquire lock:
   (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40

  but task is already holding lock:
   (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #1 (slab_mutex){+.+.+.}:
         lock_acquire+0xf6/0x1f0
         __mutex_lock+0x75/0x950
         mutex_lock_nested+0x1b/0x20
         slab_attr_store+0x75/0xd0
         sysfs_kf_write+0x45/0x60
         kernfs_fop_write+0x13c/0x1c0
         __vfs_write+0x28/0x120
         vfs_write+0xc8/0x1e0
         SyS_write+0x49/0xa0
         entry_SYSCALL_64_fastpath+0x1f/0xc2

  -> #0 (s_active#120){++++.+}:
         __lock_acquire+0x10ed/0x1260
         lock_acquire+0xf6/0x1f0
         __kernfs_remove+0x254/0x320
         kernfs_remove+0x23/0x40
         sysfs_remove_dir+0x51/0x80
         kobject_del+0x18/0x50
         __kmem_cache_shutdown+0x3e6/0x460
         kmem_cache_destroy+0x1fb/0x2d0
         kvm_exit+0x2d/0x80 [kvm]
         vmx_exit+0x19/0xa1b [kvm_intel]
         SyS_delete_module+0x198/0x1f0
         entry_SYSCALL_64_fastpath+0x1f/0xc2

  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(slab_mutex);
                                 lock(s_active#120);
                                 lock(slab_mutex);
    lock(s_active#120);

   *** DEADLOCK ***

  2 locks held by rmmod/1211:
   #0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80
   #1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0

  stack backtrace:
  CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48
  Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  Call Trace:
   print_circular_bug+0x1be/0x210
   __lock_acquire+0x10ed/0x1260
   lock_acquire+0xf6/0x1f0
   __kernfs_remove+0x254/0x320
   kernfs_remove+0x23/0x40
   sysfs_remove_dir+0x51/0x80
   kobject_del+0x18/0x50
   __kmem_cache_shutdown+0x3e6/0x460
   kmem_cache_destroy+0x1fb/0x2d0
   kvm_exit+0x2d/0x80 [kvm]
   vmx_exit+0x19/0xa1b [kvm_intel]
   SyS_delete_module+0x198/0x1f0
   ? SyS_delete_module+0x5/0x1f0
   entry_SYSCALL_64_fastpath+0x1f/0xc2

It'd be the cleanest to deal with the issue by removing the sysfs files without holding slab_mutex before the rest of shutdown; however, given the current code structure, it is pretty difficult to do so. This patch punts sysfs file removal to a work item. Before commit bf5eb3de, the removal was punted to an RCU delayed work item which is executed after release. Now, we're punting to a different work item on shutdown, which still maintains the goal of removing the sysfs files earlier when destroying kmem_caches.

Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org
Fixes: bf5eb3de ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
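The shape of the fix, sketched with a self-contained stand-in for the relevant kmem_cache members (the upstream field and function names may differ): sysfs teardown is queued to a work item so it never runs under slab_mutex.

```c
#include <linux/kobject.h>
#include <linux/workqueue.h>

/* Minimal stand-in for the sysfs-related members of struct kmem_cache. */
struct cache_sketch {
	struct kobject kobj;
	struct work_struct kobj_remove_work;
};

static void sysfs_slab_remove_workfn(struct work_struct *work)
{
	struct cache_sketch *s =
		container_of(work, struct cache_sketch, kobj_remove_work);

	/* Runs from a workqueue, so no slab_mutex is held here: draining
	 * the sysfs files can no longer deadlock with kmem_cache_destroy(). */
	kobject_del(&s->kobj);
	kobject_put(&s->kobj);
}

static void sysfs_slab_remove_sketch(struct cache_sketch *s)
{
	/* Called during shutdown while slab_mutex is held: just punt. */
	INIT_WORK(&s->kobj_remove_work, sysfs_slab_remove_workfn);
	schedule_work(&s->kobj_remove_work);
}
```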
-
- 23 June 2017, 1 commit
-
-
Committed by Gerd Hoffmann

Drop them from u64 fields and tag local variables correctly instead. While at it, switch the code to use u64_to_user_ptr().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620113916.6967-2-kraxel@redhat.com
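The pattern in question: ioctl structs carry user pointers as plain u64 fields (no __user annotation on the u64 itself), and the conversion happens at the point of use via u64_to_user_ptr(). A sketch with a hypothetical params struct:

```c
#include <linux/kernel.h>	/* u64_to_user_ptr() */
#include <linux/types.h>
#include <linux/uaccess.h>

struct my_ioctl_params {	/* hypothetical uapi struct */
	__u64 bo_handles;	/* user pointer carried as a plain u64 */
	__u32 count;
};

static int fetch_handles(const struct my_ioctl_params *params, u32 *buf)
{
	/* Tag the local variable, not the u64 field, as __user. */
	u32 __user *uptr = u64_to_user_ptr(params->bo_handles);

	if (copy_from_user(buf, uptr, params->count * sizeof(*buf)))
		return -EFAULT;
	return 0;
}
```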
-
- 22 June 2017, 3 commits
-
-
Committed by Boris Brezillon

Add a helper to wait for all page flips of an atomic state to be done.

v2: pimp kerneldoc as discussed with Boris on irc; add missing doc for @dev; use old_state for consistency with wait_for_vblanks

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> (v1)
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1496392332-8722-2-git-send-email-boris.brezillon@free-electrons.com
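Going by the upstream name, drm_atomic_helper_wait_for_flip_done() (an assumption; the changelog does not spell it out), the helper slots into a driver's commit tail roughly like this:

```c
#include <drm/drm_atomic_helper.h>

static void my_atomic_commit_tail(struct drm_atomic_state *old_state)
{
	struct drm_device *dev = old_state->dev;

	drm_atomic_helper_commit_modeset_disables(dev, old_state);
	drm_atomic_helper_commit_planes(dev, old_state, 0);
	drm_atomic_helper_commit_modeset_enables(dev, old_state);
	drm_atomic_helper_commit_hw_done(old_state);

	/* Block until every CRTC touched by old_state has signalled its
	 * page flip, instead of waiting for a full vblank. */
	drm_atomic_helper_wait_for_flip_done(dev, old_state);

	drm_atomic_helper_cleanup_planes(dev, old_state);
}
```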
-
Committed by Jarkko Nikula

Commit f406270b ("ACPI / scan: Set the visited flag for all enumerated devices") caused two groups of special SPI and I2C devices to no longer enumerate. SPI and I2C devices are expected to be enumerated by the SPI and I2C subsystems, but the change caused acpi_bus_attach() to mark those devices with acpi_device_set_enumerated().

The first group of devices is matched using the Device Tree compatible property with the special _HID "PRP0001". Those devices have a matched scan handler, acpi_scan_attach_handler() returns 1, and acpi_bus_attach() marks them with acpi_device_set_enumerated().

The second group of devices, without a valid _HID such as "LNXVIDEO", have device->pnp.type.platform_id set to zero, and the change again marks them with acpi_device_set_enumerated().

Fix this by flagging the SPI and I2C devices at struct acpi_device object initialization time and letting the code in acpi_bus_attach() go through the device_attach() and acpi_default_enumeration() path for all SPI and I2C devices.

Fixes: f406270b (ACPI / scan: Set the visited flag for all enumerated devices)
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: 4.11+ <stable@vger.kernel.org> # 4.11+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
Committed by Jens Axboe

If we have shared tags enabled, then every IO completion will trigger a full loop over every queue belonging to a tag set, and every hardware queue for each of those queues, even if nothing needs to be done. This causes a massive performance regression if you have a lot of shared devices.

Instead of doing this huge full scan on every IO, add an atomic counter to the main queue that tracks how many hardware queues have been marked as needing a restart. With that, we can avoid looking for restartable queues if we don't have to.

Max reports that this restores performance. Before this patch, 4K IOPS was limited to 22-23K IOPS. With the patch, we are running at 950-970K IOPS.

Fixes: 6d8c6c0f ("blk-mq: Restart a single queue if tag sets are shared")
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
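A sketch of the counting scheme (the queue field name is an assumption based on the upstream fix): the mark path keeps the counter in sync with the per-hctx restart bit, and the completion path bails out early when the counter is zero.

```c
#include <linux/atomic.h>
#include <linux/blk-mq.h>
#include <linux/blkdev.h>

/* Flagging a hctx for restart bumps the queue-wide counter exactly once. */
static void mark_restart(struct request_queue *q, struct blk_mq_hw_ctx *hctx)
{
	if (!test_and_set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
		atomic_inc(&q->shared_hctx_restart);
}

static void restart_shared_queues(struct request_queue *q)
{
	/* Fast path: nobody asked for a restart, so skip the
	 * O(queues x hctxs) walk that used to run on every completion. */
	if (!atomic_read(&q->shared_hctx_restart))
		return;

	/* ... walk all queues in the tag set and kick flagged hctxs ... */
}
```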
-
- 20 June 2017, 8 commits
-
-
Committed by Liviu Dudau

drm_fbdev_cma_set_suspend{,_unlocked} use an integer parameter to describe whether the intended state is a suspend or a resume, then pass the value to drm_fb_helper_set_suspend{,_unlocked}, which uses a boolean. Switch to using bool everywhere.

Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170620102320.8849-1-Liviu.Dudau@arm.com
-
Committed by John Stultz

Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone unnoticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest.

This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable <stable@vger.kernel.org> # 4.8+
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
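The underlying technique, in miniature: accumulate in shifted fixed-point so the fractional nanoseconds carry across intervals instead of being truncated each time (the source of the 1ns discontinuity). Names here are illustrative, not the timekeeping core's own.

```c
#include <linux/types.h>

#define NSEC_PER_SEC_ULL 1000000000ULL

struct raw_accum {
	u64 sec;
	u64 xtime_nsec;	/* nanoseconds << shift; fraction lives here */
	u32 shift;
};

static void raw_accumulate(struct raw_accum *a, u64 delta_cycles, u32 mult)
{
	/* Add shifted nanoseconds; the low 'shift' bits keep the fraction. */
	a->xtime_nsec += delta_cycles * mult;

	while (a->xtime_nsec >= (NSEC_PER_SEC_ULL << a->shift)) {
		a->xtime_nsec -= (NSEC_PER_SEC_ULL << a->shift);
		a->sec++;
	}
}
```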
-
Committed by John Stultz

In tests which exercise switching of clocksources, a NULL pointer dereference can be observed on ARM64 platforms in the clocksource read() function:

  u64 clocksource_mmio_readl_down(struct clocksource *c)
  {
	return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask;
  }

This is called from the core timekeeping code via:

  cycle_now = tkr->read(tkr->clock);

tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed, tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence, the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere, including NULL.

This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence:

  now = tk->clock->read(tk->clock);

Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issuing the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in.

Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather than just a dummy read function.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: stable <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Daniel Mentz <danielmentz@google.com>
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
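The heart of the fix, sketched as the changelog describes it: load the clocksource pointer once, then call through the local copy, so a concurrent clocksource switch can never pair the old read() with the new clock. A stand-in struct is used here since the real tk_read_base has more members.

```c
#include <linux/clocksource.h>

/* tk_read_base as the changelog describes it after the patch: the
 * cached 'read' pointer is gone; only the clocksource pointer remains. */
struct tk_read_base_sketch {
	struct clocksource *clock;
	/* ... mult, shift, cycle_last, ... */
};

static inline u64 tk_clock_read(const struct tk_read_base_sketch *tkr)
{
	/* Single load: 'clock' cannot change between fetching the read()
	 * function pointer and passing it its own argument. */
	struct clocksource *clock = READ_ONCE(tkr->clock);

	return clock->read(clock);
}
```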
-
Committed by Daniel Vetter

I spotted a markup issue, plus added the descriptions in drm_driver, plus a few more links while at it. I'm still mildly unhappy with the split between fops and ioctls, but I still think having the ioctls in the uapi chapter makes more sense. Oh well ...

v2: rebase
v3: move misplaced hunk to the right patch

Cc: Stefan Agner <stefan@agner.ch>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170531092045.3950-1-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

The magic switching between a proper pci driver and shadow-attach isn't useful anymore since there are no ums+kms drivers left. Let's split this up properly, calling pci_register_driver for kms drivers and renaming the shadow-attach init to drm_legacy_pci_init/exit.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-6-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

The only special case is pci devices, and we can easily handle this in the core. Do so and drop a pile of boilerplate from drivers.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-5-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

We use the drm_crtc_ prefix for all the new-style vblank functions which directly take a struct drm_crtc *. drm_accurate_vblank_count was the odd one out; correct this to appease my OCD.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-13-daniel.vetter@ffwll.ch
-
Committed by Daniel Vetter

Unify and review everything, plus make sure it's all correct markup. Drop the kernel-doc for internal functions. Also rework the overview section; it's become rather outdated. Unfortunately the kernel-doc in drm_driver isn't rendered yet, but that will change as soon as drm_driver is kernel-docified properly. Also document properly that drm_vblank_cleanup is optional; the core calls this already.

v2: Make it clear that cleanup happens in drm_dev_fini for drivers with their own ->release callback (Thierry).

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170524145212.27837-11-daniel.vetter@ffwll.ch
-
- 19 June 2017, 1 commit
-
-
Committed by Hugh Dickins

The stack guard page is a useful feature to reduce the risk of the stack smashing into a different mapping. We have been using a single-page gap, which is sufficient to prevent having the stack adjacent to a different mapping. But this seems to be insufficient in light of the stack usage in userspace. E.g. glibc uses alloca() allocations as large as 64kB in many commonly used functions. Others use constructs like gid_t buffer[NGROUPS_MAX], which is 256kB, or stack strings with MAX_ARG_STRLEN.

This will become especially dangerous for suid binaries and the default no-limit stack size limit, because those applications can be tricked into consuming a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunately.

Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size, because systems with larger base pages might cap stack allocations in PAGE_SIZE units), which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix, because the problem is somewhat inherent, but it should reduce the attack space a lot.

One could argue that the gap size should be configurable from userspace, but that can be done later, when somebody finds that the new 1MB is wrong for some special-case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units).

Implementation-wise, first delete all the old code for the stack guard page: although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace. Case in point: a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode.

Instead of keeping the gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap, mainly arch_get_unmapped_area(), and the vma tree's subtree_gap support for that.

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
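A sketch of the gap accounting this describes, modeled on the upstream helpers: the guard gap lives between vmas, and the few places that must respect it ask for the gap-adjusted start instead of the raw vm_start.

```c
#include <linux/mm.h>

/* Default gap: 256 pages, i.e. 1MB with 4k pages; overridable via the
 * stack_guard_gap= command line option. */
static unsigned long stack_guard_gap = 256UL << PAGE_SHIFT;

static inline unsigned long vm_start_gap_sketch(struct vm_area_struct *vma)
{
	unsigned long vm_start = vma->vm_start;

	if (vma->vm_flags & VM_GROWSDOWN) {
		vm_start -= stack_guard_gap;
		if (vm_start > vma->vm_start)	/* underflow past 0: clamp */
			vm_start = 0;
	}
	return vm_start;
}
```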
-
- 17 June 2017, 1 commit
-
-
Committed by Dave Airlie

This creates a new command submission chunk for amdgpu to add in and out sync objects around the submission. Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new chunks, one for syncobj pre-submission dependencies and one for post-submission sync obj signalling, and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD, with input from Christian König on what things should look like. In theory VkFences could be backed with sync objects and just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose it to userspace.
TODO: update to dep_sync when rebasing onto amdgpu master. (with this - r-b from Christian)

v1.1: keep file reference on import
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer
v3: make more robust against CS failures; we now add the wait sems but only remove them once the CS job has been submitted
v4: rewrite names of API and base on new syncobj code
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences in post deps stage (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
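In rough outline (struct and chunk names assumed from the upstream uapi; handle values are placeholders), userspace attaches arrays of syncobj handles as extra CS chunks:

```c
#include <stdint.h>

/* Per the upstream uapi, each chunk payload is an array of these. */
struct drm_amdgpu_cs_chunk_sem {
	uint32_t handle;	/* drm syncobj handle */
};

/* Example payloads for a submission that waits on one syncobj and
 * signals another:
 *   chunk_id AMDGPU_CHUNK_ID_SYNCOBJ_IN  -> in_deps (waited on first)
 *   chunk_id AMDGPU_CHUNK_ID_SYNCOBJ_OUT -> out_deps (signalled after) */
static const struct drm_amdgpu_cs_chunk_sem in_deps[]  = { { .handle = 3 } };
static const struct drm_amdgpu_cs_chunk_sem out_deps[] = { { .handle = 4 } };
```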
-
- 16 June 2017, 5 commits
-
-
Committed by Chris Wilson

Currently, the last object in the execlist is always the batch. However, when building the batch buffer we often know the batch object first, and if we can use the first slot in the execlist we can emit relocation instructions relative to it immediately and avoid a separate pass to adjust the relocations to point to the last execlist slot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
-
Committed by Jordan Crouse

Modify the 'pad' member of struct drm_msm_gem_info to 'flags'. If the user sets 'flags' to non-zero, it means that they want an IOVA for the GEM object instead of a mmap() offset. Return the iova in the 'offset' member.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[robclark: s/hint/flags in commit msg]
Signed-off-by: Rob Clark <robdclark@gmail.com>
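A hypothetical userspace call, assuming the flag name the msm uapi uses for this (MSM_INFO_IOVA) and libdrm's drmCommandWriteRead() wrapper:

```c
#include <stdint.h>
#include <xf86drm.h>
#include "msm_drm.h"	/* struct drm_msm_gem_info, MSM_INFO_IOVA */

static int get_bo_iova(int fd, uint32_t handle, uint64_t *iova)
{
	struct drm_msm_gem_info req = {
		.handle = handle,
		.flags  = MSM_INFO_IOVA,	/* non-zero: ask for the IOVA */
	};
	int ret = drmCommandWriteRead(fd, DRM_MSM_GEM_INFO,
				      &req, sizeof(req));

	if (ret == 0)
		*iova = req.offset;	/* IOVA returned in 'offset' */
	return ret;
}
```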
-
Committed by Jordan Crouse

The ioctl array is sparsely populated, but the compiler will make sure that it is sufficiently sized for all the values that we have, so we can safely use ARRAY_SIZE() instead of having a constantly changing #define in the uapi header.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
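The pattern being adopted, sketched with one representative entry (the handler stub is illustrative): designated initializers may leave gaps in the array, and ARRAY_SIZE() sizes it at compile time, so no hand-maintained count is needed.

```c
#include <drm/drmP.h>

static int msm_ioctl_get_param(struct drm_device *dev, void *data,
			       struct drm_file *file);	/* example handler */

static const struct drm_ioctl_desc msm_ioctls[] = {
	DRM_IOCTL_DEF_DRV(MSM_GET_PARAM, msm_ioctl_get_param,
			  DRM_AUTH | DRM_RENDER_ALLOW),
	/* ... remaining entries, possibly sparse ... */
};

static struct drm_driver msm_driver = {
	/* ... */
	.ioctls     = msm_ioctls,
	.num_ioctls = ARRAY_SIZE(msm_ioctls),	/* replaces the uapi #define */
};
```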
-
Committed by Eric Anholt

This allows mesa to set the tiling format for a BO and have that tiling format be respected by mesa on the other side of an import/export (and by vc4 scanout in the kernel), without defining a protocol to pass the tiling through userspace.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net
Acked-by: Dave Airlie <airlied@redhat.com>
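Hypothetical userspace usage, assuming the vc4 uapi names (struct drm_vc4_set_tiling, DRM_IOCTL_VC4_SET_TILING) and the Broadcom T-tiling modifier from drm_fourcc.h:

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm_fourcc.h>	/* DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED */
#include "vc4_drm.h"	/* struct drm_vc4_set_tiling */

static int bo_set_t_tiled(int fd, uint32_t handle)
{
	struct drm_vc4_set_tiling set = {
		.handle   = handle,
		.modifier = DRM_FORMAT_MOD_BROADCOM_VC4_T_TILED,
	};

	/* The kernel remembers the modifier, so the importing side of a
	 * PRIME export can query it back instead of guessing the layout. */
	return ioctl(fd, DRM_IOCTL_VC4_SET_TILING, &set);
}
```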
-
Committed by Eric Anholt

The T tiling format is what V3D uses for textures, with no raster support at all until later revisions of the hardware (and always at a large 3D performance penalty). If we can't scan out V3D's format, then we often need to do a relayout at some stage of the pipeline: either right before texturing from the scanout buffer (common in X11 without a compositor) or between a tiled screen buffer right before scanout (an option I've considered in trying to resolve this inconsistency, but which means needing to use the dirty fb ioctl and having some update policy).

T-format scanout lets us avoid either of those shadow copies, for a massive, obvious performance improvement to X11 window dragging without a compositor. Unfortunately, enabling a compositor to work around the discrepancy has turned out to be too costly in memory consumption for the Raspbian distribution.

Because the HVS operates a scanline at a time, compositing from T does increase the memory bandwidth cost of scanout. On my 1920x1080@32bpp display on a RPi3, we go from about 15% of system memory bandwidth with linear to about 20% with tiled. However, for X11 this still ends up being a huge performance win in active usage.

This patch doesn't yet handle src_x/src_y offsetting within the tiled buffer. However, we fail to do so for untiled buffers already.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
-
- 15 June 2017, 12 commits
-
-
Committed by Mikko Perttunen

This is largely a rewrite of the Host1x channel allocation code, bringing several changes:

- The previous code could deadlock due to an interaction between the 'reflock' mutex and CDMA timeout handling. This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186.
- General refactoring, including better encapsulation of channel ownership handling into channel.c.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

The arguments of .is_addr_reg() are swapped in the definition of the function, which is quite confusing.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

Several channels could be made to write the same unit concurrently via the SETCLASS opcode, and trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add per-unit locking by inserting MLOCKs into the command stream to re-allow the SETCLASS opcode, but that will be much more work. Let's forbid the unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Dmitry Osipenko

The waitchecks, along with multiple syncpoints per submit, are not ready for use yet; let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Thierry Reding

Improve the kerneldoc for the public parts of the host1x infrastructure in preparation for adding a driver-specific part to the GPU documentation.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
-
Committed by Andy Lutomirski

Currently they return -1 on error, which will confuse callers if they try to interpret it as a normal negative error code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
-
Committed by Dawid Kurek

Forward declarations in C are great, but I'm pretty sure one is enough.

Signed-off-by: Dawid Kurek <dawikur@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170614213518.GA3554@gmail.com
-
Committed by Robert Bragg

Enables access to OA unit metrics for BDW, CHV, SKL and BXT, which all share (more or less) the same OA unit design.

Of particular note in comparison to Haswell: some OA unit HW config state has become per-context state, and as a consequence it is somewhat more complicated to manage synchronous state changes from the cpu while there's no guarantee of what context (if any) is currently actively running on the gpu.

The periodic sampling frequency, which can be particularly useful for system-wide analysis (as opposed to command stream synchronised MI_REPORT_PERF_COUNT commands), is perhaps the most surprising state to have become per-context save and restored (while the OABUFFER destination is still a shared, system-wide resource).

This support for gen8+ takes care to consider a number of timing challenges involved in synchronously updating per-context state, primarily by programming all config state from the cpu and updating all current and saved contexts synchronously while the OA unit is still disabled.

The driver intentionally avoids depending on command streamer programming to update OA state, considering the lack of synchronization between the automatic loading of OACTXCONTROL state (which includes the periodic sampling state and enable state) on context restore and the parsing of any general purpose BB the driver can control. I.e. this implementation is careful to avoid the possibility of a context restore temporarily enabling any out-of-date periodic sampling state. In addition to the risk of transiently-out-of-date state being loaded automatically, there are also internal HW latencies involved in the loading of MUX configurations which would be difficult to account for from the command streamer (and we only want to enable the unit once the MUX configuration is complete).

Since the Gen8+ OA unit design no longer supports clock gating the unit off for a single given context (which effectively stopped any progress of counters while any other context was running), and instead supports tagging OA reports with a context ID for filtering on the CPU, it means we can no longer hide the system-wide progress of counters from a non-privileged application only interested in metrics for its own context. Although we could theoretically try and subtract the progress of other contexts before forwarding reports via read(), we aren't in a position to filter reports captured via MI_REPORT_PERF_COUNT commands. As a result, for Gen8+, we always require dev.i915.perf_stream_paranoid to be unset for any access to OA metrics if not root.

v5: drain submitted requests when enabling metric set to ensure no lite-restore erases the context image we just updated (Lionel)
v6: in addition to drain, switch to kernel context & update all contexts in place (Chris)
v7: add missing mutex_unlock() if switching to kernel context fails (Matthew)
v8: simplify OA period/flex-eu-counters programming by using the batchbuffer instead of modifying ctx-image (Lionel)
v9: back to updating the context image (due to erroneous testing, batchbuffer programming the OA unit doesn't actually work) (Lionel); pin context before updating context image (Chris); drop MMIO programming now that we switch to a kernel context with right values in initial context image (Chris)
v10: just pin_map the contexts we want to modify or let the configuration happen on first use (Chris)
v11: update kernel context OA config through the batchbuffer rather than on-the-fly ctx-image update (Lionel)
v12: rework OA context registers update again by switching away from user contexts and reconfiguring the kernel context through the batchbuffer and updating all the other contexts' context image; also take care to lock slice/subslice configuration when OA is on (Lionel)
v13: request rpcs updates on all engines when updating the OA config (Lionel)
v14: drop any kind of rpcs management now that we monitor sseu configuration changes in a later patch (Lionel); remove usleep after programming the NOA configs on Gen8+, this doesn't seem to be needed (Lionel)
v15: respect coding style for block comments (Chris)
v16: add missing i915_add_request() in case we fail to emit OA configuration (Matthew)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
-
Committed by Robert Bragg

Assuming a uniform mask across all slices, this enables userspace to determine which specific subslices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW subslice configuration.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
-
Committed by Robert Bragg

Enables userspace to determine the maximum number of slices that can be enabled on the device and also know which specific slices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW slice configuration.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
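Together with the subslice-mask commit above, this lands as ordinary getparams. A hypothetical userspace query, assuming the parameter names are I915_PARAM_SLICE_MASK and I915_PARAM_SUBSLICE_MASK:

```c
#include <sys/ioctl.h>
#include <i915_drm.h>	/* drm_i915_getparam_t, DRM_IOCTL_I915_GETPARAM */

static int query_sseu_masks(int fd, int *slice_mask, int *subslice_mask)
{
	drm_i915_getparam_t gp = {
		.param = I915_PARAM_SLICE_MASK,		/* bit per slice */
		.value = slice_mask,
	};

	if (ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
		return -1;

	gp.param = I915_PARAM_SUBSLICE_MASK;	/* uniform across slices */
	gp.value = subslice_mask;
	return ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
}
```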
-
Committed by Bart Van Assche

Avoid that the following complaint is reported:

  BUG: sleeping function called from invalid context at kernel/workqueue.c:2790
  in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3
  1 lock held by rcuop/3/41:
   #0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500
  Call Trace:
   dump_stack+0x86/0xcf
   ___might_sleep+0x174/0x260
   __might_sleep+0x4a/0x80
   flush_work+0x7e/0x2e0
   __cancel_work_timer+0x143/0x1c0
   cancel_work_sync+0x10/0x20
   blk_throtl_exit+0x25/0x60
   blkcg_exit_queue+0x35/0x40
   blk_release_queue+0x42/0x130
   kobject_put+0xa9/0x190

This happens since we invoke callbacks that need to block from the queue release handler. Fix this by pushing the final release to a workqueue.

Reported-by: Ross Zwisler <zwisler@gmail.com>
Fixes: commit b425e504 ("block: Avoid that blk_exit_rl() triggers a use-after-free")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>

Updated changelog
Signed-off-by: Jens Axboe <axboe@fb.com>
-
Committed by Magnus Damm

Update the ->ndo_change_mtu() callback comment to remove the text about returning an error in case of an undefined callback. This change makes the comment match the existing code behavior.

Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 14 June 2017, 3 commits
-
-
Committed by Johannes Berg

Unfortunately, struct iwreq isn't a proper subset of struct ifreq, but is still handled by the same code path. Robert reported that applications may then (randomly) fault if the struct iwreq they pass happens to land within 8 bytes of the end of a mapping (the struct is only 32 bytes, vs. struct ifreq's 40 bytes). To fix this, pull out the code handling wireless extension ioctls and copy only the smaller structure in this case.

This bug goes back a long time; I tracked that it was introduced into mainline in 2.1.15, over 20 years ago!

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=195869

Reported-by: Robert O'Callahan <robert@ocallahan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
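The crux of the fix, sketched (helper names here are assumptions): for ioctls in the wireless-extension range, copy only sizeof(struct iwreq) from userspace, so the kernel never reads the 8 trailing bytes that exist in struct ifreq but not in the caller's smaller struct.

```c
#include <linux/uaccess.h>
#include <linux/wireless.h>	/* struct iwreq */

struct net;

/* Assumed post-fix signature: the wext handler takes the already
 * copied-in struct iwreq instead of re-copying a struct ifreq. */
int wext_handle_ioctl(struct net *net, struct iwreq *iwr,
		      unsigned int cmd, void __user *arg);

static int dev_wext_ioctl(struct net *net, unsigned int cmd,
			  void __user *arg)
{
	struct iwreq iwr;

	/* 32 bytes, not struct ifreq's 40: never read past the end of
	 * the user's buffer. */
	if (copy_from_user(&iwr, arg, sizeof(iwr)))
		return -EFAULT;

	return wext_handle_ioctl(net, &iwr, cmd, arg);
}
```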
-
Committed by Dave Airlie

This interface allows importing the fence from a sync_file into an existing drm sync object, or exporting the fence attached to an existing drm sync object into a new sync_file object. This should only be used to interact with sync files where necessary.

v1.1: fence put fixes (Chris); drop fence from ioctl names (Chris); fixup for new fence replace API

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
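A hypothetical round-trip from userspace, assuming the upstream uapi names (struct drm_syncobj_handle plus the EXPORT_SYNC_FILE/IMPORT_SYNC_FILE flags):

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <drm.h>	/* struct drm_syncobj_handle, DRM_IOCTL_SYNCOBJ_* */

/* Export the fence currently attached to 'syncobj' as a sync_file fd. */
static int syncobj_export_sync_file(int fd, uint32_t syncobj, int *sync_fd)
{
	struct drm_syncobj_handle args = {
		.handle = syncobj,
		.flags  = DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE,
	};
	int ret = ioctl(fd, DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD, &args);

	if (ret == 0)
		*sync_fd = args.fd;
	return ret;
}

/* Replace 'syncobj's fence with the one carried by a sync_file fd. */
static int syncobj_import_sync_file(int fd, uint32_t syncobj, int sync_fd)
{
	struct drm_syncobj_handle args = {
		.handle = syncobj,
		.fd     = sync_fd,
		.flags  = DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE,
	};

	return ioctl(fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &args);
}
```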
-
Committed by Dave Airlie

Sync objects are a new toplevel drm object that contains a pointer to a fence. This fence can be updated via command submission ioctls by drivers. There is also a generic wait obj API modelled on the vulkan wait API (with code modelled on some amdgpu code). These objects can be converted to an opaque fd that can be passed between processes.

v2: rename reference/unreference to put/get (Chris); fix leaked reference (David Zhou); drop mutex in favour of cmpxchg (Chris)
v3: cleanups from danvet; rebase on drm_fops rename; check fd_flags is 0 in ioctls
v4: export find/free; change replace fence to take a syncobj, in order to support lookup-first, replace-later semantics, which seem in the end to be cleaner

Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-
- 12 June 2017, 1 commit
-
-
Committed by Lv Zheng

Consider this case:

1. A program opens a sysfs table file 65535 times; it can increase validation_count, and the first increment causes the table to be mapped: validation_count = 65535
2. AML execution causes "Load" to be executed on the same table; this time it cannot increase validation_count, so validation_count remains: validation_count = 65535
3. The program closes the sysfs table file 65535 times; it can decrease validation_count, and the last decrement causes the table to be unmapped: validation_count = 0
4. AML code is still accessing the loaded table; a kernel crash can be observed.

To prevent that from happening, add a validation_count threshold. When it is reached, validation_count can no longer be incremented or decremented to invalidate the table descriptor (meaning table unmappings are prevented).

Note that the code added in acpi_tb_put_table() is actually a no-op, but it changes the warning message into a "warn once" one.

Lv Zheng.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
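The saturating-counter idea, sketched with stand-in names (threshold name and value are assumptions): once the count pins at the maximum, neither get nor put moves it, so a table that ever reaches the threshold can never be unmapped out from under a still-loaded user.

```c
#include <linux/types.h>

#define MAX_TABLE_VALIDATIONS	0xFFFF	/* assumed saturation threshold */

struct table_desc_sketch {
	u16 validation_count;
	/* ... mapping state ... */
};

static void table_get_ref(struct table_desc_sketch *desc)
{
	if (desc->validation_count < MAX_TABLE_VALIDATIONS)
		desc->validation_count++;	/* saturates, never wraps */
}

static bool table_put_ref(struct table_desc_sketch *desc)
{
	/* A saturated count is pinned: skip the decrement entirely, so
	 * the table can never be invalidated (unmapped) again. */
	if (desc->validation_count >= MAX_TABLE_VALIDATIONS)
		return false;

	return --desc->validation_count == 0;	/* true: safe to unmap */
}
```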
-