- 21 7月, 2020 5 次提交
-
-
由 Christian König 提交于
Implementing those is completely unnecessary. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: NMadhav Chauhan <madhav.chauhan@amd.com> Link: https://patchwork.freedesktop.org/patch/378236/
-
由 Daniel Vetter 提交于
Comes up every few years, gets somewhat tedious to discuss, let's write this down once and for all. What I'm not sure about is whether the text should be more explicit in flat out mandating the amdkfd eviction fences for long running compute workloads or workloads where userspace fencing is allowed. v2: Now with dot graph! v3: Typo (Dave Airlie) Reviewed-by: NThomas Hellstrom <thomas.hellstrom@intel.com> Acked-by: NJason Ekstrand <jason@jlekstrand.net> Acked-by: NChristian König <christian.koenig@amd.com> Acked-by: NDaniel Stone <daniels@collabora.com> Acked-by: NDave Airlie <airlied@redhat.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jesse Natalie <jenatali@microsoft.com> Cc: Steve Pronovost <spronovo@microsoft.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200709123339.547390-1-daniel.vetter@ffwll.ch
-
由 Daniel Vetter 提交于
Two in one go: - it is allowed to call dma_fence_wait() while holding a dma_resv_lock(). This is fundamental to how eviction works with ttm, so required. - it is allowed to call dma_fence_wait() from memory reclaim contexts, specifically from shrinker callbacks (which i915 does), and from mmu notifier callbacks (which amdgpu does, and which i915 sometimes also does, and probably always should, but that's kinda a debate). Also for stuff like HMM we really need to be able to do this, or things get real dicey. Consequence is that any critical path necessary to get to a dma_fence_signal for a fence must never a) call dma_resv_lock nor b) allocate memory with GFP_KERNEL. Also by implication of dma_resv_lock(), no userspace faulting allowed. That's some supremely obnoxious limitations, which is why we need to sprinkle the right annotations to all relevant paths. The one big locking context we're leaving out here is mmu notifiers, added in commit 23b68395 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Aug 26 22:14:21 2019 +0200 mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end that one covers a lot of other callsites, and it's also allowed to wait on dma-fences from mmu notifiers. But there's no ready-made functions exposed to prime this, so I've left it out for now. v2: Also track against mmu notifier context. v3: kerneldoc to spec the cross-driver contract. Note that currently i915 throws in a hard-coded 10s timeout on foreign fences (not sure why that was done, but it's there), which is why that rule is worded with SHOULD instead of MUST. Also some of the mmu_notifier/shrinker rules might surprise SoC drivers, I haven't fully audited them all. Which is infeasible anyway, we'll need to run them with lockdep and dma-fence annotations and see what goes boom. v4: A spelling fix from Mika v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately this means lockdep enforcement is slightly inconsistent, it won't spot GFP_NOIO and GFP_NOFS allocations in the wrong spot if CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well. v5: Note that only drivers/gpu has a reasonable (or at least historical) excuse to use dma_fence_wait() from shrinker and mmu notifier callbacks. Everyone else should either have a better memory manager model, or better hardware. This reflects discussions with Jason Gunthorpe. Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: kernel test robot <lkp@intel.com> Acked-by: NChristian König <christian.koenig@amd.com> Acked-by: NDave Airlie <airlied@redhat.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-3-daniel.vetter@ffwll.ch
-
由 Daniel Vetter 提交于
Design is similar to the lockdep annotations for workers, but with some twists: - We use a read-lock for the execution/worker/completion side, so that this explicit annotation can be more liberally sprinkled around. With read locks lockdep isn't going to complain if the read-side isn't nested the same way under all circumstances, so ABBA deadlocks are ok. Which they are, since this is an annotation only. - We're using non-recursive lockdep read lock mode, since in recursive read lock mode lockdep does not catch read side hazards. And we _very_ much want read side hazards to be caught. For full details of this limitation see commit e9149858 Author: Peter Zijlstra <peterz@infradead.org> Date: Wed Aug 23 13:13:11 2017 +0200 locking/lockdep/selftests: Add mixed read-write ABBA tests - To allow nesting of the read-side explicit annotations we explicitly keep track of the nesting. lock_is_held() allows us to do that. - The wait-side annotation is a write lock, and entirely done within dma_fence_wait() for everyone by default. - To be able to freely annotate helper functions I want to make it ok to call dma_fence_begin/end_signalling from soft/hardirq context. First attempt was using the hardirq locking context for the write side in lockdep, but this forces all normal spinlocks nested within dma_fence_begin/end_signalling to be spinlocks. That bollocks. The approach now is to simple check in_atomic(), and for these cases entirely rely on the might_sleep() check in dma_fence_wait(). That will catch any wrong nesting against spinlocks from soft/hardirq contexts. The idea here is that every code path that's critical for eventually signalling a dma_fence should be annotated with dma_fence_begin/end_signalling. The annotation ideally starts right after a dma_fence is published (added to a dma_resv, exposed as a sync_file fd, attached to a drm_syncobj fd, or anything else that makes the dma_fence visible to other kernel threads), up to and including the dma_fence_wait(). Examples are irq handlers, the scheduler rt threads, the tail of execbuf (after the corresponding fences are visible), any workers that end up signalling dma_fences and really anything else. Not annotated should be code paths that only complete fences opportunistically as the gpu progresses, like e.g. shrinker/eviction code. The main class of deadlocks this is supposed to catch are: Thread A: mutex_lock(A); mutex_unlock(A); dma_fence_signal(); Thread B: mutex_lock(A); dma_fence_wait(); mutex_unlock(A); Thread B is blocked on A signalling the fence, but A never gets around to that because it cannot acquire the lock A. Note that dma_fence_wait() is allowed to be nested within dma_fence_begin/end_signalling sections. To allow this to happen the read lock needs to be upgraded to a write lock, which means that any other lock is acquired between the dma_fence_begin_signalling() call and the call to dma_fence_wait(), and still held, this will result in an immediate lockdep complaint. The only other option would be to not annotate such calls, defeating the point. Therefore these annotations cannot be sprinkled over the code entirely mindless to avoid false positives. Originally I hope that the cross-release lockdep extensions would alleviate the need for explicit annotations: https://lwn.net/Articles/709849/ But there's a few reasons why that's not an option: - It's not happening in upstream, since it got reverted due to too many false positives: commit e966eaee Author: Ingo Molnar <mingo@kernel.org> Date: Tue Dec 12 12:31:16 2017 +0100 locking/lockdep: Remove the cross-release locking checks This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. - cross-release uses the complete() call to annotate the end of critical sections, for dma_fence that would be dma_fence_signal(). But we do not want all dma_fence_signal() calls to be treated as critical, since many are opportunistic cleanup of gpu requests. If these get stuck there's still the main completion interrupt and workers who can unblock everyone. Automatically annotating all dma_fence_signal() calls would hence cause false positives. - cross-release had some educated guesses for when a critical section starts, like fresh syscall or fresh work callback. This would again cause false positives without explicit annotations, since for dma_fence the critical sections only starts when we publish a fence. - Furthermore there can be cases where a thread never does a dma_fence_signal, but is still critical for reaching completion of fences. One example would be a scheduler kthread which picks up jobs and pushes them into hardware, where the interrupt handler or another completion thread calls dma_fence_signal(). But if the scheduler thread hangs, then all the fences hang, hence we need to manually annotate it. cross-release aimed to solve this by chaining cross-release dependencies, but the dependency from scheduler thread to the completion interrupt handler goes through hw where cross-release code can't observe it. In short, without manual annotations and careful review of the start and end of critical sections, cross-relese dependency tracking doesn't work. We need explicit annotations. v2: handle soft/hardirq ctx better against write side and dont forget EXPORT_SYMBOL, drivers can't use this otherwise. v3: Kerneldoc. v4: Some spelling fixes from Mika v5: Amend commit message to explain in detail why cross-release isn't the solution. v6: Pull out misplaced .rst hunk. Acked-by: NChristian König <christian.koenig@amd.com> Acked-by: NDave Airlie <airlied@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: NThomas Hellström <thomas.hellstrom@intel.com> Reviewed-by: NMaarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: NDaniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-2-daniel.vetter@ffwll.ch
-
由 Christian König 提交于
The helper doesn't expose any not-mapable memory resources. Signed-off-by: NChristian König <christian.koenig@amd.com> Reviewed-by: NThomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/377649/
-
- 20 7月, 2020 12 次提交
-
-
由 Alexander A. Klimov 提交于
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: NAlexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200719203714.61745-1-grandmaster@al2klimov.de
-
由 Alexander A. Klimov 提交于
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: NAlexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200719171428.60470-1-grandmaster@al2klimov.de
-
由 Uwe Kleine-König 提交于
flags is unused since the driver was introduced in commit 45d59d70 ("drm: Add new driver for MXSFB controller"). Signed-off-by: NUwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: NStefan Agner <stefan@agner.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200716174139.16602-1-u.kleine-koenig@pengutronix.de
-
由 Guido Günther 提交于
In contrast to other display controllers on imx like DCSS and ipuv3 lcdif/mxsfb does not support detiling e.g. vivante tiled layouts. Since mesa might assume otherwise make it explicit that only DRM_FORMAT_MOD_LINEAR is supported. Signed-off-by: NGuido Günther <agx@sigxcpu.org> Reviewed-by: NLucas Stach <l.stach@pengutronix.de> Signed-off-by: NStefan Agner <stefan@agner.ch> Link: https://patchwork.freedesktop.org/patch/msgid/26877532e272c12a74c33188e2a72abafc9a2e1c.1584973664.git.agx@sigxcpu.org
-
由 Suraj Upadhyay 提交于
Convert device logging with dev_* functions into drm_* functions. The patch has been generated with the coccinelle script below. The script focuses on instances of dev_* functions where the drm device context is clearly visible in its arguments. @@expression E1; expression list E2; @@ -dev_warn(E1->dev, E2) +drm_warn(E1, E2) @@expression E1; expression list E2; @@ -dev_info(E1->dev, E2) +drm_info(E1, E2) @@expression E1; expression list E2; @@ -dev_err(E1->dev, E2) +drm_err(E1, E2) @@expression E1; expression list E2; @@ -dev_info_once(E1->dev, E2) +drm_info_once(E1, E2) @@expression E1; expression list E2; @@ -dev_notice_once(E1->dev, E2) +drm_notice_once(E1, E2) @@expression E1; expression list E2; @@ -dev_warn_once(E1->dev, E2) +drm_warn_once(E1, E2) @@expression E1; expression list E2; @@ -dev_err_once(E1->dev, E2) +drm_err_once(E1, E2) @@expression E1; expression list E2; @@ -dev_err_ratelimited(E1->dev, E2) +drm_err_ratelimited(E1, E2) @@expression E1; expression list E2; @@ -dev_dbg(E1->dev, E2) +drm_dbg(E1, E2) Signed-off-by: NSuraj Upadhyay <usuraj35@gmail.com> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200718150955.GA23103@blackclown
-
由 Christophe JAILLET 提交于
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'i810_dma_initialize()' GFP_KERNEL can be used because its only caller, 'i810_dma_init()', already use it and no lock is taken in the between. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: NChristophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200718072822.339064-1-christophe.jaillet@wanadoo.fr
-
由 Thomas Zimmermann 提交于
Cleaning up ast's MM code with ast_mm_fini() resets the write-combine flags on the VRAM I/O memory. Drop ast_mm_fini() in favor of an auto- release callback. Releasing the device also executes the callback. Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-7-tzimmermann@suse.de
-
由 Thomas Zimmermann 提交于
Posting the GPU requires the correct DRAM type to be stored in struct ast_private. Therefore first initialize the DRAM info and then post the GPU. This restores the original order of instructions in this function. Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org> Fixes: bad09da6 ("drm/ast: Fixed vram size incorrect issue on POWER") Cc: Joel Stanley <joel@jms.id.au> Cc: Y.C. Chen <yc_chen@aspeedtech.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: "Y.C. Chen" <yc_chen@aspeedtech.com> Cc: <stable@vger.kernel.org> # v4.11+ Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-6-tzimmermann@suse.de
-
由 Thomas Zimmermann 提交于
VRAM size detection is only relevant to the memory management. Move the code into ast_mm.c. While at it, rename the function to ast_get_vram_size(). The function argument's type is now struct ast_private. The result is stored in a local variable and not in struct ast_private any longer. Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-5-tzimmermann@suse.de
-
由 Thomas Zimmermann 提交于
As a first step to managed MM code in ast, switch over VRAM MM helpers. v2: * updated to use drmm_vram_helper_init() Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-4-tzimmermann@suse.de
-
由 Thomas Zimmermann 提交于
Although built upon TTM, the ast driver's VRAM MM helper does not expose TTM to its users. Rename the related ast file to ast_mm.c. Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-3-tzimmermann@suse.de
-
由 Thomas Zimmermann 提交于
Calling drmm_vram_helper_init() sets up a managed instance of VRAM MM. Releasing the DRM device also frees the memory manager. The patch also updates the DRM documentation for VRAM helpers. The tutorial now describes the new managed interface. The old interfaces are deprecated and should not be used in new code. v2: * rename init function to drmm_vram_helper_init() * return errno code from init function; caller does not need vram_mm anyway * update documentation and remove docs for deprecated un-managed functions Signed-off-by: NThomas Zimmermann <tzimmermann@suse.de> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-2-tzimmermann@suse.de
-
- 19 7月, 2020 1 次提交
-
-
由 Paul Cercueil 提交于
Silence compiler warning about used but uninitialized 'ipu_state' variable. In practice, the variable would never be used when uninitialized, but the compiler cannot know that 'priv->ipu_plane' will always be NULL if CONFIG_INGENIC_IPU is disabled. Silence the warning by initializing the value to NULL. Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200719093834.14084-1-paul@crapouillou.net
-
- 17 7月, 2020 22 次提交
-
-
由 Paul Cercueil 提交于
Bump version to 1.1 and set date to 2020-07-16. v3: New patch Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-12-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Support multiple panels or bridges connected to the same DPI output of the SoC. This setup can be found for instance on the GCW Zero, where the same DPI output interfaces the internal 320x240 TFT panel, and the ITE IT6610 HDMI chip. v2: No change v3: Allow > 80-char lines where it makes sense Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-11-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Add support for the Image Processing Unit (IPU) found in all Ingenic SoCs. The IPU can upscale and downscale a source frame of arbitrary size ranging from 4x4 to 4096x4096 on newer SoCs, with bicubic filtering on newer SoCs, bilinear filtering on older SoCs. Nearest-neighbour can also be obtained with proper coefficients. Starting from the JZ4725B, the IPU supports a mode where its output is sent directly to the LCDC, without having to be written to RAM first. This makes it possible to use the IPU as a DRM plane on the compatible SoCs, and have it convert and scale anything the userspace asks for to what's available for the display. Regarding pixel formats, older SoCs support packed YUV 4:2:2 and various planar YUV formats. Newer SoCs introduced support for RGB. Since the IPU is a separate hardware block, to make it work properly the Ingenic DRM driver will now register itself as a component master in case the IPU driver has been enabled in the config. When enabled in the config, the CRTC will see the IPU as a second primary plane. It cannot be enabled at the same time as the regular primary plane. It has the same priority, which means that it will also display below the overlay plane. v2: - ingenic-ipu is no longer its own module. It will be built into the ingenic-drm module. - If enabled in the config, both the core driver and the IPU driver will register as components; otherwise the core driver will bypass that and call the ingenic_drm_bind() function directly. - Since both files now build into the same module, the symbols previously exported as GPL are not exported anymore, since they are only used internally. - Fix SPDX license header in ingenic-ipu.h - Avoid using 'for(;;);' loops without trailing statement(s) v3: - Pass priv structure to IRQ handler; that way we don't hardcode the expectation that the IPU plane is at index #0. - Rework osd_changed() to account for src_* changes - Add multiplanar YUV 4:4:4 support - Commit fb addresses to HW at vblank, since addr registers are not shadow registers - Probe IPU component later so that IPU plane is last - Fix driver not working on IPU-less hardware - Use IPU driver's name as the IRQ name to avoid having two 'ingenic-drm' in /proc/interrupts - Fix IPU only working for still images on JZ4725B - Add a bit more code comments Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-10-paul@crapouillou.net
-
由 Paul Cercueil 提交于
All Ingenic SoCs starting from the JZ4725B support OSD mode. In this mode, two separate planes can be used. They can have different positions and sizes, and one can be overlayed on top of the other. v2: Use fallthrough; instead of /* fall-through */ v3: - Add custom atomic_tail function to handle case where HW gives no VBLANK - Use regmap_set_bits() / regmap_clear_bits() when possible - Use dma_hwdesc_f{0,1} fields in priv structure instead of array - Use dmam_alloc_coherent() instead of dma_alloc_coherent() - Use more meaningful 0xf0 / 0xf1 values as DMA descriptors IDs - Add a bit more code comments Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-9-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Use dmam_alloc_coherent() instead of dma_alloc_coherent(). Then we don't need to register a custom cleanup handler. v3: New patch Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-8-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Move the register definitions to ingenic-drm.h, to keep ingenic-drm-drv.c tidy. v2: Fix SPDX license tag v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-7-paul@crapouillou.net
-
由 Paul Cercueil 提交于
The address of the DMA descriptor never changes. It can therefore be set in the probe function. v2-v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-6-paul@crapouillou.net
-
由 Paul Cercueil 提交于
If you pass a string that is not terminated with a carriage return to dev_err(), it will eventually be printed with a carriage return, but not right away, since the kernel will wait for a pr_cont(). v2: New patch v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-5-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Full rename without any modification, except to the Makefile. Renaming ingenic-drm.c to ingenic-drm-drv.c allow to decouple the module name from the source file name in the Makefile. This will be useful later when more source files are added. v2: New patch v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-4-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Add documentation of the Device Tree bindings for the Image Processing Unit (IPU) found in most Ingenic SoCs. v2: Add missing 'const' in items list v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NRob Herring <robh@kernel.org> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-3-paul@crapouillou.net
-
由 Paul Cercueil 提交于
Convert the ingenic,lcd.txt to a new ingenic,lcd.yaml file. In the process, the new ingenic,jz4780-lcd compatible string has been added. v2: Add info about IPU at port@8 v3: No change Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NRob Herring <robh@kernel.org> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-2-paul@crapouillou.net
-
由 Paul Cercueil 提交于
plane->index is NOT the index of the color plane in a YUV frame. Actually, a YUV frame is represented by a single drm_plane, even though it contains three Y, U, V planes. v2-v3: No change Cc: stable@vger.kernel.org # v5.3 Fixes: 90b86fcc ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs") Signed-off-by: NPaul Cercueil <paul@crapouillou.net> Reviewed-by: NSam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-1-paul@crapouillou.net
-
由 Lyude Paul 提交于
While I had thought I'd tested this before, it looks like this one issue slipped by my original CRC patches. Basically, there seem to be a few rules we need to follow when sending CRC commands to the display controller: * CRCs cannot be both disabled and enabled for a single head in the same flush * If a head with CRC reporting enabled switches from one OR to another, there must be a flush before the OR is re-enabled regardless of the final state of CRC reporting. So, split nv50_crc_atomic_prepare_notifier_contexts() into two functions: * nv_crc_atomic_release_notifier_contexts() - checks whether the CRC notifier contexts were released successfully after the first flush * nv_crc_atomic_init_notifier_contexts() - prepares any CRC notifier contexts for use before enabling reporting Additionally, in order to force a flush when we re-assign ORs with heads that have CRCs enabled we split our atomic check function into two: * nv50_crc_atomic_check_head() - called from our heads' atomic checks, determines whether a state needs to set or clear CRC reporting * nv50_crc_atomic_check_outp() - called at the end of the atomic check after all ORs have been added to the atomic state, and sets nv50_atom->flush_disable if needed Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <skeggsb@gmail.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200629223635.103804-1-lyude@redhat.com
-
由 Lyude Paul 提交于
This introduces support for CRC readback on gf119+, using the documentation generously provided to us by Nvidia: https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed through a single set of "outp" sources: outp-active/auto for a CRC of the scanout region, outp-complete for a CRC of both the scanout and blanking/sync region combined, and outp-inactive for a CRC of only the blanking/sync region. For each source, nouveau selects the appropriate tap point based on the output path in use. We also expose an "rg" source, which allows for capturing CRCs of the scanout raster before it's encoded into a video signal in the output path. This tap point is referred to as the raster generator. Note that while there's some other neat features that can be used with CRC capture on nvidia hardware, like capturing from two CRC sources simultaneously, I couldn't see any usecase for them and did not implement them. Nvidia only allows for accessing CRCs through a shared DMA region that we program through the core EVO/NvDisplay channel which is referred to as the notifier context. The notifier context is limited to either 255 (for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and unfortunately the hardware simply drops CRCs and reports an overflow once all available entries in the notifier context are filled. Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit on how many CRCs can be captured, we work around this in nouveau by allocating two separate notifier contexts for each head instead of one. We schedule a vblank worker ahead of time so that once we start getting close to filling up all of the available entries in the notifier context, we can swap the currently used notifier context out with another pre-prepared notifier context in a manner similar to page flipping. Unfortunately, the hardware only allows us to this by flushing two separate updates on the core channel: one to release the current notifier context handle, and one to program the next notifier context's handle. When the hardware processes the first update, the CRC for the current frame is lost. However, the second update can be flushed immediately without waiting for the first to complete so that CRC generation resumes on the next frame. According to Nvidia's hardware engineers, there isn't any cleaner way of flipping notifier contexts that would avoid this. Since using vblank workers to swap out the notifier context will ensure we can usually flush both updates to hardware within the timespan of a single frame, we can also ensure that there will only be exactly one frame lost between the first and second update being executed by the hardware. This gives us the guarantee that we're always correctly matching each CRC entry with it's respective frame even after a context flip. And since IGT will retrieve the CRC entry for a frame by waiting until it receives a CRC for any subsequent frames, this doesn't cause an issue with any tests and is much simpler than trying to change the current DRM API to accommodate. In order to facilitate testing of correct handling of this limitation, we also expose a debugfs interface to manually control the threshold for when we start trying to flip the notifier context. We will use this in igt to trigger a context flip for testing purposes without needing to wait for the notifier to completely fill up. This threshold is reset to the default value set by nouveau after each capture, and is exposed in a separate folder within each CRTC's debugfs directory labelled "nv_crc". Changes since v1: * Forgot to finish saving crc.h before saving, whoops. This just adds some corrections to the empty function declarations that we use if CONFIG_DEBUG_FS isn't enabled. Changes since v2: * Don't check return code from debugfs_create_dir() or debugfs_create_file() - Greg K-H Changes since v3: (no functional changes) * Fix SPDX license identifiers (checkpatch) * s/uint32_t/u32/ (checkpatch) * Fix indenting in switch cases (checkpatch) Changes since v4: * Remove unneeded param changes with nv50_head_flush_clr/set * Rebase Changes since v5: * Remove set but unused variable (outp) in nv50_crc_atomic_check() - Kbuild bot Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
-
由 Lyude Paul 提交于
While most of the functionality on Nvidia GPUs doesn't require using an explicit handle instead of the main VRAM handle + offset, there are a couple of places that do require explicit handles, such as CRC functionality. Since this means we're about to add another nouveau-chosen handle, let's just go ahead and move any hard-coded handles into a single header. This is just to keep things slightly organized, and to make it a little bit easier if we need to add more handles in the future. This patch should contain no functional changes. Changes since v3: * Correct SPDX license identifier (checkpatch) Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-9-lyude@redhat.com
-
由 Lyude Paul 提交于
In order to make sure that we flush disable updates at the right time when disabling CRCs, we'll need to be able to look at the outp state to see if we're changing it at the same time that we're disabling CRCs. So, expose the struct in disp.h. Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-8-lyude@redhat.com
-
由 Lyude Paul 提交于
While we're not quite ready yet to add support for flexible wndw mappings, we are going to need to at least keep track of the static wndw mappings we're currently using in each head's atomic state. We'll likely use this in the future to implement real flexible window mapping, but the primary reason we'll need this is for CRC support. See: on nvidia hardware, each CRC entry in the CRC notifier dma context has a "tag". This tag corresponds to the nth update on a specific EVO/NvDisplay channel, which itself is referred to as the "controlling channel". For gf119+ this can be the core channel, ovly channel, or base channel. Since we don't expose CRC entry tags to userspace, we simply ignore this feature and always use the core channel as the controlling channel. Simple. Things get a little bit more complicated on gv100+ though. GV100+ only lets us set the controlling channel to a specific wndw channel, and that wndw must be owned by the head that we're grabbing CRCs when we enable CRC generation. Thus, we always need to make sure that each atomic head state has at least one wndw that is mapped to the head, which will be used as the controlling channel. Note that since we don't have flexible wndw mappings yet, we don't expect to run into any scenarios yet where we'd have a head with no mapped wndws. When we do add support for flexible wndw mappings however, we'll need to make sure that we handle reprogramming CRC capture if our controlling wndw is moved to another head (and potentially reject the new head state entirely if we can't find another available wndw to replace it). With that being said, nouveau currently tracks wndw visibility on heads. It does not keep track of the actual ownership mappings, which are (currently) statically programmed. To fix this, we introduce another bitmask into nv50_head_atom.wndw to keep track of ownership separately from visibility. We then introduce a nv50_head callback to handle populating the wndw ownership map, and call it during the atomic check phase when core->assign_windows is set to true. Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-7-lyude@redhat.com
-
由 Lyude Paul 提交于
While we expose the ability to turn off hardware dithering for nouveau, we actually make the mistake of turning it on anyway, due to dithering_depth containing a non-zero value if our dithering depth isn't also set to 6 bpc. So, fix it by never enabling dithering when it's disabled. Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-6-lyude@redhat.com
-
由 Lyude Paul 提交于
Currently, we modify the depth value stored in the atomic state when performing a commit in order to workaround the fact we haven't implemented support for depths higher then 10 yet. This isn't idempotent though, as it will happen every atomic commit where we modify the OR state even if the head's depth in the atomic state hasn't been modified. Normally this wouldn't matter, since we don't modify OR state outside of modesets, but since the CRC capture region is implemented as part of the OR state in hardware we'll want to make sure all commits modifying OR state are idempotent so as to avoid changing the depth unexpectedly. So, fix this by simply not writing the reduced depth value we come up with to the atomic state. Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NBen Skeggs <bskeggs@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-5-lyude@redhat.com
-
由 Lyude Paul 提交于
Add some kind of vblank workers. The interface is similar to regular delayed works, and is mostly based off kthread_work. It allows for scheduling delayed works that execute once a particular vblank sequence has passed. It also allows for accurate flushing of scheduled vblank works - in that flushing waits for both the vblank sequence and job execution to complete, or for the work to get cancelled - whichever comes first. Whatever hardware programming we do in the work must be fast (must at least complete during the vblank or scanout period, sometimes during the first few scanlines of the vblank). As such we use a high-priority per-CRTC thread to accomplish this. Changes since v7: * Stuff drm_vblank_internal.h and drm_vblank_work_internal.h contents into drm_internal.h * Get rid of unnecessary spinlock in drm_crtc_vblank_on() * Remove !vblank->worker check * Grab vbl_lock in drm_vblank_work_schedule() * Mention self-rearming work items in drm_vblank_work_schedule() kdocs * Return 1 from drm_vblank_work_schedule() if the work was scheduled successfully, 0 or error code otherwise * Use drm_dbg_core() instead of DRM_DEV_ERROR() in drm_vblank_work_schedule() * Remove vblank->worker checks in drm_vblank_destroy_worker() and drm_vblank_flush_worker() Changes since v6: * Get rid of ->pending and seqcounts, and implement flushing through simpler means - danvet * Get rid of work_lock, just use drm_device->event_lock * Move drm_vblank_work item cleanup into drm_crtc_vblank_off() so that we ensure that all vblank work has finished before disabling vblanks * Add checks into drm_crtc_vblank_reset() so we yell if it gets called while there's vblank workers active * Grab event_lock in both drm_crtc_vblank_on()/drm_crtc_vblank_off(), the main reason for this is so that other threads calling drm_vblank_work_schedule() are blocked from attempting to schedule while we're in the middle of enabling/disabling vblanks. * Move drm_handle_vblank_works() call below drm_handle_vblank_events() * Simplify drm_vblank_work_cancel_sync() * Fix drm_vblank_work_cancel_sync() documentation * Move wake_up_all() calls out of spinlock where we can. The only one I left was the call to wake_up_all() in drm_vblank_handle_works() as this seemed like it made more sense just living in that function (which is all technically under lock) * Move drm_vblank_work related functions into their own source files * Add drm_vblank_internal.h so we can export some functions we don't want drivers using, but that we do need to use in drm_vblank_work.c * Add a bunch of documentation Changes since v4: * Get rid of kthread interfaces we tried adding and move all of the locking into drm_vblank.c. For implementing drm_vblank_work_flush(), we now use a wait_queue and sequence counters in order to differentiate between multiple work item executions. * Get rid of drm_vblank_work_cancel() - this would have been pretty difficult to actually reimplement and it occurred to me that neither nouveau or i915 are even planning to use this function. Since there's also no async cancel function for most of the work interfaces in the kernel, it seems a bit unnecessary anyway. * Get rid of to_drm_vblank_work() since we now are also able to just pass the struct drm_vblank_work to work item callbacks anyway Changes since v3: * Use our own spinlocks, don't integrate so tightly with kthread_works Changes since v2: * Use kthread_workers instead of reinventing the wheel. Cc: Tejun Heo <tj@kernel.org> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Reviewed-by: NDaniel Vetter <daniel.vetter@ffwll.ch> Co-developed-by: NVille Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: NLyude Paul <lyude@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-4-lyude@redhat.com
-
由 Lyude Paul 提交于
This got me confused for a bit while looking over this code: I had been planning on adding some blocking function calls into this function, but seeing the irqsave/irqrestore variants of spin_(un)lock() didn't make it very clear whether or not that would actually be safe. So I went ahead and reviewed every single driver in the kernel that uses this function, and they all fall into three categories: * Driver probe code * ->atomic_disable() callbacks * Legacy modesetting callbacks All of these will be guaranteed to have IRQs enabled, which means it's perfectly safe to block here. Just to make things a little less confusing to others in the future, let's switch over to spin_lock_irq()/spin_unlock_irq() to make that fact a little more obvious. Signed-off-by: NLyude Paul <lyude@redhat.com> Reviewed-by: NDaniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-3-lyude@redhat.com
-
由 Lyude Paul 提交于
Since we'll be allocating resources for kthread_create_worker() in the next commit (which could fail and require us to clean up the mess), let's simplify the cleanup process a bit by registering a drm_vblank_init_release() action for each drm_vblank_crtc so they're still cleaned up if we fail to initialize one of them. Changes since v3: * Use drmm_add_action_or_reset() - Daniel Vetter Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Reviewed-by: NDaniel Vetter <daniel@ffwll.ch> Signed-off-by: NLyude Paul <lyude@redhat.com> Acked-by: NDave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-2-lyude@redhat.com
-