- 08 May 2018, 1 commit
-
-
Submitted by Christoph Hellwig:
Most mainstream architectures use 65536 entries, so let's standardize on that. If someone is really desperate to override it, that can still be done through <asm/dma-mapping.h>, but I'd rather see a really good rationale for that. dma_debug_init is now called as a core_initcall, which for many architectures means much earlier, and provides dma-debug functionality earlier in the boot process. This should be safe, as it only relies on the memory allocator already being available.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
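A minimal sketch of the shape this leaves behind (names assumed from the description above, not copied from the patch): the per-arch entry count collapses to one common default, and registration happens at core_initcall time.

    #include <linux/dma-debug.h>
    #include <linux/init.h>

    /* Common default; a desperate arch can still override this
     * through <asm/dma-mapping.h>. */
    #define PREALLOC_DMA_DEBUG_ENTRIES 65536

    static int __init dma_debug_do_init(void)
    {
            /* Needs only the memory allocator, so running this
             * early, at core_initcall time, is safe. */
            dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES);
            return 0;
    }
    core_initcall(dma_debug_do_init);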
-
- 12 April 2018, 1 commit
-
-
Submitted by Joonsoo Kim:
The CMA area is now managed by a separate zone, ZONE_MOVABLE, to fix many MM-related problems. In this implementation, if CONFIG_HIGHMEM=y, ZONE_MOVABLE is considered HIGHMEM, and the memory of the CMA area is therefore also considered HIGHMEM, that is, pages without a direct mapping. However, the CMA area could be in lowmem, where the memory does have a direct mapping. On ARM, when establishing a new mapping for DMA, the direct mapping should be cleared, since two mappings with different cache policies could cause unpredictable problems. With this change, PageHighMem() returns true for CMA memory located in lowmem, so the DMA mapping code can no longer tell whether it needs to clear the direct mapping. To handle this situation, always clear the direct mapping for such CMA memory.

Link: http://lkml.kernel.org/r/1512114786-5085-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 28 September 2017, 3 commits
-
-
Submitted by Vladimir Murzin:
There are no users of init_dma_coherent_pool_size() left due to 387870f2 ("mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls"), so remove it.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
-
Submitted by Vladimir Murzin:
atomic_pool is set up once during the init stage and never changed after that, so it is a good candidate for __ro_after_init. While we are here, mark atomic_pool_size with __initdata.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
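A minimal sketch of the annotations being described (the initial pool size here is illustrative, not taken from the patch):

    /* written once during init, then effectively read-only */
    static struct gen_pool *atomic_pool __ro_after_init;

    /* only consumed by init code, so discardable after boot */
    static size_t atomic_pool_size __initdata = SZ_256K;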
-
Submitted by Vladimir Murzin:
gen_pool_first_fit_order_align() does not make use of additional data, so pass plain NULL there.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
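For context, a hedged sketch of the call site this refers to: the last argument of gen_pool_set_algo() is per-algorithm data that this algorithm ignores.

    /* select the allocation algorithm for the atomic pool; the
     * third argument is per-algorithm data, unused here */
    gen_pool_set_algo(atomic_pool,
                      gen_pool_first_fit_order_align,
                      NULL);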
-
- 20 July 2017, 1 commit
-
-
Submitted by Vladimir Murzin:
Christoph noticed [1] that the default DMA pool in its current form overloads the DMA coherent infrastructure. In reply, Robin suggested [2] splitting the per-device and global pool interfaces, so that allocation/release from the default DMA pool is driven by the dma ops implementation. This patch implements Robin's idea and provides an interface to allocate/release/mmap the default (aka global) DMA pool. To make it clear that the existing *_from_coherent routines work on the per-device pool, rename them to *_from_dev_coherent.

[1] https://lkml.org/lkml/2017/7/7/370
[2] https://lkml.org/lkml/2017/7/7/431

Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Andras Szemzo <sza@esh.hu>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 01 July 2017, 1 commit
-
-
Submitted by Vladimir Murzin:
DMA operations for the NOMMU case have just been factored out into a separate compilation unit, so don't keep the dead code.

Tested-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Tested-by: Andras Szemzo <sza@esh.hu>
Tested-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 28 June 2017, 2 commits
-
-
Submitted by Christoph Hellwig:
And instead wire it up as a method on all the dma_map_ops instances. Note that the code seems a little fishy for dmabounce and iommu, but for now I'd like to preserve the existing behavior 1:1.

Signed-off-by: Christoph Hellwig <hch@lst.de>
-
Submitted by Christoph Hellwig:
DMA_ERROR_CODE is going to go away, so don't rely on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
-
- 30 May 2017, 2 commits
-
-
Submitted by Sricharan R:
arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(), dma_ops should be cleared in the teardown path. Currently, only the device's iommu mapping structures are cleared in arch_teardown_dma_ops, but not the dma_ops. So on the next reprobe, the dma_ops left in place are stale from the first IOMMU setup, but the iommu mappings have been disposed of. This is a problem when the probe of the device is deferred and recalled with IOMMU probe deferral. To fix this, slightly refactor by moving the code from __arm_iommu_detach_device to arm_iommu_detach_device and clean up the former. This takes care of resetting the dma_ops in the teardown path.

Fixes: 09515ef5 ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
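A simplified sketch of the end state, assuming the helper names used in the ARM code (the actual patch gets there by folding __arm_iommu_detach_device into arm_iommu_detach_device):

    void arch_teardown_dma_ops(struct device *dev)
    {
            arm_teardown_iommu_dma_ops(dev);
            /* the key point: don't leave stale ops behind for
             * the reprobe after a probe deferral */
            set_dma_ops(dev, NULL);
    }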
-
Submitted by Laurent Pinchart:
arch_setup_dma_ops() is used in device probe code paths to create an IOMMU mapping and attach it to the device. The function assumes that the device is attached to a device-specific IOMMU instance (or at least a device-specific TLB in a shared IOMMU instance) and thus creates a separate mapping for every device.

On several systems (Renesas R-Car Gen2 being one of them), that assumption is not true, and IOMMU mappings must be shared between multiple devices. In those cases the IOMMU driver knows better than the generic ARM dma-mapping layer and attaches the mapping to devices manually with arm_iommu_attach_device(), which sets the DMA ops for the device. The arch_setup_dma_ops() function takes this into account and bails out immediately if the device already has DMA ops assigned.

However, the corresponding arch_teardown_dma_ops() function, called from driver unbind code paths (including probe deferral), will tear the mapping down regardless of who created it. When the device is reprobed, arch_setup_dma_ops() will be called again but won't perform any operation, as the DMA ops will still be set. We need to reset the DMA ops in arch_teardown_dma_ops() to fix this. However, we can't do so unconditionally, as then a new mapping would be created by arch_setup_dma_ops() when the device is reprobed, regardless of whether the device needs to share a mapping or not. We must thus keep track of whether arch_setup_dma_ops() created the mapping, and only in that case tear it down in arch_teardown_dma_ops().

Keep track of that information in the dev_archdata structure. As the structure is embedded in all instances of struct device, let's not grow it, but turn the existing dma_coherent bool field into a bitfield that can be used for other purposes.

Fixes: 09515ef5 ("of/acpi: Configure dma operations at probe time for platform/amba/pci bus devices")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
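A sketch of the bookkeeping this describes, with the existing bool turned into a bitfield (field names follow the description above; other arch-specific fields omitted):

    struct dev_archdata {
            /* ... other fields omitted ... */
            unsigned int dma_coherent:1;   /* was: bool dma_coherent */
            unsigned int dma_ops_setup:1;  /* set only when
                                            * arch_setup_dma_ops() created
                                            * the mapping itself, so only
                                            * then is it torn down */
    };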
-
- 02 May 2017, 1 commit
-
-
Submitted by Stefano Stabellini:
The following commit:

    commit 815dd187
    Author: Bart Van Assche <bart.vanassche@sandisk.com>
    Date:   Fri Jan 20 13:04:04 2017 -0800

        treewide: Consolidate get_dma_ops() implementations

rearranges get_dma_ops so that xen_dma_ops is no longer returned when running on Xen; dev->dma_ops is returned instead (see arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and include/linux/dma-mapping.h:get_dma_ops). Fix the problem by storing dev->dma_ops in dev_archdata and setting dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from dev_archdata when needed. It also allows us to remove __generic_dma_ops from common headers.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Julien Grall <julien.grall@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org> [4.11+]
CC: linux@armlinux.org.uk
CC: catalin.marinas@arm.com
CC: will.deacon@arm.com
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: Julien Grall <julien.grall@arm.com>
-
- 29 April 2017, 1 commit
-
-
Submitted by Laurent Pinchart:
The arch_setup_dma_ops() function is in charge of setting dma_ops with a call to set_dma_ops(). set_dma_ops() is also called from:

- the highbank and mvebu bus notifiers
- dmabounce (to be replaced with swiotlb)
- arm_iommu_attach_device (arm_iommu_attach_device is itself called from IOMMU and bus master device drivers)

To allow the arch_setup_dma_ops() call to be moved from device-add time to device-probe time, we must ensure that dma_ops already set up by any of the above callers will not be overridden. After replacing dmabounce with swiotlb, converting IOMMU drivers to of_xlate, and taking care of highbank and mvebu, the workaround should be removed.

Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
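A hedged sketch of the workaround described above, using the ARM arch_setup_dma_ops() signature of that era:

    void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
                            const struct iommu_ops *iommu, bool coherent)
    {
            /* Workaround: dma_ops already installed by a bus notifier,
             * dmabounce or arm_iommu_attach_device() must win; don't
             * override them here. */
            if (dev->dma_ops)
                    return;

            /* ... normal per-device mapping setup ... */
    }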
-
- 30 March 2017, 1 commit
-
-
Submitted by Russell King:
dma_get_sgtable() tries to create a scatterlist table containing valid struct page pointers for the coherent memory allocation passed in to it. However, memory can be declared via dma_declare_coherent_memory(), or via other reservation schemes, which means that coherent memory is not guaranteed to be backed by struct pages. In such cases, the resulting scatterlist table contains pointers to invalid pages, which causes a kernel oops later. This patch adds detection of such memory and refuses to create a scatterlist table for it.

Reported-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
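A hedged sketch of the detection: if the DMA handle doesn't translate to a valid PFN, there is no struct page to put in the table, so bail out.

    static int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
                                   void *cpu_addr, dma_addr_t handle,
                                   size_t size, unsigned long attrs)
    {
            unsigned long pfn = dma_to_pfn(dev, handle);
            struct page *page;
            int ret;

            /* memory declared via dma_declare_coherent_memory() etc.
             * may not be backed by struct pages: refuse to build a
             * scatterlist table for it */
            if (!pfn_valid(pfn))
                    return -ENXIO;

            page = pfn_to_page(pfn);
            ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
            if (unlikely(ret))
                    return ret;

            sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
            return 0;
    }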
-
- 25 February 2017, 1 commit
-
-
Submitted by Lucas Stach:
The callers of the DMA alloc functions already provide the proper context GFP flags. Make sure to pass them through to the CMA allocator, so that CMA compaction is context aware.

Link: http://lkml.kernel.org/r/20170127172328.18574-3-l.stach@pengutronix.de
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 25 January 2017, 1 commit
-
-
Submitted by Bart Van Assche:
Most dma_map_ops structures are never modified. Constify these structures so that they can be write-protected. This patch has been generated as follows:

    git grep -l 'struct dma_map_ops' | xargs -d\\n sed -i \
      -e 's/struct dma_map_ops/const struct dma_map_ops/g' \
      -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \
      -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \
      -e 's/const const struct dma_map_ops /const struct dma_map_ops /g';
    sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \
      $(git grep -l 'struct dma_map_ops intel_dma_ops');
    sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \
      $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc);
    sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \
      -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \
      -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \
      drivers/pci/host/*.c
    sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c
    sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c
    sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Russell King <linux@armlinux.org.uk>
Cc: x86@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 19 January 2017, 1 commit
-
-
Submitted by Sricharan R:
The newly added DMA_ATTR_PRIVILEGED is useful for creating mappings that are only accessible to privileged DMA engines. Add it to arm dma-mapping.c so that the ARM32 DMA IOMMU mapper can make use of it.

Signed-off-by: Sricharan R <sricharan@codeaurora.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
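From a caller's point of view this is just another attribute bit on the standard allocation API (a sketch; the resulting mapping is usable only by privileged DMA transactions):

    dma_addr_t dma;

    void *cpu = dma_alloc_attrs(dev, SZ_4K, &dma, GFP_KERNEL,
                                DMA_ATTR_PRIVILEGED);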
-
- 11 January 2017, 1 commit
-
-
Submitted by Benjamin Gaignard:
Commit ab6494f0 ("nommu: Add noMMU support to the DMA API") added a CONFIG_MMU compilation guard, but that prohibits using dma_mmap_wc() when the platform doesn't have an MMU. This patch calls vm_iomap_memory() in the noMMU case to check that the addresses are correct and to set vma->vm_flags, rather than always returning an error.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 15 November 2016, 1 commit
-
-
Submitted by Marek Szyprowski:
fs_initcall is definitely too late to initialize DMA-debug hash tables, because some drivers might get probed and use the DMA mapping framework already at core_initcall. Late initialization of DMA-debug results in a false warning about accessing memory that was not allocated, like this one:

    ------------[ cut here ]------------
    WARNING: CPU: 5 PID: 1 at lib/dma-debug.c:1104 check_unmap+0xa1c/0xe50
    exynos-sysmmu 10a60000.sysmmu: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x000000006ebd0000] [size=16384 bytes]
    Modules linked in:
    CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc5-00028-g39dde3d-dirty #44
    Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
    [<c0119dd4>] (unwind_backtrace) from [<c01122bc>] (show_stack+0x20/0x24)
    [<c01122bc>] (show_stack) from [<c062714c>] (dump_stack+0x84/0xa0)
    [<c062714c>] (dump_stack) from [<c0132560>] (__warn+0x14c/0x180)
    [<c0132560>] (__warn) from [<c01325dc>] (warn_slowpath_fmt+0x48/0x50)
    [<c01325dc>] (warn_slowpath_fmt) from [<c06814f8>] (check_unmap+0xa1c/0xe50)
    [<c06814f8>] (check_unmap) from [<c06819c4>] (debug_dma_unmap_page+0x98/0xc8)
    [<c06819c4>] (debug_dma_unmap_page) from [<c076c3e8>] (exynos_iommu_domain_free+0x158/0x380)
    [<c076c3e8>] (exynos_iommu_domain_free) from [<c0764a30>] (iommu_domain_free+0x34/0x60)
    [<c0764a30>] (iommu_domain_free) from [<c011f168>] (release_iommu_mapping+0x30/0xb8)
    [<c011f168>] (release_iommu_mapping) from [<c011f23c>] (arm_iommu_release_mapping+0x4c/0x50)
    [<c011f23c>] (arm_iommu_release_mapping) from [<c0b061ac>] (s5p_mfc_probe+0x640/0x80c)
    [<c0b061ac>] (s5p_mfc_probe) from [<c07e6750>] (platform_drv_probe+0x70/0x148)
    [<c07e6750>] (platform_drv_probe) from [<c07e25c0>] (driver_probe_device+0x12c/0x6b0)
    [<c07e25c0>] (driver_probe_device) from [<c07e2c6c>] (__driver_attach+0x128/0x17c)
    [<c07e2c6c>] (__driver_attach) from [<c07df74c>] (bus_for_each_dev+0x88/0xc8)
    [<c07df74c>] (bus_for_each_dev) from [<c07e1b6c>] (driver_attach+0x34/0x58)
    [<c07e1b6c>] (driver_attach) from [<c07e1350>] (bus_add_driver+0x18c/0x32c)
    [<c07e1350>] (bus_add_driver) from [<c07e4198>] (driver_register+0x98/0x148)
    [<c07e4198>] (driver_register) from [<c07e5cb0>] (__platform_driver_register+0x58/0x74)
    [<c07e5cb0>] (__platform_driver_register) from [<c174cb30>] (s5p_mfc_driver_init+0x1c/0x20)
    [<c174cb30>] (s5p_mfc_driver_init) from [<c0102690>] (do_one_initcall+0x64/0x258)
    [<c0102690>] (do_one_initcall) from [<c17014c0>] (kernel_init_freeable+0x3d0/0x4d0)
    [<c17014c0>] (kernel_init_freeable) from [<c116eeb4>] (kernel_init+0x18/0x134)
    [<c116eeb4>] (kernel_init) from [<c010bbd8>] (ret_from_fork+0x14/0x3c)
    ---[ end trace dc54c54bd3581296 ]---

This patch moves initialization of DMA-debug to core_initcall. This is safe from the initialization perspective: dma_debug_do_init() internally calls debugfs functions, and debugfs also gets initialised at core_initcall(); that is earlier than arch code in the link order, so it will get initialized just before DMA-debug.

Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 27 September 2016, 1 commit
-
-
Submitted by Niklas Söderlund:
Add methods to map/unmap device resource addresses for dma_map_ops implementations that are IOMMU aware. This is needed to map a device's MMIO register from a physical address.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
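In dma_map_ops terms this means wiring up the ->map_resource/->unmap_resource methods; a hedged sketch of the hookup for the IOMMU-aware ops:

    static const struct dma_map_ops iommu_ops = {
            /* ... existing alloc/free/map/sync methods ... */
            .map_resource   = arm_iommu_map_resource,
            .unmap_resource = arm_iommu_unmap_resource,
    };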
-
- 12 August 2016, 1 commit
-
-
Submitted by Fabio Estevam:
According to Documentation/printk-formats.txt, when printing a size_t variable we should use the %zu or %zx format specifier. As we are printing a memory size value, %zu is the better choice here.

Reported-by: Frank Mori Hess <fmh6jj@gmail.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
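For example (a sketch; the message text is illustrative):

    size_t size = atomic_pool_size;

    /* %zu is the conversion for size_t; %d or %u would be wrong on
     * configurations where size_t is not int-sized */
    pr_info("DMA: preallocated %zu bytes for the atomic pool\n", size);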
-
- 04 August 2016, 1 commit
-
-
Submitted by Krzysztof Kozlowski:
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However, the attributes do not have to be a bitfield. Instead, an unsigned long will do fine:

1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing a pointer to them to dma_set_attr(), just set the bits.

2. It brings safety and const-correctness checking, because the attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    @@

    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@

    f(...,
    - NULL
    + 0
    )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;
    @@

    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@

    f(...,
    - NULL
    + 0
    )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
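After the conversion, callers simply OR attribute bits together instead of building a struct dma_attrs on the stack (a sketch):

    /* before: struct dma_attrs attrs;
     *         init_dma_attrs(&attrs);
     *         dma_set_attr(DMA_ATTR_WRITE_COMBINE, &attrs);
     *         buf = dma_alloc_attrs(dev, size, &dma, GFP_KERNEL, &attrs); */

    /* after: attributes are plain bits in an unsigned long */
    void *buf = dma_alloc_attrs(dev, size, &dma, GFP_KERNEL,
                                DMA_ATTR_WRITE_COMBINE |
                                DMA_ATTR_NO_KERNEL_MAPPING);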
-
- 14 July 2016, 2 commits
-
-
Submitted by Gregory CLEMENT:
When doing DMA allocation with an IOMMU, __iommu_alloc_atomic() was used even when the system was coherent. However, this function allocates from a non-cacheable pool, which is fine when the device is not cache coherent but won't work as expected if the device is cache coherent. Indeed, the CPU and the device must access the memory using the same cacheability attributes. Moreover, when the device is coherent, the mmap call must not change the pg_prot flags in the vma struct. arm_coherent_iommu_mmap_attrs has been updated in the same way as was done for arm_dma_mmap in commit 55af8a91 ("ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap").

Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Submitted by Gregory CLEMENT:
When an L2 cache controller is used in a system that provides hardware coherency, the outer cache operations are entirely useless and can be skipped. Moreover, on some systems they are harmful, as they cause deadlocks between the Marvell coherency mechanism, the Marvell PCIe controller and the Cortex-A9. In the current kernel implementation, the outer cache flush range operation is triggered by the dma_alloc function. This operation can take place at runtime, and in some circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x SoCs. This patch extends the __dma_clear_buffer() function to receive a boolean argument describing the coherency of the system. The same is done for the calling functions.

Reported-by: Nadav Haklai <nadavh@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 09 May 2016, 1 commit
-
-
Submitted by Robin Murphy:
As a set of driver-provided callbacks and static data, there is no compelling reason for struct iommu_ops to be mutable in core code, so enforce const-ness throughout.

Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
-
- 15 April 2016, 1 commit
-
-
Submitted by Alexandre Courbot:
Commit 19e6e5e5 ("ARM: 8547/1: dma-mapping: store buffer information") allocates a structure meant for internal buffer management with the GFP flags of the buffer itself. This can trigger the following safeguard in the slab/slub allocator:

    if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
            pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
            BUG();
    }

Fix this by filtering out the flags that make the slab allocator unhappy.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 08 April 2016, 1 commit
-
-
Submitted by Alexandre Courbot:
arm_dma_set_mask() implements exactly the same behavior as the fallback that dma_set_mask() uses if the set_dma_mask op is not set. Remove it and use that fallback instead, as is already done for dma_get_mask().

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 05 March 2016, 2 commits
-
-
Submitted by Rabin Vincent:
Given a device which uses arm_coherent_dma_ops and on which dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA API with gfp=0 results in memory corruption and a memory leak:

    p = dma_alloc_coherent(dev, sz, &dma, 0);
    if (p)
            dma_free_coherent(dev, sz, p, dma);

The memory leak occurs because the alloc allocates using __alloc_simple_buffer() but the free attempts dma_release_from_contiguous(), which does not free anything since the page is not in the CMA area. The memory corruption occurs because the free calls __dma_remap() on a page which is backed only by first-level page tables. The apply_to_page_range() + __dma_update_pte() loop ends up interpreting the section mapping as the address of a second-level page table and writes the new PTE to memory which is not used by page tables. We don't have access to the GFP flags used for allocation in the free function. Fix this by adding allocator backends and using this information in the free function, so that we always use the correct release routine.

Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Submitted by Rabin Vincent:
Keep a list of allocated DMA buffers so that we can store metadata in alloc() which we later need in free().

Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
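A sketch of the per-buffer metadata described across these two commits: the buffer is tracked on a list keyed by its CPU address, and the allocator backend chosen at alloc() time is recorded so free() can pick the matching release path.

    struct arm_dma_buffer {
            struct list_head list;  /* on the global allocated-buffers list */
            void *virt;             /* lookup key: the CPU address */
            struct arm_dma_allocator *allocator; /* backend used at alloc
                                                  * time, consulted in free */
    };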
-
- 11 February 2016, 2 commits
-
-
Submitted by Doug Anderson:
If we know that TLB efficiency will not be an issue when memory is accessed, then it's not terribly important to allocate big chunks of memory. The whole point of allocating the big chunks was to make TLB usage efficient. As Marek Szyprowski indicated:

    Please note that mapping memory with larger pages significantly
    improves performance, especially when IOMMU has a little TLB
    cache. This can be easily observed when multimedia devices do
    processing of RGB data with 90/270 degree rotation.

Image rotation is distinctly an operation that needs to bounce around through memory, so it makes sense that TLB efficiency is important there. Video decoding, on the other hand, is a fairly sequential operation. During video decoding it's not expected that we'll be jumping all over memory. Decoding video is also pretty heavy, and the TLB misses aren't a huge deal. Presumably most HW video acceleration users of dma-mapping will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.

Allocating big chunks of memory is quite expensive, especially if we're doing it repeatedly and memory is full. In one (out of tree) usage model it is common that arm_iommu_alloc_attrs() is called 16 times in a row, each call trying to allocate 4 MB of memory. This happens whenever the system encounters a new video, which could easily occur while the memory system is stressed out. In fact, on certain social media websites that auto-play video and have infinite scrolling, it's quite common to see not just one of these 16x4MB allocation runs but 2 or 3 right after another. Asking the system to do even a small amount of extra work to give us big chunks in this case is just not a good use of time.

Allocating big chunks of memory is also expensive indirectly. Even if we ask the system not to do ANY extra work to allocate _our_ memory, we're still potentially eating up all the big chunks in the system. Presumably there are other users in the system that aren't quite as flexible and that actually need these big chunks. By eating all the big chunks we're causing extra work for the rest of the system. We also may start making other memory allocations fail. While the system may be robust to such failures (as is the case with dwc2 USB trying to allocate buffers for Ethernet data and with WiFi trying to allocate buffers for WiFi data), it is yet another big performance hit.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
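From a caller's perspective the attribute is a plain allocation hint; a sketch using today's unsigned long attrs interface (at the time of this patch attributes still traveled in a struct dma_attrs, but the flag name is the same):

    /* sequential access pattern: TLB pressure doesn't matter, so
     * don't spend time hunting for large contiguous chunks */
    void *buf = dma_alloc_attrs(dev, size, &dma, GFP_KERNEL,
                                DMA_ATTR_ALLOC_SINGLE_PAGES);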
-
Submitted by Doug Anderson:
__iommu_alloc_buffer() is expected to be called to allocate pretty sizeable buffers. In simple video tests I saw it trying to allocate 4,194,304 bytes. The function tries to allocate large chunks in order to optimize IOMMU TLB usage.

The current function is very, very slow. One problem is the way it keeps trying and trying to allocate big chunks. Imagine a very fragmented memory that has 4M free but no contiguous pages at all. Further imagine allocating 4M (1024 pages). We'll do the following memory allocations:

- For page 1:
  - Try to allocate order 10 (no retry)
  - Try to allocate order 9 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- For page 2:
  - Try to allocate order 9 (no retry)
  - Try to allocate order 8 (no retry)
  - ...
  - Try to allocate order 0 (with retry, but not needed)
- ...

The total number of calls to alloc() for this case is:

    sum(int(math.log(i, 2)) + 1 for i in range(1, 1025)) => 9228

The above is obviously the worst case, but given how slow alloc can be, we really want to avoid even somewhat bad cases. I timed the old code with a device under memory pressure and it wasn't hard to see it take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing was done on kernel 3.14, so possibly mainline would behave differently.)

A second problem is that allocating big chunks under memory pressure when we don't need them is just not a great idea anyway. We can make do pretty well with smaller chunks, so it's probably wise to leave bigger chunks for other users once memory pressure is on.

Let's adjust the allocation like this:

1. If a big chunk fails, stop trying so hard and bump down to lower order allocations.
2. Don't try useless orders. The whole point of big chunks is to optimize the TLB, and it can really only make use of 2M, 1M, 64K and 4K sizes (see the sketch below).

We'll still tend to eat up a bunch of big chunks, but that might be the right answer for some users. A future patch could possibly add a new DMA_ATTR that would let the caller decide that TLB optimization isn't important and that we should use smaller chunks. Presumably this would be a sane strategy for some callers.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
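A hedged sketch of point 2 above: with 4 KB base pages, only a handful of page orders map onto sizes the IOMMU TLB can exploit, so the allocator walks this table instead of every order.

    /* page orders worth trying: 2M (order 9), 1M (order 8),
     * 64K (order 4) and 4K (order 0) with 4K base pages */
    static const int iommu_order_array[] = { 9, 8, 4, 0 };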
-
- 23 January 2016, 1 commit
-
-
Submitted by Tetsuo Handa:
There are many locations that do:

    if (memory_was_allocated_by_vmalloc)
            vfree(ptr);
    else
            kfree(ptr);

but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory using is_vmalloc_addr(). Unless callers have special reasons, we can replace this branch with kvfree(). Please check and reply if you found problems.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Acked-by: David Rientjes <rientjes@google.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Boris Petkov <bp@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
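The replacement itself, in short (a sketch):

    /* kvfree() performs the is_vmalloc_addr() check internally,
     * so the whole branch above collapses to a single call */
    kvfree(ptr);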
-
- 16 December 2015, 1 commit
-
-
Submitted by Dan Williams:
Commit db0fa0cb ("scatterlist: use sg_phys()") did replacements of the form:

    phys_addr_t phys = page_to_phys(sg_page(s));
    phys_addr_t phys = sg_phys(s) & PAGE_MASK;

However, this breaks platforms where sizeof(phys_addr_t) > sizeof(unsigned long). Revert for 4.3 and 4.4 to make room for a combined helper in 4.5.

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: db0fa0cb ("scatterlist: use sg_phys()")
Suggested-by: Joerg Roedel <joro@8bytes.org>
Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 07 November 2015, 1 commit
-
-
Submitted by Mel Gorman:
mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd

__GFP_WAIT has been used to identify atomic context in callers that hold spinlocks or are in interrupts. They are expected to be high priority and have access to one of two watermarks lower than "min", which can be referred to as the "atomic reserve". __GFP_HIGH users get access to the first lower watermark and can be called the "high priority reserve".

Over time, callers had a requirement to not block when fallback options were available. Some have abused __GFP_WAIT, leading to a situation where an optimistic allocation with a fallback option can access atomic reserves.

This patch uses __GFP_ATOMIC to identify callers that are truly atomic, cannot sleep and have no alternative. High priority users continue to use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies callers that want to wake kswapd for background reclaim. __GFP_WAIT is redefined as a caller that is willing to enter direct reclaim and wake kswapd for background reclaim.

This patch then converts a number of sites:

o __GFP_ATOMIC is used by callers that are high priority and have memory pools for those requests. GFP_ATOMIC uses this flag.

o Callers that have a limited mempool to guarantee forward progress clear __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall into this category, where kswapd will still be woken but atomic reserves are not used, as there is a one-entry mempool to guarantee progress.

o Callers that are checking if they are non-blocking should use the helper gfpflags_allow_blocking() where possible. This is because checking for __GFP_WAIT as was done historically can now trigger false positives. Some exceptions like dm-crypt.c exist, where the code intent is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to flag manipulations.

o Callers that built their own GFP flags instead of starting with GFP_KERNEL and friends now also need to specify __GFP_KSWAPD_RECLAIM.

The first key hazard to watch out for is callers that removed __GFP_WAIT and were depending on access to atomic reserves for inconspicuous reasons. In some cases it may be appropriate for them to use __GFP_HIGH.

The second key hazard is callers that assembled their own combination of GFP flags instead of starting with something like GFP_KERNEL. They may now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless if it's missed in most cases, as other activity will wake kswapd.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
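The helper mentioned above replaces open-coded __GFP_WAIT tests; in the ARM dma-mapping code the resulting pattern looks roughly like this (a sketch, with __alloc_from_pool() as the assumed atomic-pool fallback):

    if (!gfpflags_allow_blocking(gfp))
            /* atomic context: carve the buffer out of the
             * pre-populated atomic pool rather than risking a
             * sleeping page allocation */
            return __alloc_from_pool(size, &page);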
-
- 03 October 2015, 2 commits
-
-
Submitted by Marek Szyprowski:
The IOMMU-based dma_mmap() implementation lacked proper support for the offset parameter used in mmap calls (it always assumed that the mapping starts from offset zero). This patch adds support for the offset parameter to the IOMMU-based implementation.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Submitted by Marek Szyprowski:
The dma_mmap() function in the IOMMU-based dma-mapping implementation lacked a check on the valid range of the mmap parameters (offset and buffer size), which might have allowed access beyond the allocated buffer. This patch fixes that issue.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: stable@vger.kernel.org # v3.6+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
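A hedged sketch of the added validation, checking the mmap offset and length against the size of the underlying buffer:

    unsigned long nr_vma_pages = vma_pages(vma);
    unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
    unsigned long off = vma->vm_pgoff;

    /* reject mappings that start past, or would run past,
     * the end of the allocated buffer */
    if (off >= nr_pages || nr_vma_pages > nr_pages - off)
            return -ENXIO;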
-
- 17 September 2015, 1 commit
-
-
Submitted by Andre Przywara:
Commit 96231b26 ("ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use dma_addr_t, which makes the compiler barf on assigning this to an "int" variable on ARM with LPAE enabled:

    In file included from /src/linux/include/linux/dma-mapping.h:86:0,
                     from /src/linux/arch/arm/mm/dma-mapping.c:21:
    /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
    /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning: overflow in implicit constant conversion [-Woverflow]
     #define DMA_ERROR_CODE (~(dma_addr_t)0x0)
                            ^
    /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of macro 'DMA_ERROR_CODE'
      int i, ret = DMA_ERROR_CODE;
                   ^

Remove the actually unneeded initialization of "ret" in __iommu_create_mapping() and move the variable declaration inside the for-loop to make the scope of this variable clearer.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 11 September 2015, 1 commit
-
-
Submitted by Christoph Hellwig:
Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt-outs for two of them, as a few architectures have very non-standard implementations.

This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences:

- the debug_dma helpers are now called from all architectures, including those that were previously missing them
- dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops; dma-mapping-common.h always includes dma-coherent.h to get the definitions for them, or the stubs if the architecture doesn't support this feature
- checks for ->alloc / ->free presence are removed; there is only one instance of dma_map_ops without them (mic_dma_ops), and that one is x86-only anyway

Besides that, only x86 needs special treatment to replace a default device if none is passed and to tweak the gfp_flags. An optional arch hook is provided for that.

[linux@roeck-us.net: fix build]
[jcmvbkbc@gmail.com: fix xtensa]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 17 August 2015, 1 commit
-
-
Submitted by Dan Williams:
Coccinelle cleanup to replace open-coded sg-to-physical-address translations. This is in preparation for introducing scatterlists that reference __pfn_t.

    // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg)
    // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch

    virtual patch

    @@
    struct scatterlist *sg;
    @@

    - page_to_phys(sg_page(sg)) + sg->offset
    + sg_phys(sg)

    @@
    struct scatterlist *sg;
    @@

    - page_to_phys(sg_page(sg))
    + sg_phys(sg) & PAGE_MASK

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 04 August 2015, 1 commit
-
-
Submitted by Lorenzo Nava:
This patch allows the use of CMA for DMA coherent memory allocation. At the moment, if the input parameter "is_coherent" is set to true, the allocation is not made using CMA, which I think is not the desired behaviour. The patch covers both allocation and freeing of memory for coherent DMA.

Signed-off-by: Lorenzo Nava <lorenx4@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-