1. 15 Apr, 2016 (1 commit)
  2. 05 Mar, 2016 (2 commits)
    • ARM: 8546/1: dma-mapping: refactor to fix coherent+cma+gfp=0 · b4268676
      Committed by Rabin Vincent
      Given a device which uses arm_coherent_dma_ops and on which
      dev_get_cma_area(dev) returns non-NULL, the following usage of the DMA
      API with gfp=0 results in memory corruption and a memory leak.
      
       p = dma_alloc_coherent(dev, sz, &dma, 0);
       if (p)
       	dma_free_coherent(dev, sz, p, dma);
      
      The memory leak is because the alloc allocates using
      __alloc_simple_buffer() but the free attempts
      dma_release_from_contiguous(), which does not free anything since the
      page is not in the CMA area.
      
      The memory corruption is because the free calls __dma_remap() on a page
      which is backed only by first level page tables.  The
      apply_to_page_range() + __dma_update_pte() loop ends up interpreting the
      section mapping as the address of a second level page table and writing
      the new PTE to memory which is not used by page tables.
      
      We don't have access to the GFP flags used for allocation in the free
      function.  Fix this by adding allocator backends and using this
      information in the free function so that we always use the correct
      release routine.
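
      A minimal sketch of the backend idea (the names here are illustrative,
      not necessarily the exact symbols introduced by the patch): the alloc
      path records which backend produced the buffer, so the free path can
      dispatch to the matching release routine instead of guessing from the
      unavailable GFP flags.

          struct arm_dma_backend {
                  struct page *(*alloc)(struct device *dev, size_t size,
                                        gfp_t gfp, void **cpu_addr);
                  void (*free)(struct device *dev, size_t size,
                               void *cpu_addr, struct page *page);
          };

          struct arm_dma_buf {
                  struct list_head list;                  /* see next commit */
                  void *virt;
                  struct page *page;
                  const struct arm_dma_backend *backend;  /* set at alloc time */
          };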
      
      Fixes: 21caf3a7 ("ARM: 8398/1: arm DMA: Fix allocation from CMA for coherent DMA")
      Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8547/1: dma-mapping: store buffer information · 19e6e5e5
      Committed by Rabin Vincent
      Keep a list of allocated DMA buffers so that we can store metadata in
      alloc() which we later need in free().
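
      A hedged sketch of such bookkeeping (a global list plus a spinlock,
      filled in alloc() and searched/removed in free(); the helper name and
      the arm_dma_buf structure reuse the illustrative names from the
      previous commit's sketch):

          static LIST_HEAD(arm_dma_bufs);
          static DEFINE_SPINLOCK(arm_dma_bufs_lock);

          static struct arm_dma_buf *arm_dma_buf_find(void *virt)
          {
                  struct arm_dma_buf *buf, *found = NULL;
                  unsigned long flags;

                  spin_lock_irqsave(&arm_dma_bufs_lock, flags);
                  list_for_each_entry(buf, &arm_dma_bufs, list) {
                          if (buf->virt == virt) {
                                  list_del(&buf->list);
                                  found = buf;
                                  break;
                          }
                  }
                  spin_unlock_irqrestore(&arm_dma_bufs_lock, flags);
                  return found;   /* free() then uses found->backend etc. */
          }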
      Signed-off-by: Rabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  3. 11 Feb, 2016 (2 commits)
    • ARM: 8507/1: dma-mapping: Use DMA_ATTR_ALLOC_SINGLE_PAGES hint to optimize alloc · 14d3ae2e
      Committed by Doug Anderson
      If we know that TLB efficiency will not be an issue when memory is
      accessed then it's not terribly important to allocate big chunks of
      memory.  The whole point of allocating the big chunks was that it would
      make TLB usage efficient.
      
      As Marek Szyprowski indicated:
          Please note that mapping memory with larger pages significantly
          improves performance, especially when IOMMU has a little TLB
          cache. This can be easily observed when multimedia devices do
          processing of RGB data with 90/270 degree rotation
      Image rotation is distinctly an operation that needs to bounce around
      through memory, so it makes sense that TLB efficiency is important
      there.
      
      Video decoding, on the other hand, is a fairly sequential operation.
      During video decoding it's not expected that we'll be jumping all over
      memory.  Decoding video is also pretty heavy and the TLB misses aren't a
      huge deal.  Presumably most HW video acceleration users of dma-mapping
      will not care about huge pages and will set DMA_ATTR_ALLOC_SINGLE_PAGES.
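
      For illustration, a driver whose device walks its buffer sequentially
      might opt in roughly like this (the modern bitmask-style attrs API is
      shown; the patch itself predates that interface change):

          static int example_video_alloc(struct device *dev)
          {
                  dma_addr_t dma;
                  void *buf;

                  /* Hint: sequential access, TLB efficiency not critical. */
                  buf = dma_alloc_attrs(dev, SZ_4M, &dma, GFP_KERNEL,
                                        DMA_ATTR_ALLOC_SINGLE_PAGES);
                  if (!buf)
                          return -ENOMEM;

                  /* ... hand buf/dma to the decoder ... */

                  dma_free_attrs(dev, SZ_4M, buf, dma,
                                 DMA_ATTR_ALLOC_SINGLE_PAGES);
                  return 0;
          }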
      
      Allocating big chunks of memory is quite expensive, especially if we're
      doing it repeatedly and memory is full.  In one (out of tree) usage model
      it is common that arm_iommu_alloc_attrs() is called 16 times in a row,
      each one trying to allocate 4 MB of memory.  This is called whenever the
      system encounters a new video, which could easily happen while the
      memory system is stressed out.  In fact, on certain social media
      websites that auto-play video and have infinite scrolling, it's quite
      common to see not just one of these 16x4MB allocations but 2 or 3 right
      after another.  Asking the system even to do a small amount of extra
      work to give us big chunks in this case is just not a good use of time.
      
      Allocating big chunks of memory is also expensive indirectly.  Even if
      we ask the system not to do ANY extra work to allocate _our_ memory,
      we're still potentially eating up all big chunks in the system.
      Presumably there are other users in the system that aren't quite as
      flexible and that actually need these big chunks.  By eating all the big
      chunks we're causing extra work for the rest of the system.  We also may
      start making other memory allocations fail.  While the system may be
      robust to such failures (as is the case with dwc2 USB trying to allocate
      buffers for Ethernet data and with WiFi trying to allocate buffers for
      WiFi data), it is yet another big performance hit.
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • ARM: 8505/1: dma-mapping: Optimize allocation · 33298ef6
      Committed by Doug Anderson
      The __iommu_alloc_buffer() is expected to be called to allocate pretty
      sizeable buffers.  Upon simple tests of video I saw it trying to
      allocate 4,194,304 bytes.  The function tries to allocate large chunks
      in order to optimize IOMMU TLB usage.
      
      The current function is very, very slow.
      
      One problem is the way it keeps trying and trying to allocate big
      chunks.  Imagine a very fragmented memory that has 4M free but no
      contiguous pages at all.  Further imagine allocating 4M (1024 pages).
      We'll do the following memory allocations:
      - For page 1:
        - Try to allocate order 10 (no retry)
        - Try to allocate order 9 (no retry)
        - ...
        - Try to allocate order 0 (with retry, but not needed)
      - For page 2:
        - Try to allocate order 9 (no retry)
        - Try to allocate order 8 (no retry)
        - ...
        - Try to allocate order 0 (with retry, but not needed)
      - ...
      - ...
      
      The total number of alloc() calls for this case is:
        sum(int(math.log(i, 2)) + 1 for i in range(1, 1025))
        => 9228
      
      The above is obviously the worst case, but given how slow alloc can be we
      really want to avoid even somewhat bad cases.  I timed the old
      code with a device under memory pressure and it wasn't hard to see it
      take more than 120 seconds to allocate 4 megs of memory! (NOTE: testing
      was done on kernel 3.14, so possibly mainline would behave
      differently).
      
      A second problem is that allocating big chunks under memory pressure is
      just not a great idea unless we really need them.  We can make do pretty
      well with smaller chunks, so it's probably wise to leave the bigger
      chunks for other users once memory pressure is on.
      
      Let's adjust the allocation like this:
      
      1. If a big chunk fails, stop trying so hard and bump down to lower
         order allocations.
      2. Don't try useless orders.  The whole point of big chunks is to
         optimize the TLB and it can really only make use of 2M, 1M, 64K and
         4K sizes.
      
      We'll still tend to eat up a bunch of big chunks, but that might be the
      right answer for some users.  A future patch could possibly add a new
      DMA_ATTR that would let the caller decide that TLB optimization isn't
      important and that we should use smaller chunks.  Presumably this would
      be a sane strategy for some callers.
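
      A rough sketch of the adjusted strategy (illustrative, not the exact
      mainline loop): only try orders that actually help the IOMMU TLB, never
      retry a failed high-order attempt, and simply fall down to the next
      useful order instead.

          static const unsigned int useful_orders[] = { 9, 8, 4, 0 }; /* 2M/1M/64K/4K */

          static struct page *alloc_one_chunk(gfp_t gfp, unsigned int start_idx)
          {
                  unsigned int i;

                  for (i = start_idx; i < ARRAY_SIZE(useful_orders); i++) {
                          unsigned int order = useful_orders[i];
                          gfp_t mask = gfp | __GFP_NOWARN;
                          struct page *page;

                          if (order)
                                  mask |= __GFP_NORETRY;  /* don't try too hard */

                          page = alloc_pages(mask, order);
                          if (page)
                                  return page;
                  }
                  return NULL;
          }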
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Tomasz Figa <tfiga@chromium.org>
      Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  4. 23 Jan, 2016 (1 commit)
  5. 16 Dec, 2015 (1 commit)
    • Revert "scatterlist: use sg_phys()" · 3e6110fd
      Committed by Dan Williams
      commit db0fa0cb "scatterlist: use sg_phys()" did replacements of
      the form:
      
          phys_addr_t phys = page_to_phys(sg_page(s));
          phys_addr_t phys = sg_phys(s) & PAGE_MASK;
      
      However, this breaks platforms where sizeof(phys_addr_t) >
      sizeof(unsigned long).  Revert for 4.3 and 4.4 to make room for a
      combined helper in 4.5.
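
      The failure mode is easiest to see on 32-bit ARM with LPAE, where
      phys_addr_t is 64-bit but PAGE_MASK is built from a 32-bit unsigned
      long (a hedged illustration, not text from the original commit):

          phys_addr_t phys = 0x100003abcULL;   /* an address just above 4 GiB */

          /* old form (page_to_phys(sg_page(s))): page-aligned, upper bits intact */
          phys_addr_t old = phys & ~(phys_addr_t)(PAGE_SIZE - 1);  /* 0x100003000 */

          /* new form (sg_phys(s) & PAGE_MASK): PAGE_MASK is 0xfffff000UL here,
           * zero-extended for the AND, so bits above bit 31 are silently lost */
          phys_addr_t bad = phys & PAGE_MASK;                      /* 0x000003000 */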
      
      Cc: <stable@vger.kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Fixes: db0fa0cb ("scatterlist: use sg_phys()")
      Suggested-by: Joerg Roedel <joro@8bytes.org>
      Reported-by: Vitaly Lavrov <vel21ripn@gmail.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  6. 07 Nov, 2015 (1 commit)
    • mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd · d0164adc
      Committed by Mel Gorman
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access to one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimistic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM identifies
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites:
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible (see the sketch after
        this list).  This is because checking for __GFP_WAIT, as was done
        historically, can now trigger false positives.  Some exceptions like
        dm-crypt.c exist where the code intent is clearer if
        __GFP_DIRECT_RECLAIM is used instead of the helper due to flag
        manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
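
      An illustrative before/after for the third point above (try_atomic_pool()
      is a hypothetical fallback, used only to show the shape of the check):

          static void *example_alloc(gfp_t gfp_mask, size_t size)
          {
                  /* old, now misleading: if (!(gfp_mask & __GFP_WAIT)) ... */

                  /* new: ask whether this context may block at all */
                  if (!gfpflags_allow_blocking(gfp_mask))
                          return try_atomic_pool(size);   /* hypothetical */

                  return kmalloc(size, gfp_mask);
          }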
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and were depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 03 Oct, 2015 (2 commits)
  8. 17 Sep, 2015 (1 commit)
    • ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition · 90cde558
      Committed by Andre Przywara
      Commit 96231b26: ("ARM: 8419/1: dma-mapping: harmonize definition
      of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use
      dma_addr_t, which makes the compiler barf on assigning this to an
      "int" variable on ARM with LPAE enabled:
      *************
      In file included from /src/linux/include/linux/dma-mapping.h:86:0,
                       from /src/linux/arch/arm/mm/dma-mapping.c:21:
      /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
      /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning:
      overflow in implicit constant conversion [-Woverflow]
       #define DMA_ERROR_CODE (~(dma_addr_t)0x0)
                              ^
      /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of
      macro 'DMA_ERROR_CODE'
        int i, ret = DMA_ERROR_CODE;
                     ^
      *************
      
      Remove the actually unneeded initialization of "ret" in
      __iommu_create_mapping() and move the variable declaration inside the
      for-loop to make the scope of this variable more clear.
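
      For illustration (not the verbatim diff), the warning and the shape of
      the fix:

          /* With LPAE, dma_addr_t is 64 bits wide, so the all-ones error
           * cookie cannot fit in an int:                                  */
          int ret = DMA_ERROR_CODE;    /* ~(dma_addr_t)0 -> -Woverflow     */

          /* 'ret' never needed that value in the first place, so the fix
           * drops the initializer and declares the loop-scoped variable
           * where it is actually used.                                    */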
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  9. 11 Sep, 2015 (1 commit)
    • dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} · 6894258e
      Committed by Christoph Hellwig
      Since 2009 we have a nice asm-generic header implementing lots of DMA API
      functions for architectures using struct dma_map_ops, but unfortunately
      it's still missing a lot of APIs that all architectures still have to
      duplicate.
      
      This series consolidates the remaining functions, although we still need
      arch opt outs for two of them as a few architectures have very
      non-standard implementations.
      
      This patch (of 5):
      
      The coherent DMA allocator works the same over all architectures supporting
      dma_map operations.
      
      This patch consolidates them and converges the minor differences:
      
       - the debug_dma helpers are now called from all architectures, including
         those that were previously missing them
       - dma_alloc_from_coherent and dma_release_from_coherent are now always
         called from the generic alloc/free routines instead of the ops;
         dma-mapping-common.h always includes dma-coherent.h to get the
         definitions for them, or the stubs if the architecture doesn't
         support this feature
       - checks for ->alloc / ->free presence are removed.  There is only one
         instance of dma_map_ops without them (mic_dma_ops) and that one
         is x86 only anyway.
      
      Besides that, only x86 needs special treatment to replace a default device
      if none is passed and to tweak the gfp_flags.  An optional arch hook is
      provided for that.
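
      The consolidated allocator ends up with roughly this shape (a sketch of
      the idea, not the verbatim header; dma_attrs was still a struct at the
      time):

          static inline void *dma_alloc_attrs(struct device *dev, size_t size,
                                              dma_addr_t *dma_handle, gfp_t gfp,
                                              struct dma_attrs *attrs)
          {
                  struct dma_map_ops *ops = get_dma_ops(dev);
                  void *cpu_addr;

                  BUG_ON(!ops);

                  /* per-device coherent pools are handled generically */
                  if (dma_alloc_from_coherent(dev, size, dma_handle, &cpu_addr))
                          return cpu_addr;

                  /* optional arch hook, e.g. x86 default-device/gfp tweaks */
                  if (!arch_dma_alloc_attrs(&dev, &gfp))
                          return NULL;

                  cpu_addr = ops->alloc(dev, size, dma_handle, gfp, attrs);
                  debug_dma_alloc_coherent(dev, size, *dma_handle, cpu_addr);
                  return cpu_addr;
          }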
      
      [linux@roeck-us.net: fix build]
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 17 Aug, 2015 (1 commit)
    • scatterlist: use sg_phys() · db0fa0cb
      Committed by Dan Williams
      Coccinelle cleanup to replace open coded sg to physical address
      translations.  This is in preparation for introducing scatterlists that
      reference __pfn_t.
      
      // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg)
      // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch
      
      virtual patch
      
      @@
      struct scatterlist *sg;
      @@
      
      - page_to_phys(sg_page(sg)) + sg->offset
      + sg_phys(sg)
      
      @@
      struct scatterlist *sg;
      @@
      
      - page_to_phys(sg_page(sg))
      + sg_phys(sg) & PAGE_MASK
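
      For reference, sg_phys() boils down to roughly the following (its exact
      return type has varied over time), which is what makes the first
      replacement a direct equivalence:

          static inline dma_addr_t sg_phys(struct scatterlist *sg)
          {
                  return page_to_phys(sg_page(sg)) + sg->offset;
          }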
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  11. 04 Aug, 2015 (1 commit)
  12. 02 Aug, 2015 (1 commit)
    • ARM: reduce visibility of dmac_* functions · 1234e3fd
      Committed by Russell King
      The dmac_* functions are private to the ARM DMA API implementation, and
      should not be used by drivers.  In order to discourage their use, remove
      their prototypes and macros from asm/*.h.
      
      We have to leave dmac_flush_range() behind as Exynos and MSM IOMMU code
      use these; once these sites are fixed, this can be moved also.
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  13. 17 Jul, 2015 (1 commit)
  14. 06 Jun, 2015 (1 commit)
    • ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap · 55af8a91
      Committed by Mike Looijmans
      When dma-coherent transfers are enabled, the mmap call must
      not change the pg_prot flags in the vma struct.
      
      Split arm_dma_mmap into common and specific parts, and add an
      "arm_coherent_dma_mmap" implementation that does not alter the page
      protection flags.
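
      A sketch of the resulting split (illustrative; attrs arguments and
      error handling omitted):

          static int __arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
                                    void *cpu_addr, dma_addr_t dma_addr, size_t size)
          {
                  /* the common remap_pfn_range() work lives here */
                  return remap_pfn_range(vma, vma->vm_start,
                                         dma_to_pfn(dev, dma_addr) + vma->vm_pgoff,
                                         size, vma->vm_page_prot);
          }

          int arm_dma_mmap(struct device *dev, struct vm_area_struct *vma,
                           void *cpu_addr, dma_addr_t dma_addr, size_t size)
          {
                  /* non-coherent: downgrade the mapping to uncached/writecombine */
                  vma->vm_page_prot = pgprot_dmacoherent(vma->vm_page_prot);
                  return __arm_dma_mmap(dev, vma, cpu_addr, dma_addr, size);
          }

          static int arm_coherent_dma_mmap(struct device *dev,
                                           struct vm_area_struct *vma, void *cpu_addr,
                                           dma_addr_t dma_addr, size_t size)
          {
                  /* coherent: leave vm_page_prot alone so the map stays cacheable */
                  return __arm_dma_mmap(dev, vma, cpu_addr, dma_addr, size);
          }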
      
      Tested on a topic-miami board (Zynq) using the ACP port
      to transfer data between FPGA and CPU using the Dyplo
      framework. Without this patch, byte-wise access to mmapped
      coherent DMA memory was about 20x slower because of the
      memory being marked as non-cacheable, and transfer speeds
      would not exceed 240MB/s.
      
      After this patch, the mapped memory is cacheable and the
      transfer speed is again 600MB/s (limited by the FPGA) when
      the data is in the L2 cache, while data integrity is being
      maintained.
      
      The patch has no effect on non-coherent DMA.
      Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  15. 04 May, 2015 (1 commit)
  16. 02 Apr, 2015 (1 commit)
  17. 18 Mar, 2015 (1 commit)
  18. 13 Mar, 2015 (1 commit)
  19. 11 Mar, 2015 (1 commit)
    • ARM: dma-api: fix off-by-one error in __dma_supported() · 8bf1268f
      Committed by Russell King
      When validating the mask against the amount of memory we have available
      (so that we can trap 32-bit DMA addresses on systems with more than 32
      bits of memory), we had not taken into account that max_pfn is one more
      than the highest PFN present in the system.
      
      There are several references in the code which bear this out:
      
      mm/page_owner.c:
      	for (; pfn < max_pfn; pfn++) {
      	}
      
      arch/x86/kernel/setup.c:
      	high_memory = (void *)__va(max_pfn * PAGE_SIZE - 1)
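
      Illustratively, since max_pfn is "highest PFN + 1", the last page of
      RAM is PFN max_pfn - 1, so the mask check needs a comparison along
      these lines (a sketch, not the exact kernel code):

          /* does the device's mask reach the last page of RAM? */
          if (dma_to_pfn(dev, mask) < max_pfn - 1)
                  return 0;      /* no: mask cannot cover all of memory */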
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  20. 23 Feb, 2015 (1 commit)
  21. 20 Feb, 2015 (1 commit)
  22. 30 Jan, 2015 (1 commit)
  23. 29 Jan, 2015 (1 commit)
  24. 02 Dec, 2014 (1 commit)
  25. 30 Oct, 2014 (1 commit)
  26. 10 Oct, 2014 (2 commits)
  27. 07 Aug, 2014 (1 commit)
  28. 18 Jul, 2014 (1 commit)
  29. 22 May, 2014 (2 commits)
  30. 20 May, 2014 (1 commit)
  31. 07 May, 2014 (1 commit)
  32. 23 Apr, 2014 (1 commit)
  33. 28 Feb, 2014 (2 commits)
    • arm: dma-mapping: remove order parameter from arm_iommu_create_mapping() · 68efd7d2
      Committed by Marek Szyprowski
      The 'order' parameter for the IOMMU-aware dma-mapping implementation was
      introduced mainly as a hack to reduce the size of the bitmap used for
      tracking IO virtual address space.  Since it is now possible to dynamically
      resize the bitmap, this hack is not needed and can be removed without any
      impact on the client devices.  This way the parameters for
      arm_iommu_create_mapping() become much easier to understand.  The 'size'
      parameter now means the maximum supported IO address space size.
      
      The code will allocate (resize) the bitmap in chunks, ensuring that a single
      chunk is not larger than a single memory page, to avoid unreliable
      allocations larger than PAGE_SIZE in atomic context.
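
      As a worked example of the chunk sizing (illustrative numbers, 4 KiB
      pages assumed):

          /* one bitmap chunk is at most one page: 4096 * 8 = 32768 bits;
           * with one bit per 4 KiB IOVA page, each chunk tracks 128 MiB of
           * IO virtual address space, and larger mappings use more chunks */
          size_t bits_per_chunk = PAGE_SIZE * BITS_PER_BYTE;   /* 32768    */
          size_t iova_per_chunk = bits_per_chunk * PAGE_SIZE;  /* 128 MiB  */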
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
    • arm: dma-mapping: Add support to extend DMA IOMMU mappings · 4d852ef8
      Committed by Andreas Herrmann
      Instead of using just one bitmap to keep track of IO virtual addresses
      (handed out for IOMMU use), introduce an array of bitmaps.  This allows
      us to extend existing mappings when running out of iova space in the
      initial mapping etc.
      
      If there is not enough space in the mapping to service an IO virtual
      address allocation request, __alloc_iova() tries to extend the mapping
      -- by allocating another bitmap -- and makes another allocation
      attempt using the freshly allocated bitmap.
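
      A sketch of that extend-and-retry flow (the helper names here are
      hypothetical, not the exact functions from the patch):

          static dma_addr_t example_alloc_iova(struct dma_iommu_mapping *mapping,
                                               size_t size)
          {
                  dma_addr_t iova;

                  iova = alloc_from_existing_bitmaps(mapping, size);  /* hypothetical */
                  if (iova != DMA_ERROR_CODE)
                          return iova;

                  /* out of space: add one more bitmap ("extension")... */
                  if (extend_iommu_mapping(mapping))                  /* hypothetical */
                          return DMA_ERROR_CODE;

                  /* ...and retry in the freshly allocated bitmap */
                  return alloc_from_existing_bitmaps(mapping, size);
          }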
      
      This allows arm iommu drivers to start with a decent initial size when a
      dma_iommu_mapping is created and still avoid running out of IO
      virtual addresses for the mapping.
      Signed-off-by: Andreas Herrmann <andreas.herrmann@calxeda.com>
      [mszyprow: removed extensions parameter to arm_iommu_create_mapping()
       function, which will be modified in the next patch anyway, also some
       debug messages about extending bitmap]
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
  34. 19 Feb, 2014 (1 commit)
    • ARM: 7979/1: mm: Remove hugetlb warning from Coherent DMA allocator · 6ea41c80
      Committed by Steven Capper
      The Coherent DMA allocator allocates pages of high order then splits
      them up into smaller pages.
      
      This splitting logic would run into problems if the allocator were
      given compound pages.  Thus the Coherent DMA allocator was originally
      incompatible with compound pages and, by extension, huge pages.  A
      compile-time #error was put in place whenever huge pages were
      enabled.
      
      Compatibility with compound pages has since been introduced by the
      following commit (which merely excludes GFP_COMP pages from being
      requested by the coherent DMA allocator):
        ea2e7057 ARM: 7172/1: dma: Drop GFP_COMP for DMA memory allocations
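
      That work-around boils down to roughly one line in the allocation path
      (a sketch of the referenced commit, not a verbatim quote):

          gfp &= ~(__GFP_COMP);   /* split_page() cannot handle compound pages */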
      
      When huge page support was introduced to ARM, the compile #error in
      dma-mapping.c was replaced by a #warning when it should have been
      removed instead.
      
      This patch removes the compile #warning in dma-mapping.c when huge
      pages are enabled.
      Signed-off-by: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>