1. 03 12月, 2015 1 次提交
    • W
      ARM: 8465/1: mm: keep reserved ASIDs in sync with mm after multiple rollovers · 40ee068e
      Will Deacon 提交于
      Under some unusual context-switching patterns, it is possible to end up
      with multiple threads from the same mm running concurrently with
      different ASIDs:
      
      1. CPU x schedules task t with mm p containing ASID a and generation g
         This task doesn't block and the CPU doesn't context switch.
         So:
           * per_cpu(active_asid, x) = {g,a}
           * p->context.id = {g,a}
      
      2. Some other CPU generates an ASID rollover. The global generation is
         now (g + 1). CPU x is still running t, with no context switch and
         so per_cpu(reserved_asid, x) = {g,a}
      
      3. CPU y schedules task t', which shares mm p with t. The generation
         mismatches, so we take the slowpath and hit the reserved ASID from
         CPU x. p is then updated so that p->context.id = {g + 1,a}
      
      4. CPU y schedules some other task u, which has an mm != p.
      
      5. Some other CPU generates *another* CPU rollover. The global
         generation is now (g + 2). CPU x is still running t, with no context
         switch and so per_cpu(reserved_asid, x) = {g,a}.
      
      6. CPU y once again schedules task t', but now *fails* to hit the
         reserved ASID from CPU x because of the generation mismatch. This
         results in a new ASID being allocated, despite the fact that t is
         still running on CPU x with the same mm.
      
      Consequently, TLBIs (e.g. as a result of CoW) will not be synchronised
      between the two threads.
      
      This patch fixes the problem by updating all of the matching reserved
      ASIDs when we hit on the slowpath (i.e. in step 3 above). This keeps
      the reserved ASIDs in-sync with the mm and avoids the problem.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: NTony Thompson <anthony.thompson@arm.com>
      Reviewed-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NWill Deacon <will.deacon@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      40ee068e
  2. 10 11月, 2015 1 次提交
  3. 07 11月, 2015 1 次提交
    • M
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman 提交于
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Acked-by: NJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  4. 06 11月, 2015 1 次提交
    • A
      uaccess: reimplement probe_kernel_address() using probe_kernel_read() · 0ab32b6f
      Andrew Morton 提交于
      probe_kernel_address() is basically the same as the (later added)
      probe_kernel_read().
      
      The return value on EFAULT is a bit different: probe_kernel_address()
      returns number-of-bytes-not-copied whereas probe_kernel_read() returns
      -EFAULT.  All callers have been checked, none cared.
      
      probe_kernel_read() can be overridden by the architecture whereas
      probe_kernel_address() cannot.  parisc, blackfin and um do this, to insert
      additional checking.  Hence this patch possibly fixes obscure bugs,
      although there are only two probe_kernel_address() callsites outside
      arch/.
      
      My first attempt involved removing probe_kernel_address() entirely and
      converting all callsites to use probe_kernel_read() directly, but that got
      tiresome.
      
      This patch shrinks mm/slab_common.o by 218 bytes.  For a single
      probe_kernel_address() callsite.
      
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ab32b6f
  5. 27 10月, 2015 1 次提交
  6. 20 10月, 2015 1 次提交
  7. 03 10月, 2015 3 次提交
  8. 24 9月, 2015 1 次提交
    • R
      ARM: alignment: fix alignment handling for uaccess changes · 274e91b8
      Russell King 提交于
      Jonathan Liu reports that the recent addition of CPU_SW_DOMAIN_PAN
      causes wpa_supplicant to die due to the following kernel oops:
      
      Unhandled fault: page domain fault (0x81b) at 0x001017a2
      pgd = ee1b8000
      [001017a2] *pgd=6ebee831, *pte=6c35475f, *ppte=6c354c7f
      Internal error: : 81b [#1] SMP ARM
      Modules linked in: rt2800usb rt2x00usb rt2800librt2x00lib crc_ccitt mac80211
      CPU: 1 PID: 202 Comm: wpa_supplicant Not tainted 4.3.0-rc2 #1
      Hardware name: Allwinner sun7i (A20) Family
      task: ec872f80 ti: ee364000 task.ti: ee364000
      PC is at do_alignment_ldmstm+0x1d4/0x238
      LR is at 0x0
      pc : [<c001d1d8>]    lr : [<00000000>]    psr: 600c0113
      sp : ee365e18  ip : 00000000  fp : 00000002
      r10: 001017a2  r9 : 00000002  r8 : 001017aa
      r7 : ee365fb0  r6 : e8820018  r5 : 001017a2  r4 : 00000003
      r3 : d49e30e0  r2 : 00000000  r1 : ee365fbc  r0 : 00000000
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none[   34.393106] Control: 10c5387d  Table: 6e1b806a  DAC: 00000051
      Process wpa_supplicant (pid: 202, stack limit = 0xee364210)
      Stack: (0xee365e18 to 0xee366000)
      ...
      [<c001d1d8>] (do_alignment_ldmstm) from [<c001d510>] (do_alignment+0x1f0/0x904)
      [<c001d510>] (do_alignment) from [<c00092a0>] (do_DataAbort+0x38/0xb4)
      [<c00092a0>] (do_DataAbort) from [<c0013d7c>] (__dabt_usr+0x3c/0x40)
      Exception stack(0xee365fb0 to 0xee365ff8)
      5fa0:                                     00000000 56c728c0 001017a2 d49e30e0
      5fc0: 775448d2 597d4e74 00200800 7a9e1625 00802001 00000021 b6deec84 00000100
      5fe0: 08020200 be9f4f20 0c0b0d0a b6d9b3e0 600c0010 ffffffff
      Code: e1a0a005 e1a0000c 1affffe8 e5913000 (e4ea3001)
      ---[ end trace 0acd3882fcfdf9dd ]---
      
      This is caused by the alignment handler not being fixed up for the
      uaccess changes, and userspace issuing an unaligned LDM instruction.
      So, fix the problem by adding the necessary fixups.
      Reported-by: NJonathan Liu <net147@gmail.com>
      Tested-by: NJonathan Liu <net147@gmail.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      274e91b8
  9. 22 9月, 2015 1 次提交
  10. 17 9月, 2015 1 次提交
    • A
      ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition · 90cde558
      Andre Przywara 提交于
      Commit 96231b26: ("ARM: 8419/1: dma-mapping: harmonize definition
      of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use
      dma_addr_t, which makes the compiler barf on assigning this to an
      "int" variable on ARM with LPAE enabled:
      *************
      In file included from /src/linux/include/linux/dma-mapping.h:86:0,
                       from /src/linux/arch/arm/mm/dma-mapping.c:21:
      /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping':
      /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning:
      overflow in implicit constant conversion [-Woverflow]
       #define DMA_ERROR_CODE (~(dma_addr_t)0x0)
                              ^
      /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of
      macro DMA_ERROR_CODE'
        int i, ret = DMA_ERROR_CODE;
                     ^
      *************
      
      Remove the actually unneeded initialization of "ret" in
      __iommu_create_mapping() and move the variable declaration inside the
      for-loop to make the scope of this variable more clear.
      Signed-off-by: NAndre Przywara <andre.przywara@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      90cde558
  11. 11 9月, 2015 1 次提交
    • C
      dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} · 6894258e
      Christoph Hellwig 提交于
      Since 2009 we have a nice asm-generic header implementing lots of DMA API
      functions for architectures using struct dma_map_ops, but unfortunately
      it's still missing a lot of APIs that all architectures still have to
      duplicate.
      
      This series consolidates the remaining functions, although we still need
      arch opt outs for two of them as a few architectures have very
      non-standard implementations.
      
      This patch (of 5):
      
      The coherent DMA allocator works the same over all architectures supporting
      dma_map operations.
      
      This patch consolidates them and converges the minor differences:
      
       - the debug_dma helpers are now called from all architectures, including
         those that were previously missing them
       - dma_alloc_from_coherent and dma_release_from_coherent are now always
         called from the generic alloc/free routines instead of the ops
         dma-mapping-common.h always includes dma-coherent.h to get the defintions
         for them, or the stubs if the architecture doesn't support this feature
       - checks for ->alloc / ->free presence are removed.  There is only one
         magic instead of dma_map_ops without them (mic_dma_ops) and that one
         is x86 only anyway.
      
      Besides that only x86 needs special treatment to replace a default devices
      if none is passed and tweak the gfp_flags.  An optional arch hook is provided
      for that.
      
      [linux@roeck-us.net: fix build]
      [jcmvbkbc@gmail.com: fix xtensa]
      Signed-off-by: NChristoph Hellwig <hch@lst.de>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Signed-off-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NMax Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6894258e
  12. 27 8月, 2015 1 次提交
    • R
      ARM: entry: provide uaccess assembly macro hooks · 2190fed6
      Russell King 提交于
      Provide hooks into the kernel entry and exit paths to permit control
      of userspace visibility to the kernel.  The intended use is:
      
      - on entry to kernel from user, uaccess_disable will be called to
        disable userspace visibility
      - on exit from kernel to user, uaccess_enable will be called to
        enable userspace visibility
      - on entry from a kernel exception, uaccess_save_and_disable will be
        called to save the current userspace visibility setting, and disable
        access
      - on exit from a kernel exception, uaccess_restore will be called to
        restore the userspace visibility as it was before the exception
        occurred.
      
      These hooks allows us to keep userspace visibility disabled for the
      vast majority of the kernel, except for localised regions where we
      want to explicitly access userspace.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      2190fed6
  13. 25 8月, 2015 1 次提交
    • R
      ARM: mm: improve do_ldrd_abort macro · 08446b12
      Russell King 提交于
      Improve the do_ldrd_abort macro code - firstly, it inefficiently checks
      for the LDRD encoding by doing a multi-stage test of various bits.  This
      can be simplified by generating a mask, bitmasking the instruction and
      then comparing the result.
      
      Secondly, we want to be able to test the result rather than branching
      to do_DataAbort, so remove the branch at the end and rename the macro
      to 'teq_ldrd' to reflect it's new usage.  teq_ldrd macro returns 'eq'
      if the instruction was a LDRD.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      08446b12
  14. 21 8月, 2015 1 次提交
  15. 18 8月, 2015 2 次提交
  16. 17 8月, 2015 1 次提交
    • D
      scatterlist: use sg_phys() · db0fa0cb
      Dan Williams 提交于
      Coccinelle cleanup to replace open coded sg to physical address
      translations.  This is in preparation for introducing scatterlists that
      reference __pfn_t.
      
      // sg_phys.cocci: convert usage page_to_phys(sg_page(sg)) to sg_phys(sg)
      // usage: make coccicheck COCCI=sg_phys.cocci MODE=patch
      
      virtual patch
      
      @@
      struct scatterlist *sg;
      @@
      
      - page_to_phys(sg_page(sg)) + sg->offset
      + sg_phys(sg)
      
      @@
      struct scatterlist *sg;
      @@
      
      - page_to_phys(sg_page(sg))
      + sg_phys(sg) & PAGE_MASK
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      db0fa0cb
  17. 04 8月, 2015 1 次提交
  18. 02 8月, 2015 1 次提交
    • R
      ARM: reduce visibility of dmac_* functions · 1234e3fd
      Russell King 提交于
      The dmac_* functions are private to the ARM DMA API implementation, and
      should not be used by drivers.  In order to discourage their use, remove
      their prototypes and macros from asm/*.h.
      
      We have to leave dmac_flush_range() behind as Exynos and MSM IOMMU code
      use these; once these sites are fixed, this can be moved also.
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      1234e3fd
  19. 25 7月, 2015 2 次提交
    • R
      ARM: add soc memory barrier extension · 4e1f8a6f
      Russell King 提交于
      Add an extension to the heavy barrier code to allow a SoC specific
      memory barrier function to be provided.  This is needed for platforms
      where the interconnect has weak ordering, and thus needs assistance
      to ensure that memory writes are properly visible in the correct order
      to other parts of the system.
      Acked-by: NTony Lindgren <tony@atomide.com>
      Acked-by: NRichard Woodruff <r-woodruff2@ti.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      4e1f8a6f
    • R
      ARM: move heavy barrier support out of line · f8130906
      Russell King 提交于
      The existing memory barrier macro causes a significant amount of code
      to be inserted inline at every call site.  For example, in
      gpio_set_irq_type(), we have this for mb():
      
      c0344c08:       f57ff04e        dsb     st
      c0344c0c:       e59f8190        ldr     r8, [pc, #400]  ; c0344da4 <gpio_set_irq_type+0x230>
      c0344c10:       e3590004        cmp     r9, #4
      c0344c14:       e5983014        ldr     r3, [r8, #20]
      c0344c18:       0a000054        beq     c0344d70 <gpio_set_irq_type+0x1fc>
      c0344c1c:       e3530000        cmp     r3, #0
      c0344c20:       0a000004        beq     c0344c38 <gpio_set_irq_type+0xc4>
      c0344c24:       e50b2030        str     r2, [fp, #-48]  ; 0xffffffd0
      c0344c28:       e50bc034        str     ip, [fp, #-52]  ; 0xffffffcc
      c0344c2c:       e12fff33        blx     r3
      c0344c30:       e51bc034        ldr     ip, [fp, #-52]  ; 0xffffffcc
      c0344c34:       e51b2030        ldr     r2, [fp, #-48]  ; 0xffffffd0
      c0344c38:       e5963004        ldr     r3, [r6, #4]
      
      Moving the outer_cache_sync() call out of line reduces the impact of
      the barrier:
      
      c0344968:       f57ff04e        dsb     st
      c034496c:       e35a0004        cmp     sl, #4
      c0344970:       e50b2030        str     r2, [fp, #-48]  ; 0xffffffd0
      c0344974:       0a000044        beq     c0344a8c <gpio_set_irq_type+0x1b8>
      c0344978:       ebf363dd        bl      c001d8f4 <arm_heavy_mb>
      c034497c:       e5953004        ldr     r3, [r5, #4]
      
      This should reduce the cache footprint of this code.  Overall, this
      results in a reduction of around 20K in the kernel size:
      
          text    data      bss      dec     hex filename
      10773970  667392 10369656 21811018 14ccf4a ../build/imx6/vmlinux-old
      10754219  667392 10369656 21791267 14c8223 ../build/imx6/vmlinux-new
      
      Another advantage to this approach is that we can finally resolve the
      issue of SoCs which have their own memory barrier requirements within
      multiplatform kernels (such as OMAP.)  Here, the bus interconnects
      need additional handling to ensure that writes become visible in the
      correct order (eg, between dma_map() operations, writes to DMA
      coherent memory, and MMIO accesses.)
      Acked-by: NTony Lindgren <tony@atomide.com>
      Acked-by: NRichard Woodruff <r-woodruff2@ti.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      f8130906
  20. 17 7月, 2015 2 次提交
  21. 10 7月, 2015 1 次提交
    • G
      ARM: 8395/1: l2c: Add support for the "arm,shared-override" property · eeedcea6
      Geert Uytterhoeven 提交于
      "CoreLink Level 2 Cache Controller L2C-310", p. 2-15, section 2.3.2
      Shareable attribute" states:
      
          "The default behavior of the cache controller with respect to the
           shareable attribute is to transform Normal Memory Non-cacheable
           transactions into:
              - cacheable no allocate for reads
              - write through no write allocate for writes."
      
      Depending on the system architecture, this may cause memory corruption
      in the presence of bus mastering devices (e.g. OHCI). To avoid such
      corruption, the default behavior can be disabled by setting the Shared
      Override bit in the Auxiliary Control register.
      
      Currently the Shared Override bit can be set only using C code:
        - by calling l2x0_init() directly, which is deprecated,
        - by setting/clearing the bit in the machine_desc.l2c_aux_val/mask
          fields, but using values differing from 0/~0 is also deprecated.
      
      Hence add support for an "arm,shared-override" device tree property for
      the l2c device node. By specifying this property, affected systems can
      indicate that non-cacheable transactions must not be transformed.
      Then, it's up to the OS to decide. The current behavior is to set the
      "shared attribute override enable" bit, as there may exist kernel linear
      mappings and cacheable aliases for the DMA buffers, even if CMA is
      enabled.
      
      See also commit 1a8e41cd ("ARM: 6395/1: VExpress: Set bit 22 in
      the PL310 (cache controller) AuxCtlr register"):
      
          "Clearing bit 22 in the PL310 Auxiliary Control register (shared
           attribute override enable) has the side effect of transforming
           Normal Shared Non-cacheable reads into Cacheable no-allocate reads.
      
           Coherent DMA buffers in Linux always have a Cacheable alias via the
           kernel linear mapping and the processor can speculatively load
           cache lines into the PL310 controller. With bit 22 cleared,
           Non-cacheable reads would unexpectedly hit such cache lines leading
           to buffer corruption."
      Signed-off-by: NGeert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: NCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      eeedcea6
  22. 04 7月, 2015 1 次提交
  23. 29 6月, 2015 2 次提交
  24. 25 6月, 2015 1 次提交
    • Z
      mm/hugetlb: reduce arch dependent code about huge_pmd_unshare · e81f2d22
      Zhang Zhen 提交于
      Currently we have many duplicates in definitions of huge_pmd_unshare.  In
      all architectures this function just returns 0 when
      CONFIG_ARCH_WANT_HUGE_PMD_SHARE is N.
      
      This patch puts the default implementation in mm/hugetlb.c and lets these
      architectures use the common code.
      Signed-off-by: NZhang Zhen <zhenzhang.zhang@huawei.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: James Yang <James.Yang@freescale.com>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e81f2d22
  25. 11 6月, 2015 1 次提交
  26. 06 6月, 2015 1 次提交
    • M
      ARM: 8387/1: arm/mm/dma-mapping.c: Add arm_coherent_dma_mmap · 55af8a91
      Mike Looijmans 提交于
      When dma-coherent transfers are enabled, the mmap call must
      not change the pg_prot flags in the vma struct.
      
      Split the arm_dma_mmap into a common and specific parts,
      and add a "arm_coherent_dma_mmap" implementation that does
      not alter the page protection flags.
      
      Tested on a topic-miami board (Zynq) using the ACP port
      to transfer data between FPGA and CPU using the Dyplo
      framework. Without this patch, byte-wise access to mmapped
      coherent DMA memory was about 20x slower because of the
      memory being marked as non-cacheable, and transfer speeds
      would not exceed 240MB/s.
      
      After this patch, the mapped memory is cacheable and the
      transfer speed is again 600MB/s (limited by the FPGA) when
      the data is in the L2 cache, while data integrity is being
      maintained.
      
      The patch has no effect on non-coherent DMA.
      Signed-off-by: NMike Looijmans <mike.looijmans@topic.nl>
      Acked-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>
      55af8a91
  27. 02 6月, 2015 8 次提交