1. 09 May 2018, 5 commits
  2. 22 Mar 2018, 1 commit
  3. 15 Jan 2018, 1 commit
  4. 07 Jan 2018, 1 commit
  5. 18 Nov 2017, 1 commit
  6. 14 Nov 2017, 1 commit
    • kconfig: kill off GENERIC_IO option · 9de8da47
      Authored by Rob Herring
      The GENERIC_IO option is set for every architecture except tile and score
      as those define NO_IOMEM. The option only controls visibility of
      CONFIG_MTD which doesn't appear to be necessary for any reason, so let's
      just remove GENERIC_IO.
      Signed-off-by: Rob Herring <robh@kernel.org>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
      Cc: user-mode-linux-devel@lists.sourceforge.net
      Cc: user-mode-linux-user@lists.sourceforge.net
      Cc: linux-mtd@lists.infradead.org
      Acked-by: Richard Weinberger <richard@nod.at>
      Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: Richard Weinberger <richard@nod.at>
  7. 26 Sep 2017, 1 commit
  8. 09 Sep 2017, 1 commit
  9. 01 Sep 2017, 1 commit
    • libnvdimm, nd_blk: remove mmio_flush_range() · 5deb67f7
      Authored by Robin Murphy
      mmio_flush_range() suffers from a lack of clearly-defined semantics,
      and is somewhat ambiguous to port to other architectures where the
      scope of the writeback implied by "flush" and ordering might matter,
      but MMIO would tend to imply non-cacheable anyway. Per the rationale
      in 67a3e8fe ("nd_blk: change aperture mapping from WC to WB"), the
      only existing use is actually to invalidate clean cache lines for
      ARCH_MEMREMAP_PMEM type mappings *without* writeback. Since the recent
      cleanup of the pmem API, that also now happens to be the exact purpose
      of arch_invalidate_pmem(), which would be a far more well-defined tool
      for the job.
      
      Rather than risk potentially inconsistent implementations of
      mmio_flush_range() for the sake of one callsite, streamline things by
      removing it entirely and instead move the ARCH_MEMREMAP_PMEM related
      definitions up to the libnvdimm level, so they can be shared by NFIT
      as well. This allows NFIT to be enabled for arm64.
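
      For illustration, the change at the lone callsite reduces to swapping one
      call for the other (a hedged sketch; `aperture`, `offset`, and `len` are
      placeholder names, not the exact variables in the NFIT driver):

          /* before: semantics depend on how each arch interprets "flush" */
          mmio_flush_range((void __force *)aperture + offset, len);

          /* after: explicitly invalidate clean cache lines, no writeback */
          arch_invalidate_pmem((void __force *)aperture + offset, len);
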
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  10. 16 Aug 2017, 2 commits
    • lib: Add zstd modules · 73f3d1b4
      Authored by Nick Terrell
      Add zstd compression and decompression kernel modules.
      zstd offers a wide variety of compression speed and quality trade-offs.
      It can compress at speeds approaching lz4, and quality approaching lzma.
      zstd decompresses more than twice as fast as zlib, and decompression
      speed remains roughly the same across all compression levels.
      
      The code was ported from the upstream zstd source repository. The
      `linux/zstd.h` header was modified to match linux kernel style.
      The cross-platform and allocation code was stripped out. Instead zstd
      requires the caller to pass a preallocated workspace. The source files
      were clang-formatted [1] to match the Linux Kernel style as much as
      possible. Otherwise, the code was unmodified. We would like to avoid
      further manual modification of the source code as much as possible, so
      that it will be easier to keep the kernel zstd up to date.
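
      A minimal sketch of the workspace-passing style this implies (hedged;
      assumes the upper-case ZSTD_* names from `linux/zstd.h` as merged, with
      error handling elided):

          /* the caller allocates the workspace; zstd never allocates memory */
          ZSTD_parameters params = ZSTD_getParams(3, src_len, 0);
          size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
          void *wksp = vmalloc(wksp_size);
          ZSTD_CCtx *cctx = ZSTD_initCCtx(wksp, wksp_size);
          size_t ret = ZSTD_compressCCtx(cctx, dst, dst_cap, src, src_len,
                                         params);
          if (ZSTD_isError(ret))
                  /* compression failed */;
          vfree(wksp);
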
      
      I benchmarked zstd compression as a special character device. I ran zstd
      and zlib compression at several levels, as well as performing no
      compression, which measures the time spent copying the data to kernel space.
      Data is passed to the compressor 4096 B at a time. The benchmark file is
      located in the upstream zstd source repository under
      `contrib/linux-kernel/zstd_compress_test.c` [2].
      
      I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
      The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
      16 GB of RAM, and an SSD. I benchmarked using `silesia.tar` [3], which is
      211,988,480 B large. Run the following commands for the benchmark:
      
          sudo modprobe zstd_compress_test
          sudo mknod zstd_compress_test c 245 0
          sudo cp silesia.tar zstd_compress_test
      
      The time is reported by the time of the userland `cp`.
      The MB/s is computed with

          211,988,480 B / time(method)

      which includes the time to copy from userland.
      The Adjusted MB/s is computed with

          211,988,480 B / (time(method) - time(none)).
      
      The memory reported is the amount of memory the compressor requests.
      
      | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
      |----------|----------|----------|-------|---------|----------|----------|
      | none     | 211988480 |  0.100 |     1 | 2119.88 |        - |        - |
      | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
      | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
      | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
      | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
      | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
      | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
      | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
      | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
      | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
      | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
      
      I benchmarked zstd decompression using the same method on the same machine.
      The benchmark file is located in the upstream zstd repo under
      `contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is
      the amount of memory required to decompress data compressed with the given
      compression level. If you know the maximum size of your input, you can
      reduce the memory usage of decompression irrespective of the compression
      level.
      
      | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
      |----------|----------|---------|---------------|-------------|
      | none     |    0.025 | 8479.54 |             - |           - |
      | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
      | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
      | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
      | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
      | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
      | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
      | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
      | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
      | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
      | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |
      
      Tested in userland using the test-suite in the zstd repo under
      `contrib/linux-kernel/test/UserlandTest.cpp` [5] by mocking the kernel
      functions. Fuzz tested using libfuzzer [6] with the fuzz harnesses under
      `contrib/linux-kernel/test/{RoundTripCrash.c,DecompressCrash.c}` [7] [8]
      with ASAN, UBSAN, and MSAN. Additionally, it was tested while testing the
      BtrFS and SquashFS patches coming next.
      
      [1] https://clang.llvm.org/docs/ClangFormat.html
      [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_compress_test.c
      [3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
      [4] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_decompress_test.c
      [5] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/UserlandTest.cpp
      [6] http://llvm.org/docs/LibFuzzer.html
      [7] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/RoundTripCrash.c
      [8] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/DecompressCrash.c
      
      zstd source repository: https://github.com/facebook/zstd
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
    • lib: Add xxhash module · 5d240522
      Authored by Nick Terrell
      Adds xxhash kernel module with xxh32 and xxh64 hashes. xxhash is an
      extremely fast non-cryptographic hash algorithm for checksumming.
      The zstd compression and decompression modules added in the next patch
      require xxhash. I extracted it out from zstd since it is useful on its
      own. I copied the code from the upstream XXHash source repository and
      translated it into kernel style. I ran benchmarks and tests in the kernel
      and tests in userland.
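
      A quick sketch of the one-shot interface (hedged; assumes the xxh32()
      and xxh64() signatures from `include/linux/xxhash.h`, streaming variants
      omitted):

          #include <linux/xxhash.h>

          /* one-shot hashing of an in-memory buffer, seed chosen by caller */
          u32 h32 = xxh32(buf, len, 0 /* seed */);
          u64 h64 = xxh64(buf, len, 0 /* seed */);
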
      
      I benchmarked xxhash as a special character device. I ran it in four modes:
      no-op, xxh32, xxh64, and crc32. The no-op mode simply copies the data to
      kernel space and ignores it. The xxh32, xxh64, and crc32 modes compute
      hashes on the copied data. I also ran it with four different buffer sizes.
      The benchmark file is located in the upstream zstd source repository under
      `contrib/linux-kernel/xxhash_test.c` [1].
      
      I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
      The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
      16 GB of RAM, and an SSD. I benchmarked using the file `filesystem.squashfs`
      from `ubuntu-16.10-desktop-amd64.iso`, which is 1,536,217,088 B large.
      Run the following commands for the benchmark:
      
          modprobe xxhash_test
          mknod xxhash_test c 245 0
          time cp filesystem.squashfs xxhash_test
      
      The time is reported by the time of the userland `cp`.
      The GB/s is computed with

          1,536,217,088 B / time(buffer size, hash)

      which includes the time to copy from userland.
      The Adjusted GB/s is computed with

          1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
      
      | Buffer Size (B) | Hash  | Time (s) | GB/s | Adjusted GB/s |
      |-----------------|-------|----------|------|---------------|
      |            1024 | none  |    0.408 | 3.77 |             - |
      |            1024 | xxh32 |    0.649 | 2.37 |          6.37 |
      |            1024 | xxh64 |    0.542 | 2.83 |         11.46 |
      |            1024 | crc32 |    1.290 | 1.19 |          1.74 |
      |            4096 | none  |    0.380 | 4.04 |             - |
      |            4096 | xxh32 |    0.645 | 2.38 |          5.79 |
      |            4096 | xxh64 |    0.500 | 3.07 |         12.80 |
      |            4096 | crc32 |    1.168 | 1.32 |          1.95 |
      |            8192 | none  |    0.351 | 4.38 |             - |
      |            8192 | xxh32 |    0.614 | 2.50 |          5.84 |
      |            8192 | xxh64 |    0.464 | 3.31 |         13.60 |
      |            8192 | crc32 |    1.163 | 1.32 |          1.89 |
      |           16384 | none  |    0.346 | 4.43 |             - |
      |           16384 | xxh32 |    0.590 | 2.60 |          6.30 |
      |           16384 | xxh64 |    0.466 | 3.30 |         12.80 |
      |           16384 | crc32 |    1.183 | 1.30 |          1.84 |
      
      Tested in userland using the test-suite in the zstd repo under
      `contrib/linux-kernel/test/XXHashUserlandTest.cpp` [2] by mocking the
      kernel functions. A line in each branch of every function in `xxhash.c`
      was commented out to ensure that the test-suite fails. Additionally
      tested while testing zstd and with SMHasher [3].
      
      [1] https://phabricator.intern.facebook.com/P57526246
      [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/XXHashUserlandTest.cpp
      [3] https://github.com/aappleby/smhasher
      
      zstd source repository: https://github.com/facebook/zstd
      XXHash source repository: https://github.com/cyan4973/xxhash
      Signed-off-by: Nick Terrell <terrelln@fb.com>
      Signed-off-by: Chris Mason <clm@fb.com>
  11. 10 Jun 2017, 1 commit
    • x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations · 0aed55af
      Authored by Dan Williams
      The pmem driver has a need to transfer data with a persistent memory
      destination and be able to rely on the fact that the destination writes are not
      cached. It is sufficient for the writes to be flushed to a cpu-store-buffer
      (non-temporal / "movnt" in x86 terms), as we expect userspace to call fsync()
      to ensure data-writes have reached a power-fail-safe zone in the platform. The
      fsync() triggers a REQ_FUA or REQ_FLUSH to the pmem driver which will turn
      around and fence previous writes with an "sfence".
      
      Implement __copy_from_user_inatomic_flushcache, memcpy_page_flushcache, and
      memcpy_flushcache, which guarantee that the destination buffer is not dirty in
      the cpu cache on completion. The new copy_from_iter_flushcache and sub-routines
      will be used to replace the "pmem api" (include/linux/pmem.h +
      arch/x86/include/asm/pmem.h). The availability of copy_from_iter_flushcache()
      and memcpy_flushcache() is gated by the CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
      config symbol; they fall back to copy_from_iter_nocache() and plain memcpy()
      otherwise.
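
      A sketch of that gating (hedged; mirrors the shape of the fallback rather
      than quoting the actual headers):

          #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
          /* the arch (x86) supplies a flushcache-aware implementation */
          void memcpy_flushcache(void *dst, const void *src, size_t cnt);
          #else
          static inline void memcpy_flushcache(void *dst, const void *src,
                                               size_t cnt)
          {
                  /* plain copy: no guarantee the lines leave the cpu cache */
                  memcpy(dst, src, cnt);
          }
          #endif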
      
      This is meant to satisfy the concern from Linus that if a driver wants to do
      something beyond the normal nocache semantics it should be something private to
      that driver [1], and Al's concern that anything uaccess related belongs with
      the rest of the uaccess code [2].
      
      The first consumer of this interface is a new 'copy_from_iter' dax operation so
      that pmem can inject cache maintenance operations without imposing this
      overhead on other dax-capable drivers.
      
      [1]: https://lists.01.org/pipermail/linux-nvdimm/2017-January/008364.html
      [2]: https://lists.01.org/pipermail/linux-nvdimm/2017-April/009942.html
      
      Cc: <x86@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  12. 09 Jun 2017, 1 commit
  13. 27 Feb 2017, 1 commit
  14. 25 Feb 2017, 2 commits
  15. 24 Feb 2017, 1 commit
  16. 23 Feb 2017, 1 commit
  17. 04 Feb 2017, 1 commit
    • lib: Introduce priority array area manager · 44091d29
      Authored by Jiri Pirko
      This introduces an infrastructure for management of linear priority
      areas. Priority order in an array matters, but the order of items inside
      a priority group does not.

      As an initial implementation, the L-sort algorithm is used; it is quite
      trivial. A more advanced algorithm, called P-sort, will be introduced as
      a follow-up. The infrastructure is prepared for other algorithms.
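
      A hedged usage sketch (assumes the lib/parman.c interface as merged; the
      ops callbacks let the manager resize the caller's array and relocate
      items when priorities shift):

          static const struct parman_ops my_ops = {
                  .base_count  = 16,
                  .resize_step = 16,
                  .resize      = my_resize, /* grow/shrink the backing array */
                  .move        = my_move,   /* relocate one span of items    */
                  .algo        = PARMAN_ALGO_TYPE_LSORT,
          };

          struct parman *p = parman_create(&my_ops, priv);
          parman_prio_init(p, &prio, 10);         /* one priority group      */
          err = parman_item_add(p, &prio, &item); /* index assigned in item  */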
      
      Alongside this, a testing module is introduced as well.
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  18. 25 Jan 2017, 2 commits
  19. 27 Dec 2016, 1 commit
  20. 08 Oct 2016, 1 commit
    • atomic64: no need for CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE · 51a02124
      Authored by Vineet Gupta
      This came to light when implementing native 64-bit atomics for ARCv2.
      
      The atomic64 self-test code uses CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
      to check whether atomic64_dec_if_positive() is available.  It seems it
      was needed when not every arch defined it.  However, as of the current
      code the Kconfig option seems needless:
      
       - for CONFIG_GENERIC_ATOMIC64 it is auto-enabled in lib/Kconfig and a
         generic definition of the API is present in lib/atomic64.c
       - arches with native 64-bit atomics select it in arch/*/Kconfig and
         define the API in their headers
      
      So I see no point in keeping the Kconfig option.
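
      For reference, the operation's semantics, as a hedged sketch modeled on
      the generic spinlock-based version in lib/atomic64.c (the real code
      selects a hashed lock per counter):

          long long atomic64_dec_if_positive(atomic64_t *v)
          {
                  unsigned long flags;
                  long long val;

                  raw_spin_lock_irqsave(&lock, flags);
                  val = v->counter - 1;
                  if (val >= 0)           /* only store when result >= 0 */
                          v->counter = val;
                  raw_spin_unlock_irqrestore(&lock, flags);
                  return val;             /* new value; negative means no store */
          }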
      
      Compile tested for:
       - blackfin (CONFIG_GENERIC_ATOMIC64)
       - x86 (!CONFIG_GENERIC_ATOMIC64)
       - ia64
      
      Link: http://lkml.kernel.org/r/1473703083-8625-3-git-send-email-vgupta@synopsys.com
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Ming Lin <ming.l@ssi.samsung.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. 17 Sep 2016, 1 commit
  22. 21 May 2016, 1 commit
  23. 16 Apr 2016, 1 commit
  24. 26 Mar 2016, 1 commit
    • mm, kasan: stackdepot implementation. Enable stackdepot for SLAB · cd11016e
      Authored by Alexander Potapenko
      Implement the stack depot and provide CONFIG_STACKDEPOT.  The stack depot
      allows KASAN to store allocation/deallocation stack traces for memory
      chunks.  The stack traces are stored in a hash table and referenced by
      handles which reside in the kasan_alloc_meta and kasan_free_meta
      structures in the allocated memory chunks.
      
      IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
      duplication.
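
      A hedged sketch of the handle round-trip (assumes the depot_save_stack()
      and depot_fetch_stack() interface built around struct stack_trace):

          unsigned long entries[16];
          struct stack_trace trace = {
                  .max_entries = ARRAY_SIZE(entries),
                  .entries     = entries,
          };
          depot_stack_handle_t handle;

          save_stack_trace(&trace);                      /* capture          */
          handle = depot_save_stack(&trace, GFP_NOWAIT); /* dedupe and store */
          /* later, e.g. when printing a KASAN report: */
          depot_fetch_stack(handle, &trace);             /* resolve handle   */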
      
      Right now stackdepot support is only enabled in the SLAB allocator.  Once
      KASAN features in SLAB are on par with those in SLUB we can switch SLUB
      to stackdepot as well, thus removing the dependency on SLUB stack
      bookkeeping, which wastes a lot of memory.
      
      This patch is based on the "mm: kasan: stack depots" patch originally
      prepared by Dmitry Chernenkov.
      
      Joonsoo has said that he plans to reuse the stackdepot code for the
      mm/page_owner.c debugging facility.
      
      [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
      [aryabinin@virtuozzo.com: comment style fixes]
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <adech.fo@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Konstantin Serebryany <kcc@google.com>
      Cc: Dmitry Chernenkov <dmitryc@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 18 Jan 2016, 1 commit
    • lib: sw842: select crc32 · 5b571677
      Authored by Arnd Bergmann
      The sw842 library code was merged in linux-4.1 and causes a very rare randconfig
      failure when CONFIG_CRC32 is not set:
      
          lib/built-in.o: In function `sw842_compress':
          oid_registry.c:(.text+0x12ddc): undefined reference to `crc32_be'
          lib/built-in.o: In function `sw842_decompress':
          oid_registry.c:(.text+0x137e4): undefined reference to `crc32_be'
      
      This adds an explicit 'select CRC32' statement, similar to what the other
      users of the crc32 code have. In practice, CRC32 is always enabled anyway
      because over 100 other symbols select it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 2da572c9 ("lib: add software 842 compression/decompression")
      Acked-by: Dan Streetman <ddstreet@ieee.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  26. 12 Dec 2015, 1 commit
  27. 08 Dec 2015, 1 commit
  28. 17 Oct 2015, 1 commit
  29. 28 Aug 2015, 1 commit
    • nd_blk: change aperture mapping from WC to WB · 67a3e8fe
      Authored by Ross Zwisler
      This should result in a pretty sizeable performance gain for reads.  For
      rough comparison I did some simple read testing using PMEM to compare
      reads of write combining (WC) mappings vs write-back (WB).  This was
      done on a random lab machine.
      
      PMEM reads from a write combining mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
      	100000+0 records in
      	100000+0 records out
      	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
      
      PMEM reads from a write-back mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
      	1000000+0 records in
      	1000000+0 records out
      	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
      
      To be able to safely support a write-back aperture I needed to add
      support for the "read flush" _DSM flag, as outlined in the DSM spec:
      
      http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      This flag tells the ND BLK driver that it needs to flush the cache lines
      associated with the aperture after the aperture is moved but before any
      new data is read.  This ensures that any stale cache lines from the
      previous contents of the aperture will be discarded from the processor
      cache, and the new data will be read properly from the DIMM.  We know
      that the cache lines are clean and will be discarded without any
      writeback because either a) the previous aperture operation was a read,
      and we never modified the contents of the aperture, or b) the previous
      aperture operation was a write and we must have written back the dirtied
      contents of the aperture to the DIMM before the I/O was completed.
      
      In order to add support for the "read flush" flag I needed to add a
      generic routine to invalidate cache lines, mmio_flush_range().  This is
      protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
      only supported on x86.
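
      A hedged sketch of where the hook sits in the BLK read path (names are
      approximate, not the exact NFIT driver code):

          /* after moving the aperture, before reading the new contents */
          if (rw == READ && (dimm_flags & ND_BLK_READ_FLUSH))
                  mmio_flush_range(aperture + offset, len);
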
      Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
  30. 25 Aug 2015, 1 commit
    • lib: scatterlist: add sg splitting function · f8bcbe62
      Authored by Robert Jarzmik
      Sometimes a scatter-gather list has to be split into several chunks, or
      sub scatter lists. This happens for example if a scatter list will be
      handled by multiple DMA channels, each one filling a part of it.

      A concrete example comes with the media V4L2 API, where the scatter list
      is allocated from userspace to hold an image, regardless of how many
      DMAs will fill it:
       - in a simple RGB565 case, one DMA will pump data from the camera ISP
         to memory
       - in the trickier YUV422 case, 3 DMAs will pump data from the camera
         ISP pipes, one for pipe Y, one for pipe U and one for pipe V
      
      For these cases, it is necessary to split the original scatter list into
      multiple scatter lists, which is the purpose of this patch.
      
      The guarantees that are required for this patch are:
       - the intersection of spans of any couple of resulting scatter lists is
         empty.
       - the union of spans of all resulting scatter lists is a subrange of
         the span of the original scatter list.
       - streaming DMA API operations (mapping, unmapping) should not happen
         on both the resulting and the original scatter lists; it's either
         the former or the latter.
       - the caller is responsible for calling kfree() on the resulting
         scatter lists.
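
      A hedged usage sketch (assumes the sg_split() signature as merged in
      lib/sg_split.c; the output arrays are allocated by the helper, hence the
      kfree() requirement above):

          const size_t split_sizes[] = { y_len, u_len, v_len };
          struct scatterlist *out[3];
          int out_nents[3];
          int i, ret;

          /* split one image sglist into three, one per DMA channel */
          ret = sg_split(orig_sgl, orig_nents, 0 /* skip */, 3, split_sizes,
                         out, out_nents, GFP_KERNEL);
          if (!ret) {
                  for (i = 0; i < 3; i++) {
                          /* ... hand out[i] to the matching DMA channel ... */
                          kfree(out[i]);
                  }
          }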
      Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  31. 21 Aug 2015, 1 commit
  32. 15 Aug 2015, 1 commit
  33. 26 Jun 2015, 1 commit