1. 12 12月, 2015 1 次提交
  2. 17 10月, 2015 1 次提交
  3. 28 8月, 2015 1 次提交
    • R
      nd_blk: change aperture mapping from WC to WB · 67a3e8fe
      Ross Zwisler 提交于
      This should result in a pretty sizeable performance gain for reads.  For
      rough comparison I did some simple read testing using PMEM to compare
      reads of write combining (WC) mappings vs write-back (WB).  This was
      done on a random lab machine.
      
      PMEM reads from a write combining mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
      	100000+0 records in
      	100000+0 records out
      	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s
      
      PMEM reads from a write-back mapping:
      	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
      	1000000+0 records in
      	1000000+0 records out
      	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s
      
      To be able to safely support a write-back aperture I needed to add
      support for the "read flush" _DSM flag, as outlined in the DSM spec:
      
      http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
      
      This flag tells the ND BLK driver that it needs to flush the cache lines
      associated with the aperture after the aperture is moved but before any
      new data is read.  This ensures that any stale cache lines from the
      previous contents of the aperture will be discarded from the processor
      cache, and the new data will be read properly from the DIMM.  We know
      that the cache lines are clean and will be discarded without any
      writeback because either a) the previous aperture operation was a read,
      and we never modified the contents of the aperture, or b) the previous
      aperture operation was a write and we must have written back the dirtied
      contents of the aperture to the DIMM before the I/O was completed.
      
      In order to add support for the "read flush" flag I needed to add a
      generic routine to invalidate cache lines, mmio_flush_range().  This is
      protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
      only supported on x86.
      Signed-off-by: NRoss Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: NDan Williams <dan.j.williams@intel.com>
      67a3e8fe
  4. 25 8月, 2015 1 次提交
    • R
      lib: scatterlist: add sg splitting function · f8bcbe62
      Robert Jarzmik 提交于
      Sometimes a scatter-gather has to be split into several chunks, or sub
      scatter lists. This happens for example if a scatter list will be
      handled by multiple DMA channels, each one filling a part of it.
      
      A concrete example comes with the media V4L2 API, where the scatter list
      is allocated from userspace to hold an image, regardless of the
      knowledge of how many DMAs will fill it :
       - in a simple RGB565 case, one DMA will pump data from the camera ISP
         to memory
       - in the trickier YUV422 case, 3 DMAs will pump data from the camera
         ISP pipes, one for pipe Y, one for pipe U and one for pipe V
      
      For these cases, it is necessary to split the original scatter list into
      multiple scatter lists, which is the purpose of this patch.
      
      The guarantees that are required for this patch are :
       - the intersection of spans of any couple of resulting scatter lists is
         empty.
       - the union of spans of all resulting scatter lists is a subrange of
         the span of the original scatter list.
       - streaming DMA API operations (mapping, unmapping) should not happen
         both on both the resulting and the original scatter list. It's either
         the first or the later ones.
       - the caller is reponsible to call kfree() on the resulting
         scatterlists.
      Signed-off-by: NRobert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: NJens Axboe <axboe@fb.com>
      f8bcbe62
  5. 21 8月, 2015 1 次提交
  6. 15 8月, 2015 1 次提交
  7. 26 6月, 2015 1 次提交
  8. 11 5月, 2015 1 次提交
    • D
      lib: add software 842 compression/decompression · 2da572c9
      Dan Streetman 提交于
      Add 842-format software compression and decompression functions.
      Update the MAINTAINERS 842 section to include the new files.
      
      The 842 compression function can compress any input data into the 842
      compression format.  The 842 decompression function can decompress any
      standard-format 842 compressed data - specifically, either a compressed
      data buffer created by the 842 software compression function, or a
      compressed data buffer created by the 842 hardware compressor (located
      in PowerPC coprocessors).
      
      The 842 compressed data format is explained in the header comments.
      
      This is used in a later patch to provide a full software 842 compression
      and decompression crypto interface.
      Signed-off-by: NDan Streetman <ddstreet@ieee.org>
      Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
      2da572c9
  9. 17 4月, 2015 1 次提交
  10. 10 3月, 2015 1 次提交
  11. 17 2月, 2015 1 次提交
  12. 07 1月, 2015 1 次提交
  13. 23 12月, 2014 1 次提交
  14. 14 9月, 2014 1 次提交
    • L
      Make ARCH_HAS_FAST_MULTIPLIER a real config variable · 72d93104
      Linus Torvalds 提交于
      It used to be an ad-hoc hack defined by the x86 version of
      <asm/bitops.h> that enabled a couple of library routines to know whether
      an integer multiply is faster than repeated shifts and additions.
      
      This just makes it use the real Kconfig system instead, and makes x86
      (which was the only architecture that did this) select the option.
      
      NOTE! Even for x86, this really is kind of wrong.  If we cared, we would
      probably not enable this for builds optimized for netburst (P4), where
      shifts-and-adds are generally faster than multiplies.  This patch does
      *not* change that kind of logic, though, it is purely a syntactic change
      with no code changes.
      
      This was triggered by the fact that we have other places that really
      want to know "do I want to expand multiples by constants by hand or
      not", particularly the hash generation code.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      72d93104
  15. 09 8月, 2014 1 次提交
  16. 07 8月, 2014 2 次提交
  17. 18 7月, 2014 1 次提交
  18. 05 5月, 2014 1 次提交
    • C
      lib: Export interval_tree · a88cc108
      Chris Wilson 提交于
      lib/interval_tree.c provides a simple interface for an interval-tree
      (an augmented red-black tree) but is only built when testing the generic
      macros for building interval-trees. For drivers with modest needs,
      export the simple interval-tree library as is.
      
      v2: Lots of help from Michel Lespinasse to only compile the code
          as required:
          - make INTERVAL_TREE a config option
          - make INTERVAL_TREE_TEST select the library functions
            and sanitize the filenames & Makefile
          - prepare interval_tree for being built as a module if required
      Signed-off-by: NChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: NMichel Lespinasse <walken@google.com>
      [Acked for inclusion via drm/i915 by Andrew Morton.]
      [danvet: switch to _GPL as per the mailing list discussion.]
      Signed-off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
      a88cc108
  19. 08 4月, 2014 1 次提交
  20. 20 3月, 2014 1 次提交
  21. 30 11月, 2013 1 次提交
  22. 15 11月, 2013 1 次提交
  23. 13 11月, 2013 1 次提交
  24. 12 11月, 2013 1 次提交
    • D
      random32: add test cases for taus113 implementation · a6a9c0f1
      Daniel Borkmann 提交于
      We generated a battery of 100 test cases from GSL taus113 implemention
      and compare the results from a particular seed and a particular
      iteration with our implementation in the kernel. We have verified on
      32 and 64 bit machines that our taus113 kernel implementation gives
      same results as GSL taus113 implementation:
      
        [    0.147370] prandom: seed boundary self test passed
        [    0.148078] prandom: 100 self tests passed
      
      This is a Kconfig option that is disabled on default, just like the
      crc32 init selftests in order to not unnecessary slow down boot process.
      We also refactored out prandom_seed_very_weak() as it's now used in
      multiple places in order to reduce redundant code.
      
      GSL code we used for generating test cases:
      
        int i, j;
        srand(time(NULL));
        for (i = 0; i < 100; ++i) {
          int iteration = 500 + (rand() % 500);
          gsl_rng_default_seed = rand() + 1;
          gsl_rng *r = gsl_rng_alloc(gsl_rng_taus113);
          printf("\t{ %lu, ", gsl_rng_default_seed);
          for (j = 0; j < iteration - 1; ++j)
            gsl_rng_get(r);
          printf("%u, %lu },\n", iteration, gsl_rng_get(r));
          gsl_rng_free(r);
        }
      
      Joint work with Hannes Frederic Sowa.
      
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: NHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      a6a9c0f1
  25. 24 9月, 2013 1 次提交
    • D
      Add a generic associative array implementation. · 3cb98950
      David Howells 提交于
      Add a generic associative array implementation that can be used as the
      container for keyrings, thereby massively increasing the capacity available
      whilst also speeding up searching in keyrings that contain a lot of keys.
      
      This may also be useful in FS-Cache for tracking cookies.
      
      Documentation is added into Documentation/associative_array.txt
      
      Some of the properties of the implementation are:
      
       (1) Objects are opaque pointers.  The implementation does not care where they
           point (if anywhere) or what they point to (if anything).
      
           [!] NOTE: Pointers to objects _must_ be zero in the two least significant
           	       bits.
      
       (2) Objects do not need to contain linkage blocks for use by the array.  This
           permits an object to be located in multiple arrays simultaneously.
           Rather, the array is made up of metadata blocks that point to objects.
      
       (3) Objects are labelled as being one of two types (the type is a bool value).
           This information is stored in the array, but has no consequence to the
           array itself or its algorithms.
      
       (4) Objects require index keys to locate them within the array.
      
       (5) Index keys must be unique.  Inserting an object with the same key as one
           already in the array will replace the old object.
      
       (6) Index keys can be of any length and can be of different lengths.
      
       (7) Index keys should encode the length early on, before any variation due to
           length is seen.
      
       (8) Index keys can include a hash to scatter objects throughout the array.
      
       (9) The array can iterated over.  The objects will not necessarily come out in
           key order.
      
      (10) The array can be iterated whilst it is being modified, provided the RCU
           readlock is being held by the iterator.  Note, however, under these
           circumstances, some objects may be seen more than once.  If this is a
           problem, the iterator should lock against modification.  Objects will not
           be missed, however, unless deleted.
      
      (11) Objects in the array can be looked up by means of their index key.
      
      (12) Objects can be looked up whilst the array is being modified, provided the
           RCU readlock is being held by the thread doing the look up.
      
      The implementation uses a tree of 16-pointer nodes internally that are indexed
      on each level by nibbles from the index key.  To improve memory efficiency,
      shortcuts can be emplaced to skip over what would otherwise be a series of
      single-occupancy nodes.  Further, nodes pack leaf object pointers into spare
      space in the node rather than making an extra branch until as such time an
      object needs to be added to a full node.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      3cb98950
  26. 07 9月, 2013 1 次提交
  27. 03 9月, 2013 1 次提交
    • L
      lockref: implement lockless reference count updates using cmpxchg() · bc08b449
      Linus Torvalds 提交于
      Instead of taking the spinlock, the lockless versions atomically check
      that the lock is not taken, and do the reference count update using a
      cmpxchg() loop.  This is semantically identical to doing the reference
      count update protected by the lock, but avoids the "wait for lock"
      contention that you get when accesses to the reference count are
      contended.
      
      Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
      Even when the lockref reference counts are updated atomically with
      cmpxchg, the fact that they also verify the state of the spinlock means
      that the lockless updates can never happen while somebody else holds the
      spinlock.
      
      So while "lockref_put_or_lock()" looks a lot like just another name for
      "atomic_dec_and_lock()", and both optimize to lockless updates, they are
      fundamentally different: the decrement done by atomic_dec_and_lock() is
      truly independent of any lock (as long as it doesn't decrement to zero),
      so a locked region can still see the count change.
      
      The lockref structure, in contrast, really is a *locked* reference
      count.  If you hold the spinlock, the reference count will be stable and
      you can modify the reference count without using atomics, because even
      the lockless updates will see and respect the state of the lock.
      
      In order to enable the cmpxchg lockless code, the architecture needs to
      do three things:
      
       (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
           in an aligned u64, and have a "cmpxchg()" implementation that works
           on such a u64 data type.
      
       (2) define a helper function to test for a spinlock being unlocked
           ("arch_spin_value_unlocked()")
      
       (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
           Kconfig file.
      
      This enables it for x86-64 (but not 32-bit, we'd need to make sure
      cmpxchg() turns into the proper cmpxchg8b in order to enable it for
      32-bit mode).
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      bc08b449
  28. 24 7月, 2013 1 次提交
  29. 10 7月, 2013 2 次提交
    • C
      lib: add lz4 compressor module · c72ac7a1
      Chanho Min 提交于
      This patchset is for supporting LZ4 compression and the crypto API using
      it.
      
      As shown below, the size of data is a little bit bigger but compressing
      speed is faster under the enabled unaligned memory access.  We can use
      lz4 de/compression through crypto API as well.  Also, It will be useful
      for another potential user of lz4 compression.
      
      lz4 Compression Benchmark:
      Compiler: ARM gcc 4.6.4
      ARMv7, 1 GHz based board
         Kernel: linux 3.4
         Uncompressed data Size: 101 MB
               Compressed Size  compression Speed
         LZO   72.1MB		  32.1MB/s, 33.0MB/s(UA)
         LZ4   75.1MB		  30.4MB/s, 35.9MB/s(UA)
         LZ4HC 59.8MB		   2.4MB/s,  2.5MB/s(UA)
      - UA: Unaligned memory Access support
      - Latest patch set for LZO applied
      
      This patch:
      
      Add support for LZ4 compression in the Linux Kernel.  LZ4 Compression APIs
      for kernel are based on LZ4 implementation by Yann Collet and were changed
      for kernel coding style.
      
      LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
      LZ4 source repository : http://code.google.com/p/lz4/
      svn revision : r90
      
      Two APIs are added:
      
      lz4_compress() support basic lz4 compression whereas lz4hc_compress()
      support high compression or CPU performance get lower but compression
      ratio get higher.  Also, we require the pre-allocated working memory with
      the defined size and destination buffer must be allocated with the size of
      lz4_compressbound.
      
      [akpm@linux-foundation.org: make lz4_compresshcctx() static]
      Signed-off-by: NChanho Min <chanho.min@lge.com>
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Cc: Bob Pearson <rpearson@systemfabricworks.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Herbert Xu <herbert@gondor.hengli.com.au>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Cc: Kyungsik Lee <kyungsik.lee@lge.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c72ac7a1
    • K
      lib: add support for LZ4-compressed kernel · e76e1fdf
      Kyungsik Lee 提交于
      Add support for extracting LZ4-compressed kernel images, as well as
      LZ4-compressed ramdisk images in the kernel boot process.
      Signed-off-by: NKyungsik Lee <kyungsik.lee@lge.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Florian Fainelli <florian@openwrt.org>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e76e1fdf
  30. 28 6月, 2013 1 次提交
    • G
      lib: Move fonts from drivers/video/console/ to lib/fonts/ · ee89bd6b
      Geert Uytterhoeven 提交于
      Several drivers need font support independent of CONFIG_VT, cfr. commit
      9cbce8d7e1dae0744ca4f68d62aa7de18196b6f4, "console/font: Refactor font
      support code selection logic").
      Hence move the fonts and their support logic from drivers/video/console/ to
      its own library directory lib/fonts/.
      This also allows to limit processing of drivers/video/console/Makefile to
      CONFIG_VT=y again.
      
      [Kevin Hilman <khilman@linaro.org>: Update arch/arm/boot/compressed/Makefile]
      Signed-off-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      ee89bd6b
  31. 06 6月, 2013 1 次提交
  32. 20 5月, 2013 1 次提交
  33. 16 4月, 2013 1 次提交
  34. 18 1月, 2013 1 次提交
  35. 18 12月, 2012 1 次提交
  36. 08 10月, 2012 1 次提交
    • D
      X.509: Implement simple static OID registry · a77ad6ea
      David Howells 提交于
      Implement a simple static OID registry that allows the mapping of an encoded
      OID to an enum value for ease of use.
      
      The OID registry index enum appears in the:
      
      	linux/oid_registry.h
      
      header file.  A script generates the registry from lines in the header file
      that look like:
      
      	<sp*>OID_foo,<sp*>/*<sp*>1.2.3.4<sp*>*/
      
      The actual OID is taken to be represented by the numbers with interpolated
      dots in the comment.
      
      All other lines in the header are ignored.
      
      The registry is queries by calling:
      
      	OID look_up_oid(const void *data, size_t datasize);
      
      This returns a number from the registry enum representing the OID if found or
      OID__NR if not.
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      a77ad6ea
  37. 31 7月, 2012 1 次提交
  38. 23 7月, 2012 1 次提交
    • D
      of/lib: Allow scripts/dtc/libfdt to be used from kernel code · ab253839
      David Daney 提交于
      libfdt is part of the device tree support in scripts/dtc/libfdt.  For
      some platforms that use the Device Tree, we want to be able to edit
      the flattened device tree form.
      
      We don't want to burden kernel builds that do not require it, so we
      gate compilation of libfdt files with CONFIG_LIBFDT.  So if it is
      needed, you need to do this in your Kconfig:
      
      	select LIBFDT
      
      And in the Makefile of the code using libfdt something like:
      
      ccflags-y := -I$(src)/../../../scripts/dtc/libfdt
      Signed-off-by: NDavid Daney <david.daney@cavium.com>
      Cc: linux-mips@linux-mips.org
      Cc: devicetree-discuss@lists.ozlabs.org
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: linux-kernel@vger.kernel.org
      Acked-by: NRob Herring <rob.herring@calxeda.com>
      Signed-off-by: NRalf Baechle <ralf@linux-mips.org>
      ab253839