1. 14 6月, 2018 2 次提交
  2. 13 6月, 2018 1 次提交
    • S
      alpha: Remove custom dec_and_lock() implementation · f2ae6794
      Sebastian Andrzej Siewior 提交于
      Alpha provides a custom implementation of dec_and_lock(). The functions
      is split into two parts:
      - atomic_add_unless() + return 0 (fast path in assembly)
      - remaining part including locking (slow path in C)
      
      Comparing the result of the alpha implementation with the generic
      implementation compiled by gcc it looks like the fast path is optimized
      by avoiding a stack frame (and reloading the GP), register store and all
      this. This is only done in the slowpath.
      After marking the slowpath (atomic_dec_and_lock_1()) as "noinline" and
      doing the slowpath in C (the atomic_add_unless(atomic, -1, 1) part) I
      noticed differences in the resulting assembly:
      - the GP is still reloaded
      - atomic_add_unless() adds more memory barriers compared to the custom
        assembly
      - the custom assembly here does "load, sub, beq" while
        atomic_add_unless() does "load, cmpeq, add, bne". This is okay because
        it compares against zero after subtraction while the generic code
        compares against 1 before.
      
      I'm not sure if avoiding the stack frame (and GP reloading) brings a lot
      in terms of performance. Regarding the different barriers, Peter
      Zijlstra says:
      
      |refcount decrement needs to be a RELEASE operation, such that all the
      |load/stores to the object happen before we decrement the refcount.
      |
      |Otherwise things like:
      |
      |        obj->foo = 5;
      |        refcnt_dec(&obj->ref);
      |
      |can be re-ordered, which then allows fun scenarios like:
      |
      |        CPU0                            CPU1
      |
      |        refcnt_dec(&obj->ref);
      |                                        if (dec_and_test(&obj->ref))
      |                                                free(obj);
      |        obj->foo = 5; // oops UaF
      |
      |
      |This means (for alpha) that there should be a memory barrier _before_
      |the decrement, however the dec_and_lock asm thing only has one _after_,
      |which, per the above, is too late.
      |
      |The generic version using add_unless will result in memory barrier
      |before and after (because that is the rule for atomic ops with a return
      |value) which is strictly too many barriers for the refcount story, but
      |who knows what other ordering requirements code has.
      
      Remove the custom alpha implementation of dec_and_lock() and if it is an
      issue (performance wise) then the fast path could still be inlined.
      Signed-off-by: NSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-alpha@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180606115918.GG12198@hirez.programming.kicks-ass.net
      Link: https://lkml.kernel.org/r20180612161621.22645-2-bigeasy@linutronix.de
      f2ae6794
  3. 06 6月, 2018 1 次提交
  4. 19 5月, 2018 1 次提交
  5. 09 5月, 2018 1 次提交
  6. 23 4月, 2018 1 次提交
  7. 12 4月, 2018 2 次提交
  8. 22 3月, 2018 1 次提交
  9. 15 3月, 2018 1 次提交
  10. 07 2月, 2018 1 次提交
  11. 15 1月, 2018 1 次提交
  12. 13 1月, 2018 1 次提交
    • M
      error-injection: Separate error-injection from kprobe · 540adea3
      Masami Hiramatsu 提交于
      Since error-injection framework is not limited to be used
      by kprobes, nor bpf. Other kernel subsystems can use it
      freely for checking safeness of error-injection, e.g.
      livepatch, ftrace etc.
      So this separate error-injection framework from kprobes.
      
      Some differences has been made:
      
      - "kprobe" word is removed from any APIs/structures.
      - BPF_ALLOW_ERROR_INJECTION() is renamed to
        ALLOW_ERROR_INJECTION() since it is not limited for BPF too.
      - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this
        feature. It is automatically enabled if the arch supports
        error injection feature for kprobe or ftrace etc.
      Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: NJosef Bacik <jbacik@fb.com>
      Signed-off-by: NAlexei Starovoitov <ast@kernel.org>
      540adea3
  13. 11 12月, 2017 1 次提交
  14. 18 11月, 2017 2 次提交
  15. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  16. 26 9月, 2017 1 次提交
  17. 09 9月, 2017 1 次提交
  18. 16 8月, 2017 2 次提交
    • N
      lib: Add zstd modules · 73f3d1b4
      Nick Terrell 提交于
      Add zstd compression and decompression kernel modules.
      zstd offers a wide varity of compression speed and quality trade-offs.
      It can compress at speeds approaching lz4, and quality approaching lzma.
      zstd decompressions at speeds more than twice as fast as zlib, and
      decompression speed remains roughly the same across all compression levels.
      
      The code was ported from the upstream zstd source repository. The
      `linux/zstd.h` header was modified to match linux kernel style.
      The cross-platform and allocation code was stripped out. Instead zstd
      requires the caller to pass a preallocated workspace. The source files
      were clang-formatted [1] to match the Linux Kernel style as much as
      possible. Otherwise, the code was unmodified. We would like to avoid
      as much further manual modification to the source code as possible, so it
      will be easier to keep the kernel zstd up to date.
      
      I benchmarked zstd compression as a special character device. I ran zstd
      and zlib compression at several levels, as well as performing no
      compression, which measure the time spent copying the data to kernel space.
      Data is passed to the compresser 4096 B at a time. The benchmark file is
      located in the upstream zstd source repository under
      `contrib/linux-kernel/zstd_compress_test.c` [2].
      
      I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
      The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
      16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
      211,988,480 B large. Run the following commands for the benchmark:
      
          sudo modprobe zstd_compress_test
          sudo mknod zstd_compress_test c 245 0
          sudo cp silesia.tar zstd_compress_test
      
      The time is reported by the time of the userland `cp`.
      The MB/s is computed with
      
          1,536,217,008 B / time(buffer size, hash)
      
      which includes the time to copy from userland.
      The Adjusted MB/s is computed with
      
          1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
      
      The memory reported is the amount of memory the compressor requests.
      
      | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
      |----------|----------|----------|-------|---------|----------|----------|
      | none     | 11988480 |    0.100 |     1 | 2119.88 |        - |        - |
      | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
      | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
      | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
      | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
      | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
      | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
      | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
      | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
      | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
      | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
      
      I benchmarked zstd decompression using the same method on the same machine.
      The benchmark file is located in the upstream zstd repo under
      `contrib/linux-kernel/zstd_decompress_test.c` [4]. The memory reported is
      the amount of memory required to decompress data compressed with the given
      compression level. If you know the maximum size of your input, you can
      reduce the memory usage of decompression irrespective of the compression
      level.
      
      | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
      |----------|----------|---------|---------------|-------------|
      | none     |    0.025 | 8479.54 |             - |           - |
      | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
      | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
      | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
      | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
      | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
      | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
      | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
      | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
      | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
      | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |
      
      Tested in userland using the test-suite in the zstd repo under
      `contrib/linux-kernel/test/UserlandTest.cpp` [5] by mocking the kernel
      functions. Fuzz tested using libfuzzer [6] with the fuzz harnesses under
      `contrib/linux-kernel/test/{RoundTripCrash.c,DecompressCrash.c}` [7] [8]
      with ASAN, UBSAN, and MSAN. Additionaly, it was tested while testing the
      BtrFS and SquashFS patches coming next.
      
      [1] https://clang.llvm.org/docs/ClangFormat.html
      [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_compress_test.c
      [3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
      [4] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/zstd_decompress_test.c
      [5] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/UserlandTest.cpp
      [6] http://llvm.org/docs/LibFuzzer.html
      [7] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/RoundTripCrash.c
      [8] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/DecompressCrash.c
      
      zstd source repository: https://github.com/facebook/zstdSigned-off-by: NNick Terrell <terrelln@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      73f3d1b4
    • N
      lib: Add xxhash module · 5d240522
      Nick Terrell 提交于
      Adds xxhash kernel module with xxh32 and xxh64 hashes. xxhash is an
      extremely fast non-cryptographic hash algorithm for checksumming.
      The zstd compression and decompression modules added in the next patch
      require xxhash. I extracted it out from zstd since it is useful on its
      own. I copied the code from the upstream XXHash source repository and
      translated it into kernel style. I ran benchmarks and tests in the kernel
      and tests in userland.
      
      I benchmarked xxhash as a special character device. I ran in four modes,
      no-op, xxh32, xxh64, and crc32. The no-op mode simply copies the data to
      kernel space and ignores it. The xxh32, xxh64, and crc32 modes compute
      hashes on the copied data. I also ran it with four different buffer sizes.
      The benchmark file is located in the upstream zstd source repository under
      `contrib/linux-kernel/xxhash_test.c` [1].
      
      I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
      The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
      16 GB of RAM, and a SSD. I benchmarked using the file `filesystem.squashfs`
      from `ubuntu-16.10-desktop-amd64.iso`, which is 1,536,217,088 B large.
      Run the following commands for the benchmark:
      
          modprobe xxhash_test
          mknod xxhash_test c 245 0
          time cp filesystem.squashfs xxhash_test
      
      The time is reported by the time of the userland `cp`.
      The GB/s is computed with
      
          1,536,217,008 B / time(buffer size, hash)
      
      which includes the time to copy from userland.
      The Normalized GB/s is computed with
      
          1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
      
      | Buffer Size (B) | Hash  | Time (s) | GB/s | Adjusted GB/s |
      |-----------------|-------|----------|------|---------------|
      |            1024 | none  |    0.408 | 3.77 |             - |
      |            1024 | xxh32 |    0.649 | 2.37 |          6.37 |
      |            1024 | xxh64 |    0.542 | 2.83 |         11.46 |
      |            1024 | crc32 |    1.290 | 1.19 |          1.74 |
      |            4096 | none  |    0.380 | 4.04 |             - |
      |            4096 | xxh32 |    0.645 | 2.38 |          5.79 |
      |            4096 | xxh64 |    0.500 | 3.07 |         12.80 |
      |            4096 | crc32 |    1.168 | 1.32 |          1.95 |
      |            8192 | none  |    0.351 | 4.38 |             - |
      |            8192 | xxh32 |    0.614 | 2.50 |          5.84 |
      |            8192 | xxh64 |    0.464 | 3.31 |         13.60 |
      |            8192 | crc32 |    1.163 | 1.32 |          1.89 |
      |           16384 | none  |    0.346 | 4.43 |             - |
      |           16384 | xxh32 |    0.590 | 2.60 |          6.30 |
      |           16384 | xxh64 |    0.466 | 3.30 |         12.80 |
      |           16384 | crc32 |    1.183 | 1.30 |          1.84 |
      
      Tested in userland using the test-suite in the zstd repo under
      `contrib/linux-kernel/test/XXHashUserlandTest.cpp` [2] by mocking the
      kernel functions. A line in each branch of every function in `xxhash.c`
      was commented out to ensure that the test-suite fails. Additionally
      tested while testing zstd and with SMHasher [3].
      
      [1] https://phabricator.intern.facebook.com/P57526246
      [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/test/XXHashUserlandTest.cpp
      [3] https://github.com/aappleby/smhasher
      
      zstd source repository: https://github.com/facebook/zstd
      XXHash source repository: https://github.com/cyan4973/xxhashSigned-off-by: NNick Terrell <terrelln@fb.com>
      Signed-off-by: NChris Mason <clm@fb.com>
      5d240522
  19. 15 7月, 2017 1 次提交
    • L
      kmod: add test driver to stress test the module loader · d9c6a72d
      Luis R. Rodriguez 提交于
      This adds a new stress test driver for kmod: the kernel module loader.
      The new stress test driver, test_kmod, is only enabled as a module right
      now.  It should be possible to load this as built-in and load tests
      early (refer to the force_init_test module parameter), however since a
      lot of test can get a system out of memory fast we leave this disabled
      for now.
      
      Using a system with 1024 MiB of RAM can *easily* get your kernel OOM
      fast with this test driver.
      
      The test_kmod driver exposes API knobs for us to fine tune simple
      request_module() and get_fs_type() calls.  Since these API calls only
      allow each one parameter a test driver for these is rather simple.
      Other factors that can help out test driver though are the number of
      calls we issue and knowing current limitations of each.  This exposes
      configuration as much as possible through userspace to be able to build
      tests directly from userspace.
      
      Since it allows multiple misc devices its will eventually (once we add a
      knob to let us create new devices at will) also be possible to perform
      more tests in parallel, provided you have enough memory.
      
      We only enable tests we know work as of right now.
      
      Demo screenshots:
      
       # tools/testing/selftests/kmod/kmod.sh
      kmod_test_0001_driver: OK! - loading kmod test
      kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
      kmod_test_0001_fs: OK! - loading kmod test
      kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
      kmod_test_0002_driver: OK! - loading kmod test
      kmod_test_0002_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
      kmod_test_0002_fs: OK! - loading kmod test
      kmod_test_0002_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
      kmod_test_0003: OK! - loading kmod test
      kmod_test_0003: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      kmod_test_0004: OK! - loading kmod test
      kmod_test_0004: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      kmod_test_0005: OK! - loading kmod test
      kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      kmod_test_0006: OK! - loading kmod test
      kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      kmod_test_0005: OK! - loading kmod test
      kmod_test_0005: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      kmod_test_0006: OK! - loading kmod test
      kmod_test_0006: OK! - Return value: 0 (SUCCESS), expected SUCCESS
      XXX: add test restult for 0007
      Test completed
      
      You can also request for specific tests:
      
       # tools/testing/selftests/kmod/kmod.sh -t 0001
      kmod_test_0001_driver: OK! - loading kmod test
      kmod_test_0001_driver: OK! - Return value: 256 (MODULE_NOT_FOUND), expected MODULE_NOT_FOUND
      kmod_test_0001_fs: OK! - loading kmod test
      kmod_test_0001_fs: OK! - Return value: -22 (-EINVAL), expected -EINVAL
      Test completed
      
      Lastly, the current available number of tests:
      
       # tools/testing/selftests/kmod/kmod.sh --help
      Usage: tools/testing/selftests/kmod/kmod.sh [ -t <4-number-digit> ]
      Valid tests: 0001-0009
      
      0001 - Simple test - 1 thread  for empty string
      0002 - Simple test - 1 thread  for modules/filesystems that do not exist
      0003 - Simple test - 1 thread  for get_fs_type() only
      0004 - Simple test - 2 threads for get_fs_type() only
      0005 - multithreaded tests with default setup - request_module() only
      0006 - multithreaded tests with default setup - get_fs_type() only
      0007 - multithreaded tests with default setup test request_module() and get_fs_type()
      0008 - multithreaded - push kmod_concurrent over max_modprobes for request_module()
      0009 - multithreaded - push kmod_concurrent over max_modprobes for get_fs_type()
      
      The following test cases currently fail, as such they are not currently
      enabled by default:
      
       # tools/testing/selftests/kmod/kmod.sh -t 0008
       # tools/testing/selftests/kmod/kmod.sh -t 0009
      
      To be sure to run them as intended please unload both of the modules:
      
        o test_module
        o xfs
      
      And ensure they are not loaded on your system prior to testing them.  If
      you use these paritions for your rootfs you can change the default test
      driver used for get_fs_type() by exporting it into your environment.  For
      example of other test defaults you can override refer to kmod.sh
      allow_user_defaults().
      
      Behind the scenes this is how we fine tune at a test case prior to
      hitting a trigger to run it:
      
      cat /sys/devices/virtual/misc/test_kmod0/config
      echo -n "2" > /sys/devices/virtual/misc/test_kmod0/config_test_case
      echo -n "ext4" > /sys/devices/virtual/misc/test_kmod0/config_test_fs
      echo -n "80" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
      cat /sys/devices/virtual/misc/test_kmod0/config
      echo -n "1" > /sys/devices/virtual/misc/test_kmod0/config_num_threads
      
      Finally to trigger:
      
      echo -n "1" > /sys/devices/virtual/misc/test_kmod0/trigger_config
      
      The kmod.sh script uses the above constructs to build different test cases.
      
      A bit of interpretation of the current failures follows, first two
      premises:
      
      a) When request_module() is used userspace figures out an optimized
         version of module order for us.  Once it finds the modules it needs, as
         per depmod symbol dep map, it will finit_module() the respective
         modules which are needed for the original request_module() request.
      
      b) We have an optimization in place whereby if a kernel uses
         request_module() on a module already loaded we never bother userspace
         as the module already is loaded.  This is all handled by kernel/kmod.c.
      
      A few things to consider to help identify root causes of issues:
      
      0) kmod 19 has a broken heuristic for modules being assumed to be
         built-in to your kernel and will return 0 even though request_module()
         failed.  Upgrade to a newer version of kmod.
      
      1) A get_fs_type() call for "xfs" will request_module() for "fs-xfs",
         not for "xfs".  The optimization in kernel described in b) fails to
         catch if we have a lot of consecutive get_fs_type() calls.  The reason
         is the optimization in place does not look for aliases.  This means two
         consecutive get_fs_type() calls will bump kmod_concurrent, whereas
         request_module() will not.
      
      This one explanation why test case 0009 fails at least once for
      get_fs_type().
      
      2) If a module fails to load --- for whatever reason (kmod_concurrent
         limit reached, file not yet present due to rootfs switch, out of
         memory) we have a period of time during which module request for the
         same name either with request_module() or get_fs_type() will *also*
         fail to load even if the file for the module is ready.
      
      This explains why *multiple* NULLs are possible on test 0009.
      
      3) finit_module() consumes quite a bit of memory.
      
      4) Filesystems typically also have more dependent modules than other
         modules, its important to note though that even though a get_fs_type()
         call does not incur additional kmod_concurrent bumps, since userspace
         loads dependencies it finds it needs via finit_module_fd(), it *will*
         take much more memory to load a module with a lot of dependencies.
      
      Because of 3) and 4) we will easily run into out of memory failures with
      certain tests.  For instance test 0006 fails on qemu with 1024 MiB of RAM.
      It panics a box after reaping all userspace processes and still not
      having enough memory to reap.
      
      [arnd@arndb.de: add dependencies for test module]
        Link: http://lkml.kernel.org/r/20170630154834.3689272-1-arnd@arndb.de
      Link: http://lkml.kernel.org/r/20170628223155.26472-3-mcgrof@kernel.orgSigned-off-by: NLuis R. Rodriguez <mcgrof@kernel.org>
      Cc: Jessica Yu <jeyu@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Michal Marek <mmarek@suse.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d9c6a72d
  20. 13 7月, 2017 1 次提交
  21. 06 7月, 2017 1 次提交
    • J
      lib: add errseq_t type and infrastructure for handling it · 84cbadad
      Jeff Layton 提交于
      An errseq_t is a way of recording errors in one place, and allowing any
      number of "subscribers" to tell whether an error has been set again
      since a previous time.
      
      It's implemented as an unsigned 32-bit value that is managed with atomic
      operations. The low order bits are designated to hold an error code
      (max size of MAX_ERRNO). The upper bits are used as a counter.
      
      The API works with consumers sampling an errseq_t value at a particular
      point in time. Later, that value can be used to tell whether new errors
      have been set since that time.
      
      Note that there is a 1 in 512k risk of collisions here if new errors
      are being recorded frequently, since we have so few bits to use as a
      counter. To mitigate this, one bit is used as a flag to tell whether the
      value has been sampled since a new value was recorded. That allows
      us to avoid bumping the counter if no one has sampled it since it
      was last bumped.
      
      Later patches will build on this infrastructure to change how writeback
      errors are tracked in the kernel.
      Signed-off-by: NJeff Layton <jlayton@redhat.com>
      Reviewed-by: NNeilBrown <neilb@suse.com>
      Reviewed-by: NJan Kara <jack@suse.cz>
      84cbadad
  22. 09 6月, 2017 2 次提交
  23. 09 5月, 2017 1 次提交
  24. 27 4月, 2017 1 次提交
  25. 29 3月, 2017 1 次提交
  26. 24 3月, 2017 1 次提交
  27. 25 2月, 2017 3 次提交
  28. 24 2月, 2017 1 次提交
  29. 14 2月, 2017 2 次提交
  30. 04 2月, 2017 1 次提交
    • J
      lib: Introduce priority array area manager · 44091d29
      Jiri Pirko 提交于
      This introduces a infrastructure for management of linear priority
      areas. Priority order in an array matters, however order of items inside
      a priority group does not matter.
      
      As an initial implementation, L-sort algorithm is used. It is quite
      trivial. More advanced algorithm called P-sort will be introduced as a
      follow-up. The infrastructure is prepared for other algos.
      
      Alongside this, a testing module is introduced as well.
      Signed-off-by: NJiri Pirko <jiri@mellanox.com>
      Signed-off-by: NDavid S. Miller <davem@davemloft.net>
      44091d29
  31. 03 2月, 2017 1 次提交
    • J
      ext4: move halfmd4 into hash.c directly · 1c83a9aa
      Jason A. Donenfeld 提交于
      The "half md4" transform should not be used by any new code. And
      fortunately, it's only used now by ext4. Since ext4 supports several
      hashing methods, at some point it might be desirable to move to
      something like SipHash. As an intermediate step, remove half md4 from
      cryptohash.h and lib, and make it just a local function in ext4's
      hash.c. There's precedent for doing this; the other function ext can use
      for its hashes -- TEA -- is also implemented in the same place. Also, by
      being a local function, this might allow gcc to perform some additional
      optimizations.
      Signed-off-by: NJason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: NAndreas Dilger <adilger@dilger.ca>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: NTheodore Ts'o <tytso@mit.edu>
      1c83a9aa
  32. 25 1月, 2017 1 次提交