1. 12 4月, 2018 1 次提交
    • A
      proc: add seq_put_decimal_ull_width to speed up /proc/pid/smaps · d1be35cb
      Andrei Vagin 提交于
      seq_put_decimal_ull_w(m, str, val, width) prints a decimal number with a
      specified minimal field width.
      
      It is equivalent of seq_printf(m, "%s%*d", str, width, val), but it
      works much faster.
      
      == test_smaps.py
        num = 0
        with open("/proc/1/smaps") as f:
                for x in xrange(10000):
                        data = f.read()
                        f.seek(0, 0)
      ==
      
      == Before patch ==
        $ time python test_smaps.py
        real    0m4.593s
        user    0m0.398s
        sys     0m4.158s
      
      == After patch ==
        $ time python test_smaps.py
        real    0m3.828s
        user    0m0.413s
        sys     0m3.408s
      
      $ perf -g record python test_smaps.py
      == Before patch ==
      -   79.01%     3.36%  python   [kernel.kallsyms]    [k] show_smap.isra.33
         - 75.65% show_smap.isra.33
            + 48.85% seq_printf
            + 15.75% __walk_page_range
            + 9.70% show_map_vma.isra.23
              0.61% seq_puts
      
      == After patch ==
      -   75.51%     4.62%  python   [kernel.kallsyms]    [k] show_smap.isra.33
         - 70.88% show_smap.isra.33
            + 24.82% seq_put_decimal_ull_w
            + 19.78% __walk_page_range
            + 12.74% seq_printf
            + 11.08% show_map_vma.isra.23
            + 1.68% seq_puts
      
      [akpm@linux-foundation.org: fix drivers/of/unittest.c build]
      Link: http://lkml.kernel.org/r/20180212074931.7227-1-avagin@openvz.orgSigned-off-by: NAndrei Vagin <avagin@openvz.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      d1be35cb
  2. 10 4月, 2018 1 次提交
    • L
      Fix subtle macro variable shadowing in min_not_zero() · e9092d0d
      Linus Torvalds 提交于
      Commit 3c8ba0d6 ("kernel.h: Retain constant expression output for
      max()/min()") rewrote our min/max macros to be very clever, but in the
      meantime resurrected a variable name shadow issue that we had had
      previously fixed in commit 589a9785 ("min/max: remove sparse
      warnings when they're nested").
      
      That commit talks about the sparse warnings that this shadowing causes,
      which we ignored as just a minor annoyance.  But it turns out that the
      sparse warning is the least of our problems.  We actually have a real
      bug due to the shadowing through the interaction with "min_not_zero()",
      which ends up doing
      
         min(__x, __y)
      
      internally, and then the new declaration of "__x" and "__y" as new
      variables in __cmp_once() results in a complete mess of an expression,
      and "min_not_zero()" doesn't work at all.
      
      For some odd reason, this only ever caused (reported) problems on s390,
      even though it is a generic issue and most of the (obviously successful)
      testing of the problematic commit had happened on other architectures.
      
      Quoting Sebastian Ott:
       "What happened is that the bio build by the partition detection code
        was attempted to be split by the block layer because the block queue
        had a max_sector setting of 0. blk_queue_max_hw_sectors uses
        min_not_zero."
      
      So re-introduce the use of __UNIQUE_ID() to make sure that the min/max
      macros do not have these kinds of clashes.
      
      [ That said, __UNIQUE_ID() itself has several issues that make it less
        than wonderful.
      
        In particular, the "uniqueness" has a fallback on the line number,
        which means that it's not actually unique in more complex cases if you
        don't build with gcc or clang (which have working unique counters that
        aren't tied to line numbers).
      
        That historical broken fallback also means that we have that pointless
        "prefix" argument that doesn't actually make much sense _except_ for
        the known-broken case. Oh well. ]
      
      Fixes: 3c8ba0d6 ("kernel.h: Retain constant expression output for max()/min()")
      Reported-and-tested-by: NSebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e9092d0d
  3. 06 4月, 2018 1 次提交
    • K
      kernel.h: Retain constant expression output for max()/min() · 3c8ba0d6
      Kees Cook 提交于
      In the effort to remove all VLAs from the kernel[1], it is desirable to
      build with -Wvla.  However, this warning is overly pessimistic, in that
      it is only happy with stack array sizes that are declared as constant
      expressions, and not constant values.  One case of this is the
      evaluation of the max() macro which, due to its construction, ends up
      converting constant expression arguments into a constant value result.
      
      All attempts to rewrite this macro with __builtin_constant_p() failed
      with older compilers (e.g.  gcc 4.4)[2].  However, Martin Uecker,
      constructed[3] a mind-shattering solution that works everywhere.
      Cthulhu fhtagn!
      
      This patch updates the min()/max() macros to evaluate to a constant
      expression when called on constant expression arguments.  This removes
      several false-positive stack VLA warnings from an x86 allmodconfig build
      when -Wvla is added:
      
        $ diff -u before.txt after.txt | grep ^-
        -drivers/input/touchscreen/cyttsp4_core.c:871:2: warning: ISO C90 forbids variable length array ‘ids’ [-Wvla]
        -fs/btrfs/tree-checker.c:344:4: warning: ISO C90 forbids variable length array ‘namebuf’ [-Wvla]
        -lib/vsprintf.c:747:2: warning: ISO C90 forbids variable length array ‘sym’ [-Wvla]
        -net/ipv4/proc.c:403:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
        -net/ipv6/proc.c:198:2: warning: ISO C90 forbids variable length array ‘buff’ [-Wvla]
        -net/ipv6/proc.c:218:2: warning: ISO C90 forbids variable length array ‘buff64’ [-Wvla]
      
      This also updates two cases where different enums were being compared
      and explicitly casts them to int (which matches the old side-effect of
      the single-evaluation code): one in tpm/tpm_tis_core.h, and one in
      drm/drm_color_mgmt.c.
      
       [1] https://lkml.org/lkml/2018/3/7/621
       [2] https://lkml.org/lkml/2018/3/10/170
       [3] https://lkml.org/lkml/2018/3/20/845Co-Developed-by: NLinus Torvalds <torvalds@linux-foundation.org>
      Co-Developed-by: NMartin Uecker <Martin.Uecker@med.uni-goettingen.de>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Acked-by: NIngo Molnar <mingo@kernel.org>
      Acked-by: NMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      3c8ba0d6
  4. 29 3月, 2018 1 次提交
  5. 21 2月, 2018 1 次提交
  6. 04 2月, 2018 1 次提交
  7. 18 11月, 2017 1 次提交
    • B
      kernel/panic.c: add TAINT_AUX · 4efb442c
      Borislav Petkov 提交于
      This is the gist of a patch which we've been forward-porting in our
      kernels for a long time now and it probably would make a good sense to
      have such TAINT_AUX flag upstream which can be used by each distro etc,
      how they see fit.  This way, we won't need to forward-port a distro-only
      version indefinitely.
      
      Add an auxiliary taint flag to be used by distros and others.  This
      obviates the need to forward-port whatever internal solutions people
      have in favor of a single flag which they can map arbitrarily to a
      definition of their pleasing.
      
      The "X" mnemonic could also mean eXternal, which would be taint from a
      distro or something else but not the upstream kernel.  We will use it to
      mark modules for which we don't provide support.  I.e., a really
      eXternal module.
      
      Link: http://lkml.kernel.org/r/20170911134533.dp5mtyku5bongx4c@pd.tnicSigned-off-by: NBorislav Petkov <bp@suse.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Jessica Yu <jeyu@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      4efb442c
  8. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  9. 14 10月, 2017 1 次提交
  10. 09 9月, 2017 1 次提交
  11. 17 8月, 2017 1 次提交
    • K
      locking/refcounts, x86/asm: Implement fast refcount overflow protection · 7a46ec0e
      Kees Cook 提交于
      This implements refcount_t overflow protection on x86 without a noticeable
      performance impact, though without the fuller checking of REFCOUNT_FULL.
      
      This is done by duplicating the existing atomic_t refcount implementation
      but with normally a single instruction added to detect if the refcount
      has gone negative (e.g. wrapped past INT_MAX or below zero). When detected,
      the handler saturates the refcount_t to INT_MIN / 2. With this overflow
      protection, the erroneous reference release that would follow a wrap back
      to zero is blocked from happening, avoiding the class of refcount-overflow
      use-after-free vulnerabilities entirely.
      
      Only the overflow case of refcounting can be perfectly protected, since
      it can be detected and stopped before the reference is freed and left to
      be abused by an attacker. There isn't a way to block early decrements,
      and while REFCOUNT_FULL stops increment-from-zero cases (which would
      be the state _after_ an early decrement and stops potential double-free
      conditions), this fast implementation does not, since it would require
      the more expensive cmpxchg loops. Since the overflow case is much more
      common (e.g. missing a "put" during an error path), this protection
      provides real-world protection. For example, the two public refcount
      overflow use-after-free exploits published in 2016 would have been
      rendered unexploitable:
      
        http://perception-point.io/2016/01/14/analysis-and-exploitation-of-a-linux-kernel-vulnerability-cve-2016-0728/
      
        http://cyseclabs.com/page?n=02012016
      
      This implementation does, however, notice an unchecked decrement to zero
      (i.e. caller used refcount_dec() instead of refcount_dec_and_test() and it
      resulted in a zero). Decrements under zero are noticed (since they will
      have resulted in a negative value), though this only indicates that a
      use-after-free may have already happened. Such notifications are likely
      avoidable by an attacker that has already exploited a use-after-free
      vulnerability, but it's better to have them reported than allow such
      conditions to remain universally silent.
      
      On first overflow detection, the refcount value is reset to INT_MIN / 2
      (which serves as a saturation value) and a report and stack trace are
      produced. When operations detect only negative value results (such as
      changing an already saturated value), saturation still happens but no
      notification is performed (since the value was already saturated).
      
      On the matter of races, since the entire range beyond INT_MAX but before
      0 is negative, every operation at INT_MIN / 2 will trap, leaving no
      overflow-only race condition.
      
      As for performance, this implementation adds a single "js" instruction
      to the regular execution flow of a copy of the standard atomic_t refcount
      operations. (The non-"and_test" refcount_dec() function, which is uncommon
      in regular refcount design patterns, has an additional "jz" instruction
      to detect reaching exactly zero.) Since this is a forward jump, it is by
      default the non-predicted path, which will be reinforced by dynamic branch
      prediction. The result is this protection having virtually no measurable
      change in performance over standard atomic_t operations. The error path,
      located in .text.unlikely, saves the refcount location and then uses UD0
      to fire a refcount exception handler, which resets the refcount, handles
      reporting, and returns to regular execution. This keeps the changes to
      .text size minimal, avoiding return jumps and open-coded calls to the
      error reporting routine.
      
      Example assembly comparison:
      
      refcount_inc() before:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
      
      refcount_inc() after:
      
        .text:
        ffffffff81546149:       f0 ff 45 f4             lock incl -0xc(%rbp)
        ffffffff8154614d:       0f 88 80 d5 17 00       js     ffffffff816c36d3
        ...
        .text.unlikely:
        ffffffff816c36d3:       48 8d 4d f4             lea    -0xc(%rbp),%rcx
        ffffffff816c36d7:       0f ff                   (bad)
      
      These are the cycle counts comparing a loop of refcount_inc() from 1
      to INT_MAX and back down to 0 (via refcount_dec_and_test()), between
      unprotected refcount_t (atomic_t), fully protected REFCOUNT_FULL
      (refcount_t-full), and this overflow-protected refcount (refcount_t-fast):
      
        2147483646 refcount_inc()s and 2147483647 refcount_dec_and_test()s:
      		    cycles		protections
        atomic_t           82249267387	none
        refcount_t-fast    82211446892	overflow, untested dec-to-zero
        refcount_t-full   144814735193	overflow, untested dec-to-zero, inc-from-zero
      
      This code is a modified version of the x86 PAX_REFCOUNT atomic_t
      overflow defense from the last public patch of PaX/grsecurity, based
      on my understanding of the code. Changes or omissions from the original
      code are mine and don't reflect the original grsecurity/PaX code. Thanks
      to PaX Team for various suggestions for improvement for repurposing this
      code to be a refcount-only protection.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Reviewed-by: NJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: Hans Liljestrand <ishkamiel@gmail.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arozansk@redhat.com
      Cc: axboe@kernel.dk
      Cc: kernel-hardening@lists.openwall.com
      Cc: linux-arch <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170815161924.GA133115@beastSigned-off-by: NIngo Molnar <mingo@kernel.org>
      7a46ec0e
  12. 13 7月, 2017 1 次提交
    • I
      kernel.h: handle pointers to arrays better in container_of() · c7acec71
      Ian Abbott 提交于
      If the first parameter of container_of() is a pointer to a
      non-const-qualified array type (and the third parameter names a
      non-const-qualified array member), the local variable __mptr will be
      defined with a const-qualified array type.  In ISO C, these types are
      incompatible.  They work as expected in GNU C, but some versions will
      issue warnings.  For example, GCC 4.9 produces the warning
      "initialization from incompatible pointer type".
      
      Here is an example of where the problem occurs:
      
      -------------------------------------------------------
         #include <linux/kernel.h>
         #include <linux/module.h>
      
        MODULE_LICENSE("GPL");
      
        struct st {
        	int a;
        	char b[16];
        };
      
        static int __init example_init(void) {
        	struct st t = { .a = 101, .b = "hello" };
        	char (*p)[16] = &t.b;
        	struct st *x = container_of(p, struct st, b);
        	printk(KERN_DEBUG "%p %p\n", (void *)&t, (void *)x);
        	return 0;
        }
      
        static void __exit example_exit(void) {
        }
      
        module_init(example_init);
        module_exit(example_exit);
      -------------------------------------------------------
      
      Building the module with gcc-4.9 results in these warnings (where '{m}'
      is the module source and '{k}' is the kernel source):
      
      -------------------------------------------------------
        In file included from {m}/example.c:1:0:
        {m}/example.c: In function `example_init':
        {k}/include/linux/kernel.h:854:48: warning: initialization from incompatible pointer type
          const typeof( ((type *)0)->member ) *__mptr = (ptr); \
                                                        ^
        {m}/example.c:14:17: note: in expansion of macro `container_of'
          struct st *x = container_of(p, struct st, b);
                         ^
        {k}/include/linux/kernel.h:854:48: warning: (near initialization for `x')
          const typeof( ((type *)0)->member ) *__mptr = (ptr); \
                                                        ^
        {m}/example.c:14:17: note: in expansion of macro `container_of'
          struct st *x = container_of(p, struct st, b);
                         ^
      -------------------------------------------------------
      
      Replace the type checking performed by the macro to avoid these
      warnings.  Make sure `*(ptr)` either has type compatible with the
      member, or has type compatible with `void`, ignoring qualifiers.  Raise
      compiler errors if this is not true.  This is stronger than the previous
      behaviour, which only resulted in compiler warnings for a type mismatch.
      
      [arnd@arndb.de: fix new warnings for container_of()]
        Link: http://lkml.kernel.org/r/20170620200940.90557-1-arnd@arndb.de
      Link: http://lkml.kernel.org/r/20170525120316.24473-7-abbotti@mev.co.ukSigned-off-by: NIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Acked-by: NMichal Nazarewicz <mina86@mina86.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Alexander Potapenko <glider@google.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c7acec71
  13. 23 5月, 2017 1 次提交
  14. 21 4月, 2017 1 次提交
  15. 18 4月, 2017 1 次提交
    • B
      boot/param: Move next_arg() function to lib/cmdline.c for later reuse · f51b17c8
      Baoquan He 提交于
      next_arg() will be used to parse boot parameters in the x86/boot/compressed code,
      so move it to lib/cmdline.c for better code reuse.
      
      No change in functionality.
      Signed-off-by: NBaoquan He <bhe@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Jessica Yu <jeyu@redhat.com>
      Cc: Johannes Berg <johannes.berg@intel.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dan.j.williams@intel.com
      Cc: dave.jiang@intel.com
      Cc: dyoung@redhat.com
      Cc: keescook@chromium.org
      Cc: zijun_hu <zijun_hu@htc.com>
      Link: http://lkml.kernel.org/r/1492436099-4017-2-git-send-email-bhe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      f51b17c8
  16. 25 2月, 2017 1 次提交
  17. 18 1月, 2017 1 次提交
  18. 27 11月, 2016 1 次提交
    • P
      taint/module: Clean up global and module taint flags handling · 7fd8329b
      Petr Mladek 提交于
      The commit 66cc69e3 ("Fix: module signature vs tracepoints:
      add new TAINT_UNSIGNED_MODULE") updated module_taint_flags() to
      potentially print one more character. But it did not increase the
      size of the corresponding buffers in m_show() and print_modules().
      
      We have recently done the same mistake when adding a taint flag
      for livepatching, see
      https://lkml.kernel.org/r/cfba2c823bb984690b73572aaae1db596b54a082.1472137475.git.jpoimboe@redhat.com
      
      Also struct module uses an incompatible type for mod-taints flags.
      It survived from the commit 2bc2d61a ("[PATCH] list module
      taint flags in Oops/panic"). There was used "int" for the global taint
      flags at these times. But only the global tain flags was later changed
      to "unsigned long" by the commit 25ddbb18 ("Make the taint
      flags reliable").
      
      This patch defines TAINT_FLAGS_COUNT that can be used to create
      arrays and buffers of the right size. Note that we could not use
      enum because the taint flag indexes are used also in assembly code.
      
      Then it reworks the table that describes the taint flags. The TAINT_*
      numbers can be used as the index. Instead, we add information
      if the taint flag is also shown per-module.
      
      Finally, it uses "unsigned long", bit operations, and the updated
      taint_flags table also for mod->taints.
      
      It is not optimal because only few taint flags can be printed by
      module_taint_flags(). But better be on the safe side. IMHO, it is
      not worth the optimization and this is a good compromise.
      Signed-off-by: NPetr Mladek <pmladek@suse.com>
      Link: http://lkml.kernel.org/r/1474458442-21581-1-git-send-email-pmladek@suse.com
      [jeyu@redhat.com: fix broken lkml link in changelog]
      Signed-off-by: NJessica Yu <jeyu@redhat.com>
      7fd8329b
  19. 01 11月, 2016 1 次提交
  20. 20 10月, 2016 1 次提交
    • Z
      percpu: ensure the requested alignment is power of two · 3ca45a46
      zijun_hu 提交于
      The percpu allocator expectedly assumes that the requested alignment
      is power of two but hasn't been veryfing the input.  If the specified
      alignment isn't power of two, the allocator can malfunction.  Add the
      sanity check.
      
      The following is detailed analysis of the effects of alignments which
      aren't power of two.
      
       The alignment must be a even at least since the LSB of a chunk->map
       element is used as free/in-use flag of a area; besides, the alignment
       must be a power of 2 too since ALIGN() doesn't work well for other
       alignment always but is adopted by pcpu_fit_in_area().  IOW, the
       current allocator only works well for a power of 2 aligned area
       allocation.
      
       See below opposite example for why an odd alignment doesn't work.
       Let's assume area [16, 36) is free but its previous one is in-use, we
       want to allocate a @size == 8 and @align == 7 area.  The larger area
       [16, 36) is split to three areas [16, 21), [21, 29), [29, 36)
       eventually.  However, due to the usage for a chunk->map element, the
       actual offset of the aim area [21, 29) is 21 but is recorded in
       relevant element as 20; moreover, the residual tail free area [29,
       36) is mistook as in-use and is lost silently
      
       Unlike macro roundup(), ALIGN(x, a) doesn't work if @a isn't a power
       of 2 for example, roundup(10, 6) == 12 but ALIGN(10, 6) == 10, and
       the latter result isn't desired obviously.
      
      tj: Code style and patch description updates.
      Signed-off-by: Nzijun_hu <zijun_hu@htc.com>
      Suggested-by: NTejun Heo <tj@kernel.org>
      Signed-off-by: NTejun Heo <tj@kernel.org>
      3ca45a46
  21. 08 10月, 2016 1 次提交
  22. 22 9月, 2016 1 次提交
  23. 03 8月, 2016 1 次提交
  24. 16 6月, 2016 1 次提交
    • D
      rcu: sysctl: Panic on RCU Stall · 088e9d25
      Daniel Bristot de Oliveira 提交于
      It is not always easy to determine the cause of an RCU stall just by
      analysing the RCU stall messages, mainly when the problem is caused
      by the indirect starvation of rcu threads. For example, when preempt_rcu
      is not awakened due to the starvation of a timer softirq.
      
      We have been hard coding panic() in the RCU stall functions for
      some time while testing the kernel-rt. But this is not possible in
      some scenarios, like when supporting customers.
      
      This patch implements the sysctl kernel.panic_on_rcu_stall. If
      set to 1, the system will panic() when an RCU stall takes place,
      enabling the capture of a vmcore. The vmcore provides a way to analyze
      all kernel/tasks states, helping out to point to the culprit and the
      solution for the stall.
      
      The kernel.panic_on_rcu_stall sysctl is disabled by default.
      
      Changes from v1:
      - Fixed a typo in the git log
      - The if(sysctl_panic_on_rcu_stall) panic() is in a static function
      - Fixed the CONFIG_TINY_RCU compilation issue
      - The var sysctl_panic_on_rcu_stall is now __read_mostly
      
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
      Reviewed-by: NArnaldo Carvalho de Melo <acme@kernel.org>
      Tested-by: N"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      088e9d25
  25. 20 5月, 2016 1 次提交
    • R
      include/linux: apply __malloc attribute · 48a27055
      Rasmus Villemoes 提交于
      Attach the malloc attribute to a few allocation functions.  This helps
      gcc generate better code by telling it that the return value doesn't
      alias any existing pointers (which is even more valuable given the
      pessimizations implied by -fno-strict-aliasing).
      
      A simple example of what this allows gcc to do can be seen by looking at
      the last part of drm_atomic_helper_plane_reset:
      
      	plane->state = kzalloc(sizeof(*plane->state), GFP_KERNEL);
      
      	if (plane->state) {
      		plane->state->plane = plane;
      		plane->state->rotation = BIT(DRM_ROTATE_0);
      	}
      
      which compiles to
      
          e8 99 bf d6 ff          callq  ffffffff8116d540 <kmem_cache_alloc_trace>
          48 85 c0                test   %rax,%rax
          48 89 83 40 02 00 00    mov    %rax,0x240(%rbx)
          74 11                   je     ffffffff814015c4 <drm_atomic_helper_plane_reset+0x64>
          48 89 18                mov    %rbx,(%rax)
          48 8b 83 40 02 00 00    mov    0x240(%rbx),%rax [*]
          c7 40 40 01 00 00 00    movl   $0x1,0x40(%rax)
      
      With this patch applied, the instruction at [*] is elided, since the
      store to plane->state->plane is known to not alter the value of
      plane->state.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      48a27055
  26. 30 4月, 2016 1 次提交
  27. 23 3月, 2016 2 次提交
    • H
      panic: change nmi_panic from macro to function · ebc41f20
      Hidehiro Kawai 提交于
      Commit 1717f209 ("panic, x86: Fix re-entrance problem due to panic
      on NMI") and commit 58c5661f ("panic, x86: Allow CPUs to save
      registers even if looping in NMI context") introduced nmi_panic() which
      prevents concurrent/recursive execution of panic().  It also saves
      registers for the crash dump on x86.
      
      However, there are some cases where NMI handlers still use panic().
      This patch set partially replaces them with nmi_panic() in those cases.
      
      Even this patchset is applied, some NMI or similar handlers (e.g.  MCE
      handler) continue to use panic().  This is because I can't test them
      well and actual problems won't happen.  For example, the possibility
      that normal panic and panic on MCE happen simultaneously is very low.
      
      This patch (of 3):
      
      Convert nmi_panic() to a proper function and export it instead of
      exporting internal implementation details to modules, for obvious
      reasons.
      Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: NBorislav Petkov <bp@suse.de>
      Acked-by: NMichal Nazarewicz <mina86@mina86.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ebc41f20
    • S
      tracing: Fix trace_printk() to print when not using bprintk() · 3debb0a9
      Steven Rostedt (Red Hat) 提交于
      The trace_printk() code will allocate extra buffers if the compile detects
      that a trace_printk() is used. To do this, the format of the trace_printk()
      is saved to the __trace_printk_fmt section, and if that section is bigger
      than zero, the buffers are allocated (along with a message that this has
      happened).
      
      If trace_printk() uses a format that is not a constant, and thus something
      not guaranteed to be around when the print happens, the compiler optimizes
      the fmt out, as it is not used, and the __trace_printk_fmt section is not
      filled. This means the kernel will not allocate the special buffers needed
      for the trace_printk() and the trace_printk() will not write anything to the
      tracing buffer.
      
      Adding a "__used" to the variable in the __trace_printk_fmt section will
      keep it around, even though it is set to NULL. This will keep the string
      from being printed in the debugfs/tracing/printk_formats section as it is
      not needed.
      Reported-by: NVlastimil Babka <vbabka@suse.cz>
      Fixes: 07d777fe "tracing: Add percpu buffers for trace_printk()"
      Cc: stable@vger.kernel.org # v3.5+
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      3debb0a9
  28. 18 3月, 2016 1 次提交
    • K
      lib: move strtobool() to kstrtobool() · ef951599
      Kees Cook 提交于
      Create the kstrtobool_from_user() helper and move strtobool() logic into
      the new kstrtobool() (matching all the other kstrto* functions).
      Provides an inline wrapper for existing strtobool() callers.
      Signed-off-by: NKees Cook <keescook@chromium.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Amitkumar Karwar <akarwar@marvell.com>
      Cc: Nishant Sarmukadam <nishants@marvell.com>
      Cc: Kalle Valo <kvalo@codeaurora.org>
      Cc: Steve French <sfrench@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      ef951599
  29. 05 3月, 2016 1 次提交
  30. 17 1月, 2016 1 次提交
    • M
      include/linux/kernel.h: change abs() macro so it uses consistent return type · 8f57e4d9
      Michal Nazarewicz 提交于
      Rewrite abs() so that its return type does not depend on the
      architecture and no unexpected type conversion happen inside of it.  The
      only conversion is from unsigned to signed type.  char is left as a
      return type but treated as a signed type regradless of it's actual
      signedness.
      
      With the old version, int arguments were promoted to long and depending
      on architecture a long argument might result in s64 or long return type
      (which may or may not be the same).
      
      This came after some back and forth with Nicolas.  The current macro has
      different return type (for the same input type) depending on
      architecture which might be midly iritating.
      
      An alternative version would promote to int like so:
      
      	#define abs(x)	__abs_choose_expr(x, long long,			\
      			__abs_choose_expr(x, long,			\
      			__builtin_choose_expr(				\
      				sizeof(x) <= sizeof(int),		\
      				({ int __x = (x); __x<0?-__x:__x; }),	\
      				((void)0))))
      
      I have no preference but imagine Linus might.  :] Nicolas argument against
      is that promoting to int causes iconsistent behaviour:
      
      	int main(void) {
      		unsigned short a = 0, b = 1, c = a - b;
      		unsigned short d = abs(a - b);
      		unsigned short e = abs(c);
      		printf("%u %u\n", d, e);  // prints: 1 65535
      	}
      
      Then again, no sane person expects consistent behaviour from C integer
      arithmetic.  ;)
      
      Note:
      
        __builtin_types_compatible_p(unsigned char, char) is always false, and
        __builtin_types_compatible_p(signed char, char) is also always false.
      Signed-off-by: NMichal Nazarewicz <mina86@mina86.com>
      Reviewed-by: NNicolas Pitre <nico@linaro.org>
      Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8f57e4d9
  31. 19 12月, 2015 2 次提交
    • H
      panic, x86: Allow CPUs to save registers even if looping in NMI context · 58c5661f
      Hidehiro Kawai 提交于
      Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
      sends an NMI IPI to CPUs which haven't called panic() to stop them,
      save their register information and do some cleanups for crash dumping.
      However, if such a CPU is infinitely looping in NMI context, we fail to
      save its register information into the crash dump.
      
      For example, this can happen when unknown NMIs are broadcast to all
      CPUs as follows:
      
        CPU 0                             CPU 1
        ===========================       ==========================
        receive an unknown NMI
        unknown_nmi_error()
          panic()                         receive an unknown NMI
            spin_trylock(&panic_lock)     unknown_nmi_error()
            crash_kexec()                   panic()
                                              spin_trylock(&panic_lock)
                                              panic_smp_self_stop()
                                                infinite loop
              kdump_nmi_shootdown_cpus()
                issue NMI IPI -----------> blocked until IRET
                                                infinite loop...
      
      Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
      blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
      so the NMI is not handled and the callback function to save registers is
      never called.
      
      In practice, this can happen on some servers which broadcast NMIs to all
      CPUs when the NMI button is pushed.
      
      To save registers in this case, we need to:
      
        a) Return from NMI handler instead of looping infinitely
        or
        b) Call the callback function directly from the infinite loop
      
      Inherently, a) is risky because NMI is also used to prevent corrupted
      data from being propagated to devices.  So, we chose b).
      
      This patch does the following:
      
      1. Move the infinite looping of CPUs which haven't called panic() in NMI
         context (actually done by panic_smp_self_stop()) outside of panic() to
         enable us to refer pt_regs. Please note that panic_smp_self_stop() is
         still used for normal context.
      
      2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
         registers and do some cleanups after setting waiting_for_crash_ipi which
         is used for counting down the number of CPUs which handled the callback
      Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Stefan Lippers-Hollmann <s.l-h@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
      [ Cleanup comments, fixup formatting. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      58c5661f
    • H
      panic, x86: Fix re-entrance problem due to panic on NMI · 1717f209
      Hidehiro Kawai 提交于
      If panic on NMI happens just after panic() on the same CPU, panic() is
      recursively called. Kernel stalls, as a result, after failing to acquire
      panic_lock.
      
      To avoid this problem, don't call panic() in NMI context if we've
      already entered panic().
      
      For that, introduce nmi_panic() macro to reduce code duplication. In
      the case of panic on NMI, don't return from NMI handlers if another CPU
      already panicked.
      Signed-off-by: NHidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Acked-by: NMichal Hocko <mhocko@suse.com>
      Cc: Aaron Tomlin <atomlin@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Gobinda Charan Maji <gobinda.cemk07@gmail.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Javi Merino <javi.merino@arm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: lkml <linux-kernel@vger.kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Link: http://lkml.kernel.org/r/20151210014626.25437.13302.stgit@softrs
      [ Cleanup comments, fixup formatting. ]
      Signed-off-by: NBorislav Petkov <bp@suse.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      1717f209
  32. 10 11月, 2015 2 次提交
    • A
      remove abs64() · 79211c8e
      Andrew Morton 提交于
      Switch everything to the new and more capable implementation of abs().
      Mainly to give the new abs() a bit of a workout.
      
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      79211c8e
    • M
      kernel.h: make abs() work with 64-bit types · c8299cb6
      Michal Nazarewicz 提交于
      For 64-bit arguments, the abs macro casts it to an int which leads to
      lost precision and may cause incorrect results.  To deal with 64-bit
      types abs64 macro has been introduced but still there are places where
      abs macro is used incorrectly.
      
      To deal with the problem, expand abs macro such that it operates on s64
      type when dealing with 64-bit types while still returning long when
      dealing with smaller types.
      
      This fixes one known bug (per John):
      
      The internal clocksteering done for fine-grained error correction uses a
      : logarithmic approximation, so any time adjtimex() adjusts the clock
      : steering, timekeeping_freqadjust() quickly approximates the correct clock
      : frequency over a series of ticks.
      :
      : Unfortunately, the logic in timekeeping_freqadjust(), introduced in commit
      : dc491596 (Rework frequency adjustments to work better w/ nohz),
      : used the abs() function with a s64 error value to calculate the size of
      : the approximated adjustment to be made.
      :
      : Per include/linux/kernel.h: "abs() should not be used for 64-bit types
      : (s64, u64, long long) - use abs64()".
      :
      : Thus on 32-bit platforms, this resulted in the clocksteering to take a
      : quite dampended random walk trying to converge on the proper frequency,
      : which caused the adjustments to be made much slower then intended (most
      : easily observed when large adjustments are made).
      Signed-off-by: NMichal Nazarewicz <mina86@mina86.com>
      Reported-by: NJohn Stultz <john.stultz@linaro.org>
      Tested-by: NJohn Stultz <john.stultz@linaro.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c8299cb6
  33. 07 11月, 2015 1 次提交
    • R
      lib/kasprintf.c: introduce kvasprintf_const · 0a9df786
      Rasmus Villemoes 提交于
      This adds kvasprintf_const which tries to use kstrdup_const if possible:
      If the format string contains no % characters, or if the format string is
      exactly "%s", we delegate to kstrdup_const.  Otherwise, we fall back to
      kvasprintf.
      
      Just as for kstrdup_const, the main motivation is to save memory by
      reusing .rodata when possible.
      
      The return value should be freed by kfree_const, just like for
      kstrdup_const.
      
      There is deliberately no kasprintf_const: In the vast majority of cases,
      the format string argument is a literal, so one can determine statically
      whether one could instead use kstrdup_const directly (which would also
      require one to change all corresponding kfree calls to kfree_const).
      Signed-off-by: NRasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Greg KH <greg@kroah.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0a9df786
  34. 18 7月, 2015 1 次提交
    • N
      include, lib: add __printf attributes to several function prototypes · 8db14860
      Nicolas Iooss 提交于
      Using __printf attributes helps to detect several format string issues
      at compile time (even though -Wformat-security is currently disabled in
      Makefile).  For example it can detect when formatting a pointer as a
      number, like the issue fixed in commit a3fa71c4 ("wl18xx: show
      rx_frames_per_rates as an array as it really is"), or when the arguments
      do not match the format string, c.f.  for example commit 5ce1aca8
      ("reiserfs: fix __RASSERT format string").
      
      To prevent similar bugs in the future, add a __printf attribute to every
      function prototype which needs one in include/linux/ and lib/.  These
      functions were mostly found by using gcc's -Wsuggest-attribute=format
      flag.
      Signed-off-by: NNicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Felipe Balbi <balbi@ti.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      8db14860
  35. 01 7月, 2015 1 次提交
  36. 29 5月, 2015 1 次提交
    • S
      ring-buffer: Remove useless unused tracing_off_permanent() · 3c6296f7
      Steven Rostedt (Red Hat) 提交于
      The tracing_off_permanent() call is a way to disable all ring_buffers.
      Nothing uses it and nothing should use it, as tracing_off() and
      friends are better, as they disable the ring buffers related to
      tracing. The tracing_off_permanent() even disabled non tracing
      ring buffers. This is a bit drastic, and was added to handle NMIs
      doing outputs that could corrupt the ring buffer when only tracing
      used them. It is now obsolete and adds a little overhead, it should
      be removed.
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      3c6296f7
  37. 28 5月, 2015 1 次提交
    • G
      sysfs: tightened sysfs permission checks · 28b8d0c8
      Gobinda Charan Maji 提交于
      There were some inconsistency in restriction to VERIFY_OCTAL_PERMISSIONS().
      Previously the test was "User perms >= group perms >= other perms". The
      permission field of User, Group or Other consists of three bits. LSB is
      EXECUTE permission, MSB is READ permission and the middle bit is WRITE
      permission. But logically WRITE is "more privileged" than READ.
      
      Say for example, permission value is "0430". Here User has only READ
      permission whereas Group has both WRITE and EXECUTE permission.
      
      So, the checks could be tightened and the tests are separated to
      USER_READABLE >= GROUP_READABLE >= OTHER_READABLE,
      USER_WRITABLE >= GROUP_WRITABLE and OTHER_WRITABLE is not permitted.
      Signed-off-by: NGobinda Charan Maji <gobinda.cemk07@gmail.com>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      28b8d0c8