1. 10 8月, 2016 5 次提交
    • W
      sched/deadline: Fix lock pinning warning during CPU hotplug · c0c8c9fa
      Wanpeng Li 提交于
      The following warning can be triggered by hot-unplugging the CPU
      on which an active SCHED_DEADLINE task is running on:
      
        WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3531 lock_release+0x690/0x6a0
        releasing a pinned lock
        Call Trace:
         dump_stack+0x99/0xd0
         __warn+0xd1/0xf0
         ? dl_task_timer+0x1a1/0x2b0
         warn_slowpath_fmt+0x4f/0x60
         ? sched_clock+0x13/0x20
         lock_release+0x690/0x6a0
         ? enqueue_pushable_dl_task+0x9b/0xa0
         ? enqueue_task_dl+0x1ca/0x480
         _raw_spin_unlock+0x1f/0x40
         dl_task_timer+0x1a1/0x2b0
         ? push_dl_task.part.31+0x190/0x190
        WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3649 lock_unpin_lock+0x181/0x1a0
        unpinning an unpinned lock
        Call Trace:
         dump_stack+0x99/0xd0
         __warn+0xd1/0xf0
         warn_slowpath_fmt+0x4f/0x60
         lock_unpin_lock+0x181/0x1a0
         dl_task_timer+0x127/0x2b0
         ? push_dl_task.part.31+0x190/0x190
      
      As per the comment before this code, its safe to drop the RQ lock
      here, and since we (potentially) change rq, unpin and repin to avoid
      the splat.
      Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com>
      [ Rewrote changelog. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luca Abeni <luca.abeni@unitn.it>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1470274940-17976-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      c0c8c9fa
    • G
      sched/cputime: Mitigate performance regression in times()/clock_gettime() · 6075620b
      Giovanni Gherdovich 提交于
      Commit:
      
        6e998916 ("sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency")
      
      fixed a problem whereby clock_nanosleep() followed by clock_gettime() could
      allow a task to wake early. It addressed the problem by calling the scheduling
      classes update_curr() when the cputimer starts.
      
      Said change induced a considerable performance regression on the syscalls
      times() and clock_gettimes(CLOCK_PROCESS_CPUTIME_ID). There are some
      debuggers and applications that monitor their own performance that
      accidentally depend on the performance of these specific calls.
      
      This patch mitigates the performace loss by prefetching data in the CPU
      cache, as stalls due to cache misses appear to be where most time is spent
      in our benchmarks.
      
      Here are the performance gain of this patch over v4.7-rc7 on a Sandy Bridge
      box with 32 logical cores and 2 NUMA nodes. The test is repeated with a
      variable number of threads, from 2 to 4*num_cpus; the results are in
      seconds and correspond to the average of 10 runs; the percentage gain is
      computed with (before-after)/before so a positive value is an improvement
      (it's faster). The improvement varies between a few percents for 5-20
      threads and more than 10% for 2 or >20 threads.
      
      pound_clock_gettime:
      
          threads       4.7-rc7     patched 4.7-rc7
          [num]         [secs]      [secs (percent)]
            2           3.48        3.06 ( 11.83%)
            5           3.33        3.25 (  2.40%)
            8           3.37        3.26 (  3.30%)
           12           3.32        3.37 ( -1.60%)
           21           4.01        3.90 (  2.74%)
           30           3.63        3.36 (  7.41%)
           48           3.71        3.11 ( 16.27%)
           79           3.75        3.16 ( 15.74%)
          110           3.81        3.25 ( 14.80%)
          128           3.88        3.31 ( 14.76%)
      
      pound_times:
      
          threads       4.7-rc7     patched 4.7-rc7
          [num]         [secs]      [secs (percent)]
            2           3.65        3.25 ( 11.03%)
            5           3.45        3.17 (  7.92%)
            8           3.52        3.22 (  8.69%)
           12           3.29        3.36 ( -2.04%)
           21           4.07        3.92 (  3.78%)
           30           3.87        3.40 ( 12.17%)
           48           3.79        3.16 ( 16.61%)
           79           3.88        3.28 ( 15.42%)
          110           3.90        3.38 ( 13.35%)
          128           4.00        3.38 ( 15.45%)
      
      pound_clock_gettime and pound_clock_gettime are two benchmarks included in
      the MMTests framework. They launch a given number of threads which
      repeatedly call times() or clock_gettimes(). The results above can be
      reproduced with cloning MMTests from github.com and running the "poundtime"
      workload:
      
        $ git clone https://github.com/gormanm/mmtests.git
        $ cd mmtests
        $ cp configs/config-global-dhp__workload_poundtime config
        $ ./run-mmtests.sh --run-monitor $(uname -r)
      
      The above will run "poundtime" measuring the kernel currently running on
      the machine; Once a new kernel is installed and the machine rebooted,
      running again
      
        $ cd mmtests
        $ ./run-mmtests.sh --run-monitor $(uname -r)
      
      will produce results to compare with. A comparison table will be output
      with:
      
        $ cd mmtests/work/log
        $ ../../compare-kernels.sh
      
      the table will contain a lot of entries; grepping for "Amean" (as in
      "arithmetic mean") will give the tables presented above. The source code
      for the two benchmarks is reported at the end of this changelog for
      clairity.
      
      The cache misses addressed by this patch were found using a combination of
      `perf top`, `perf record` and `perf annotate`. The incriminated lines were
      found to be
      
          struct sched_entity *curr = cfs_rq->curr;
      
      and
      
          delta_exec = now - curr->exec_start;
      
      in the function update_curr() from kernel/sched/fair.c. This patch
      prefetches the data from memory just before update_curr is called in the
      interested execution path.
      
      A comparison of the total number of cycles before and after the patch
      follows; the data is obtained using `perf stat -r 10 -ddd <program>`
      running over the same sequence of number of threads used above (a positive
      gain is an improvement):
      
        threads   cycles before                 cycles after                gain
      
          2      19,699,563,964  +-1.19%      17,358,917,517  +-1.85%      11.88%
          5      47,401,089,566  +-2.96%      45,103,730,829  +-0.97%       4.85%
          8      80,923,501,004  +-3.01%      71,419,385,977  +-0.77%      11.74%
         12     112,326,485,473  +-0.47%     110,371,524,403  +-0.47%       1.74%
         21     193,455,574,299  +-0.72%     180,120,667,904  +-0.36%       6.89%
         30     315,073,519,013  +-1.64%     271,222,225,950  +-1.29%      13.92%
         48     321,969,515,332  +-1.48%     273,353,977,321  +-1.16%      15.10%
         79     337,866,003,422  +-0.97%     289,462,481,538  +-1.05%      14.33%
        110     338,712,691,920  +-0.78%     290,574,233,170  +-0.77%      14.21%
        128     348,384,794,006  +-0.50%     292,691,648,206  +-0.66%      15.99%
      
      A comparison of cache miss vs total cache loads ratios, before and after
      the patch (again from the `perf stat -r 10 -ddd <program>` tables):
      
        threads   L1 misses/total*100     L1 misses/total*100            gain
      		         before                   after
            2           7.43  +-4.90%           7.36  +-4.70%           0.94%
            5          13.09  +-4.74%          13.52  +-3.73%          -3.28%
            8          13.79  +-5.61%          12.90  +-3.27%           6.45%
           12          11.57  +-2.44%           8.71  +-1.40%          24.72%
           21          12.39  +-3.92%           9.97  +-1.84%          19.53%
           30          13.91  +-2.53%          11.73  +-2.28%          15.67%
           48          13.71  +-1.59%          12.32  +-1.97%          10.14%
           79          14.44  +-0.66%          13.40  +-1.06%           7.20%
          110          15.86  +-0.50%          14.46  +-0.59%           8.83%
          128          16.51  +-0.32%          15.06  +-0.78%           8.78%
      
      As a final note, the following shows the evolution of performance figures
      in the "poundtime" benchmark and pinpoints commit 6e998916
      ("sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency") as a
      major source of degradation, mostly unaddressed to this day (figures
      expressed in seconds).
      
      pound_clock_gettime:
      
        threads   parent of         6e998916        4.7-rc7
      	    6e998916            itself
          2        2.23          3.68 ( -64.56%)        3.48 (-55.48%)
          5        2.83          3.78 ( -33.42%)        3.33 (-17.43%)
          8        2.84          4.31 ( -52.12%)        3.37 (-18.76%)
          12       3.09          3.61 ( -16.74%)        3.32 ( -7.17%)
          21       3.14          4.63 ( -47.36%)        4.01 (-27.71%)
          30       3.28          5.75 ( -75.37%)        3.63 (-10.80%)
          48       3.02          6.05 (-100.56%)        3.71 (-22.99%)
          79       2.88          6.30 (-118.90%)        3.75 (-30.26%)
          110      2.95          6.46 (-119.00%)        3.81 (-29.24%)
          128      3.05          6.42 (-110.08%)        3.88 (-27.04%)
      
      pound_times:
      
        threads   parent of         6e998916        4.7-rc7
      	    6e998916            itself
          2        2.27          3.73 ( -64.71%)        3.65 (-61.14%)
          5        2.78          3.77 ( -35.56%)        3.45 (-23.98%)
          8        2.79          4.41 ( -57.71%)        3.52 (-26.05%)
          12       3.02          3.56 ( -17.94%)        3.29 ( -9.08%)
          21       3.10          4.61 ( -48.74%)        4.07 (-31.34%)
          30       3.33          5.75 ( -72.53%)        3.87 (-16.01%)
          48       2.96          6.06 (-105.04%)        3.79 (-28.10%)
          79       2.88          6.24 (-116.83%)        3.88 (-34.81%)
          110      2.98          6.37 (-114.08%)        3.90 (-31.12%)
          128      3.10          6.35 (-104.61%)        4.00 (-28.87%)
      
      The source code of the two benchmarks follows. To compile the two:
      
        NR_THREADS=42
        for FILE in pound_times pound_clock_gettime; do
            gcc -lrt -O2 -lpthread -DNUM_THREADS=$NR_THREADS $FILE.c -o $FILE
        done
      
      ==== BEGIN pound_times.c ====
      
      struct tms start;
      
      void *pound (void *threadid)
      {
        struct tms end;
        int oldutime = 0;
        int utime;
        int i;
        for (i = 0; i < 5000000 / NUM_THREADS; i++) {
                times(&end);
                utime = ((int)end.tms_utime - (int)start.tms_utime);
                if (oldutime > utime) {
                  printf("utime decreased, was %d, now %d!\n", oldutime, utime);
                }
                oldutime = utime;
        }
        pthread_exit(NULL);
      }
      
      int main()
      {
        pthread_t th[NUM_THREADS];
        long i;
        times(&start);
        for (i = 0; i < NUM_THREADS; i++) {
          pthread_create (&th[i], NULL, pound, (void *)i);
        }
        pthread_exit(NULL);
        return 0;
      }
      ==== END pound_times.c ====
      
      ==== BEGIN pound_clock_gettime.c ====
      
      void *pound (void *threadid)
      {
      	struct timespec ts;
      	int rc, i;
      	unsigned long prev = 0, this = 0;
      
      	for (i = 0; i < 5000000 / NUM_THREADS; i++) {
      		rc = clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
      		if (rc < 0)
      			perror("clock_gettime");
      		this = (ts.tv_sec * 1000000000) + ts.tv_nsec;
      		if (0 && this < prev)
      			printf("%lu ns timewarp at iteration %d\n", prev - this, i);
      		prev = this;
      	}
      	pthread_exit(NULL);
      }
      
      int main()
      {
      	pthread_t th[NUM_THREADS];
      	long rc, i;
      	pid_t pgid;
      
      	for (i = 0; i < NUM_THREADS; i++) {
      		rc = pthread_create(&th[i], NULL, pound, (void *)i);
      		if (rc < 0)
      			perror("pthread_create");
      	}
      
      	pthread_exit(NULL);
      	return 0;
      }
      ==== END pound_clock_gettime.c ====
      Suggested-by: NMike Galbraith <mgalbraith@suse.de>
      Signed-off-by: NGiovanni Gherdovich <ggherdovich@suse.cz>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1470385316-15027-2-git-send-email-ggherdovich@suse.czSigned-off-by: NIngo Molnar <mingo@kernel.org>
      6075620b
    • X
      sched/fair: Fix typo in sync_throttle() · b8922125
      Xunlei Pang 提交于
      We should update cfs_rq->throttled_clock_task, not
      pcfs_rq->throttle_clock_task.
      
      The effects of this bug was probably occasionally erratic
      group scheduling, particularly in cgroups-intense workloads.
      Signed-off-by: NXunlei Pang <xlpang@redhat.com>
      [ Added changelog. ]
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 55e16d30 ("sched/fair: Rework throttle_count sync")
      Link: http://lkml.kernel.org/r/1468050862-18864-1-git-send-email-xlpang@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      b8922125
    • T
      sched/deadline: Fix wrap-around in DL heap · a23eadfa
      Tommaso Cucinotta 提交于
      Current code in cpudeadline.c has a bug in re-heapifying when adding a
      new element at the end of the heap, because a deadline value of 0 is
      temporarily set in the new elem, then cpudl_change_key() is called
      with the actual elem deadline as param.
      
      However, the function compares the new deadline to set with the one
      previously in the elem, which is 0.  So, if current absolute deadlines
      grew so much to have negative values as s64, the comparison in
      cpudl_change_key() makes the wrong decision.  Instead, as from
      dl_time_before(), the kernel should handle correctly abs deadlines
      wrap-arounds.
      
      This patch fixes the problem with a minimally invasive change that
      forces cpudl_change_key() to heapify up in this case.
      Signed-off-by: NTommaso Cucinotta <tommaso.cucinotta@sssup.it>
      Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: NLuca Abeni <luca.abeni@unitn.it>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Juri Lelli <juri.lelli@gmail.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1468921493-10054-2-git-send-email-tommaso.cucinotta@sssup.itSigned-off-by: NIngo Molnar <mingo@kernel.org>
      a23eadfa
    • L
      Revert "printk: create pr_<level> functions" · a0cba217
      Linus Torvalds 提交于
      This reverts commit 874f9c7d.
      
      Geert Uytterhoeven reports:
       "This change seems to have an (unintendent?) side-effect.
      
        Before, pr_*() calls without a trailing newline characters would be
        printed with a newline character appended, both on the console and in
        the output of the dmesg command.
      
        After this commit, no new line character is appended, and the output
        of the next pr_*() call of the same type may be appended, like in:
      
          - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
          - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
          + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"
      
      Joe Perches says:
       "No, that is not intentional.
      
        The newline handling code inside vprintk_emit is a bit involved and
        for now I suggest a revert until this has all the same behavior as
        earlier"
      Reported-by: NGeert Uytterhoeven <geert@linux-m68k.org>
      Requested-by: NJoe Perches <joe@perches.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      a0cba217
  2. 09 8月, 2016 1 次提交
  3. 08 8月, 2016 1 次提交
    • J
      block: rename bio bi_rw to bi_opf · 1eff9d32
      Jens Axboe 提交于
      Since commit 63a4cc24, bio->bi_rw contains flags in the lower
      portion and the op code in the higher portions. This means that
      old code that relies on manually setting bi_rw is most likely
      going to be broken. Instead of letting that brokeness linger,
      rename the member, to force old and out-of-tree code to break
      at compile time instead of at runtime.
      
      No intended functional changes in this commit.
      Signed-off-by: NJens Axboe <axboe@fb.com>
      1eff9d32
  4. 04 8月, 2016 6 次提交
    • J
      jump_label: remove bug.h, atomic.h dependencies for HAVE_JUMP_LABEL · 1f69bf9c
      Jason Baron 提交于
      The current jump_label.h includes bug.h for things such as WARN_ON().
      This makes the header problematic for inclusion by kernel.h or any
      headers that kernel.h includes, since bug.h includes kernel.h (circular
      dependency).  The inclusion of atomic.h is similarly problematic.  Thus,
      this should make jump_label.h 'includable' from most places.
      
      Link: http://lkml.kernel.org/r/7060ce35ddd0d20b33bf170685e6b0fab816bdf2.1467837322.git.jbaron@akamai.comSigned-off-by: NJason Baron <jbaron@akamai.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Chris Metcalf <cmetcalf@mellanox.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      1f69bf9c
    • M
      tree-wide: replace config_enabled() with IS_ENABLED() · 97f2645f
      Masahiro Yamada 提交于
      The use of config_enabled() against config options is ambiguous.  In
      practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the
      author might have used it for the meaning of IS_ENABLED().  Using
      IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc.  makes the intention
      clearer.
      
      This commit replaces config_enabled() with IS_ENABLED() where possible.
      This commit is only touching bool config options.
      
      I noticed two cases where config_enabled() is used against a tristate
      option:
      
       - config_enabled(CONFIG_HWMON)
        [ drivers/net/wireless/ath/ath10k/thermal.c ]
      
       - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE)
        [ drivers/gpu/drm/gma500/opregion.c ]
      
      I did not touch them because they should be converted to IS_BUILTIN()
      in order to keep the logic, but I was not sure it was the authors'
      intention.
      
      Link: http://lkml.kernel.org/r/1465215656-20569-1-git-send-email-yamada.masahiro@socionext.comSigned-off-by: NMasahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Stas Sergeev <stsp@list.ru>
      Cc: Matt Redfearn <matt.redfearn@imgtec.com>
      Cc: Joshua Kinard <kumba@gentoo.org>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: "Dmitry V. Levin" <ldv@altlinux.org>
      Cc: yu-cheng yu <yu-cheng.yu@intel.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Will Drewry <wad@chromium.org>
      Cc: Nikolay Martynov <mar.kolya@gmail.com>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
      Cc: Rafal Milecki <zajec5@gmail.com>
      Cc: James Cowgill <James.Cowgill@imgtec.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Alex Smith <alex.smith@imgtec.com>
      Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
      Cc: Qais Yousef <qais.yousef@imgtec.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Mikko Rapeli <mikko.rapeli@iki.fi>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Roland McGrath <roland@hack.frob.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Kalle Valo <kvalo@qca.qualcomm.com>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Tony Wu <tung7970@gmail.com>
      Cc: Huaitong Han <huaitong.han@intel.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrea Gelmini <andrea.gelmini@gelma.net>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Rabin Vincent <rabin@rab.in>
      Cc: "Maciej W. Rozycki" <macro@imgtec.com>
      Cc: David Daney <david.daney@cavium.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      97f2645f
    • J
      modules: add ro_after_init support · 444d13ff
      Jessica Yu 提交于
      Add ro_after_init support for modules by adding a new page-aligned section
      in the module layout (after rodata) for ro_after_init data and enabling RO
      protection for that section after module init runs.
      Signed-off-by: NJessica Yu <jeyu@redhat.com>
      Acked-by: NKees Cook <keescook@chromium.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      444d13ff
    • R
      jump_label: disable preemption around __module_text_address(). · bdc9f373
      Rusty Russell 提交于
      Steven reported a warning caused by not holding module_mutex or
      rcu_read_lock_sched: his backtrace was corrupted but a quick audit
      found this possible cause.  It's wrong anyway...
      Reported-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      bdc9f373
    • P
      modules: Add kernel parameter to blacklist modules · be7de5f9
      Prarit Bhargava 提交于
      Blacklisting a module in linux has long been a problem.  The current
      procedure is to use rd.blacklist=module_name, however, that doesn't
      cover the case after the initramfs and before a boot prompt (where one
      is supposed to use /etc/modprobe.d/blacklist.conf to blacklist
      runtime loading). Using rd.shell to get an early prompt is hit-or-miss,
      and doesn't cover all situations AFAICT.
      
      This patch adds this functionality of permanently blacklisting a module
      by its name via the kernel parameter module_blacklist=module_name.
      
      [v2]: Rusty, use core_param() instead of __setup() which simplifies
      things.
      
      [v3]: Rusty, undo wreckage from strsep()
      
      [v4]: Rusty, simpler version of blacklisted()
      Signed-off-by: NPrarit Bhargava <prarit@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: linux-doc@vger.kernel.org
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      be7de5f9
    • S
      module: Do a WARN_ON_ONCE() for assert module mutex not held · 9502514f
      Steven Rostedt 提交于
      When running with lockdep enabled, I triggered the WARN_ON() in the
      module code that asserts when module_mutex or rcu_read_lock_sched are
      not held. The issue I have is that this can also be called from the
      dump_stack() code, causing us to enter an infinite loop...
      
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8fa70 ffff880215e8fa70 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8fac0 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8f7a0 ffff880215e8f7a0 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8f7f0 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
       Modules linked in: ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc3-test-00013-g501c2375 #14
       Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
        ffff880215e8f4d0 ffff880215e8f4d0 ffffffff812fc8e3 0000000000000000
        ffffffff81d3e55b ffff880215e8f520 ffffffff8104fc88 ffffffff8104fcab
        0000000915e88300 0000000000000046 ffffffffa019b29a 0000000000000001
       Call Trace:
        [<ffffffff812fc8e3>] dump_stack+0x67/0x90
        [<ffffffff8104fc88>] __warn+0xcb/0xe9
        [<ffffffff8104fcab>] ? warn_slowpath_null+0x5/0x1f
       ------------[ cut here ]------------
       WARNING: CPU: 1 PID: 0 at kernel/module.c:268 module_assert_mutex_or_preempt+0x3c/0x3e
      [...]
      
      Which gives us rather useless information. Worse yet, there's some race
      that causes this, and I seldom trigger it, so I have no idea what
      happened.
      
      This would not be an issue if that warning was a WARN_ON_ONCE().
      Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>
      9502514f
  5. 03 8月, 2016 20 次提交
  6. 02 8月, 2016 1 次提交
    • D
      perf/core: Change log level for duration warning to KERN_INFO · 0d87d7ec
      David Ahern 提交于
      When the perf interrupt handler exceeds a threshold warning messages
      are displayed on console:
      
        [12739.31793] perf interrupt took too long (2504 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
        [71340.165065] perf interrupt took too long (5005 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
      
      Many customers and users are confused by the message wondering if
      something is wrong or they need to take action to fix a problem.
      Since a user can not do anything to fix the issue, the message is really
      more informational than a warning. Adjust the log level accordingly.
      Signed-off-by: NDavid Ahern <dsa@cumulusnetworks.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1470084569-438-1-git-send-email-dsa@cumulusnetworks.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      0d87d7ec
  7. 01 8月, 2016 1 次提交
  8. 29 7月, 2016 5 次提交