1. 13 11月, 2017 5 次提交
  2. 12 11月, 2017 3 次提交
    • D
      timers: Add a function to start/reduce a timer · b24591e2
      David Howells 提交于
      Add a function, similar to mod_timer(), that will start a timer if it isn't
      running and will modify it if it is running and has an expiry time longer
      than the new time.  If the timer is running with an expiry time that's the
      same or sooner, no change is made.
      
      The function looks like:
      
      	int timer_reduce(struct timer_list *timer, unsigned long expires);
      
      This can be used by code such as networking code to make it easier to share
      a timer for multiple timeouts.  For instance, in upcoming AF_RXRPC code,
      the rxrpc_call struct will maintain a number of timeouts:
      
      	unsigned long	ack_at;
      	unsigned long	resend_at;
      	unsigned long	ping_at;
      	unsigned long	expect_rx_by;
      	unsigned long	expect_req_by;
      	unsigned long	expect_term_by;
      
      each of which is set independently of the others.  With timer reduction
      available, when the code needs to set one of the timeouts, it only needs to
      look at that timeout and then call timer_reduce() to modify the timer,
      starting it or bringing it forward if necessary.  There is no need to refer
      to the other timeouts to see which is earliest and no need to take any lock
      other than, potentially, the timer lock inside timer_reduce().
      
      Note, that this does not protect against concurrent invocations of any of
      the timer functions.
      
      As an example, the expect_rx_by timeout above, which terminates a call if
      we don't get a packet from the server within a certain time window, would
      be set something like this:
      
      	unsigned long now = jiffies;
      	unsigned long expect_rx_by = now + packet_receive_timeout;
      	WRITE_ONCE(call->expect_rx_by, expect_rx_by);
      	timer_reduce(&call->timer, expect_rx_by);
      
      The timer service code (which might, say, be in a work function) would then
      check all the timeouts to see which, if any, had triggered, deal with
      those:
      
      	t = READ_ONCE(call->ack_at);
      	if (time_after_eq(now, t)) {
      		cmpxchg(&call->ack_at, t, now + MAX_JIFFY_OFFSET);
      		set_bit(RXRPC_CALL_EV_ACK, &call->events);
      	}
      
      and then restart the timer if necessary by finding the soonest timeout that
      hasn't yet passed and then calling timer_reduce().
      
      The disadvantage of doing things this way rather than comparing the timers
      each time and calling mod_timer() is that you *will* take timer events
      unless you can finish what you're doing and delete the timer in time.
      
      The advantage of doing things this way is that you don't need to use a lock
      to work out when the next timer should be set, other than the timer's own
      lock - which you might not have to take.
      
      [ tglx: Fixed weird formatting and adopted it to pending changes ]
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: keyrings@vger.kernel.org
      Cc: linux-afs@lists.infradead.org
      Link: https://lkml.kernel.org/r/151023090769.23050.1801643667223880753.stgit@warthog.procyon.org.uk
      b24591e2
    • A
      pstore: Use ktime_get_real_fast_ns() instead of __getnstimeofday() · df27067e
      Arnd Bergmann 提交于
      __getnstimeofday() is a rather odd interface, with a number of quirks:
      
      - The caller may come from NMI context, but the implementation is not NMI safe,
        one way to get there from NMI is
      
            NMI handler:
              something bad
                panic()
                  kmsg_dump()
                    pstore_dump()
                       pstore_record_init()
                         __getnstimeofday()
      
      - The calling conventions are different from any other timekeeping functions,
        to deal with returning an error code during suspended timekeeping.
      
      Address the above issues by using a completely different method to get the
      time: ktime_get_real_fast_ns() is NMI safe and has a reasonable behavior
      when timekeeping is suspended: it returns the time at which it got
      suspended. As Thomas Gleixner explained, this is safe, as
      ktime_get_real_fast_ns() does not call into the clocksource driver that
      might be suspended.
      
      The result can easily be transformed into a timespec structure. Since
      ktime_get_real_fast_ns() was not exported to modules, add the export.
      
      The pstore behavior for the suspended case changes slightly, as it now
      stores the timestamp at which timekeeping was suspended instead of storing
      a zero timestamp.
      
      This change is not addressing y2038-safety, that's subject to a more
      complex follow up patch.
      Signed-off-by: NArnd Bergmann <arnd@arndb.de>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Acked-by: NKees Cook <keescook@chromium.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Anton Vorontsov <anton@enomsg.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Colin Cross <ccross@android.com>
      Link: https://lkml.kernel.org/r/20171110152530.1926955-1-arnd@arndb.de
      df27067e
    • T
      irq/work: Use llist_for_each_entry_safe · d00a08cf
      Thomas Gleixner 提交于
      The llist_for_each_entry() loop in irq_work_run_list() is unsafe because
      once the works PENDING bit is cleared it can be requeued on another CPU.
      
      Use llist_for_each_entry_safe() instead.
      
      Fixes: 16c0890d ("irq/work: Don't reinvent the wheel but use existing llist API")
      Reported-by: NChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Petri Latvala <petri.latvala@intel.com>
      Link: http://lkml.kernel.org/r/151027307351.14762.4611888896020658384@mail.alporthouse.com
      d00a08cf
  3. 09 11月, 2017 5 次提交
    • P
      sched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds · 765cc3a4
      Patrick Bellasi 提交于
      When the kernel is compiled with !CONFIG_SCHED_DEBUG support, we expect that
      all SCHED_FEAT are turned into compile time constants being propagated
      to support compiler optimizations.
      
      Specifically, we expect that code blocks like this:
      
         if (sched_feat(FEATURE_NAME) [&& <other_conditions>]) {
      	/* FEATURE CODE */
         }
      
      are turned into dead-code in case FEATURE_NAME defaults to FALSE, and thus
      being removed by the compiler from the finale image.
      
      For this mechanism to properly work it's required for the compiler to
      have full access, from each translation unit, to whatever is the value
      defined by the sched_feat macro. This macro is defined as:
      
         #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x))
      
      and thus, the compiler can optimize that code only if the value of
      sysctl_sched_features is visible within each translation unit.
      
      Since:
      
         029632fb ("sched: Make separate sched*.c translation units")
      
      the scheduler code has been split into separate translation units
      however the definition of sysctl_sched_features is part of
      kernel/sched/core.c while, for all the other scheduler modules, it is
      visible only via kernel/sched/sched.h as an:
      
         extern const_debug unsigned int sysctl_sched_features
      
      Unfortunately, an extern reference does not allow the compiler to apply
      constants propagation. Thus, on !CONFIG_SCHED_DEBUG kernel we still end up
      with code to load a memory reference and (eventually) doing an unconditional
      jump of a chunk of code.
      
      This mechanism is unavoidable when sched_features can be turned on and off at
      run-time. However, this is not the case for "production" kernels compiled with
      !CONFIG_SCHED_DEBUG. In this case, sysctl_sched_features is just a constant value
      which cannot be changed at run-time and thus memory loads and jumps can be
      avoided altogether.
      
      This patch fixes the case of !CONFIG_SCHED_DEBUG kernel by declaring a local version
      of the sysctl_sched_features constant for each translation unit. This will
      ultimately allow the compiler to perform constants propagation and dead-code
      pruning.
      
      Tests have been done, with !CONFIG_SCHED_DEBUG on a v4.14-rc8 with and without
      the patch, by running 30 iterations of:
      
         perf bench sched messaging --pipe --thread --group 4 --loop 50000
      
      on a 40 cores Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz using the
      powersave governor to rule out variations due to frequency scaling.
      
      Statistics on the reported completion time:
      
                         count     mean       std     min       99%     max
        v4.14-rc8         30.0  15.7831  0.176032  15.442  16.01226  16.014
        v4.14-rc8+patch   30.0  15.5033  0.189681  15.232  15.93938  15.962
      
      ... show a 1.8% speedup on average completion time and 0.5% speedup in the
      99 percentile.
      Signed-off-by: NPatrick Bellasi <patrick.bellasi@arm.com>
      Signed-off-by: NChris Redpath <chris.redpath@arm.com>
      Reviewed-by: NDietmar Eggemann <dietmar.eggemann@arm.com>
      Reviewed-by: NBrendan Jackman <brendan.jackman@arm.com>
      Acked-by: NPeter Zijlstra <peterz@infradead.org>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Morten Rasmussen <morten.rasmussen@arm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Link: http://lkml.kernel.org/r/20171108184101.16006-1-patrick.bellasi@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
      765cc3a4
    • V
      cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq · 07458f6a
      Viresh Kumar 提交于
      'cached_raw_freq' is used to get the next frequency quickly but should
      always be in sync with sg_policy->next_freq. There is a case where it is
      not and in such cases it should be reset to avoid switching to incorrect
      frequencies.
      
      Consider this case for example:
      
       - policy->cur is 1.2 GHz (Max)
       - New request comes for 780 MHz and we store that in cached_raw_freq.
       - Based on 780 MHz, we calculate the effective frequency as 800 MHz.
       - We then see the CPU wasn't idle recently and choose to keep the next
         freq as 1.2 GHz.
       - Now we have cached_raw_freq is 780 MHz and sg_policy->next_freq is
         1.2 GHz.
       - Now if the utilization doesn't change in then next request, then the
         next target frequency will still be 780 MHz and it will match with
         cached_raw_freq. But we will choose 1.2 GHz instead of 800 MHz here.
      
      Fixes: b7eaf1aa (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely)
      Signed-off-by: NViresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.12+ <stable@vger.kernel.org> # 4.12+
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      07458f6a
    • R
      PM / s2idle: Clear the events_check_enabled flag · 95b982b4
      Rajat Jain 提交于
      Problem: This flag does not get cleared currently in the suspend or
      resume path in the following cases:
      
       * In case some driver's suspend routine returns an error.
       * Successful s2idle case
       * etc?
      
      Why is this a problem: What happens is that the next suspend attempt
      could fail even though the user did not enable the flag by writing to
      /sys/power/wakeup_count. This is 1 use case how the issue can be seen
      (but similar use case with driver suspend failure can be thought of):
      
       1. Read /sys/power/wakeup_count
       2. echo count > /sys/power/wakeup_count
       3. echo freeze > /sys/power/wakeup_count
       4. Let the system suspend, and wakeup the system using some wake source
          that calls pm_wakeup_event() e.g. power button or something.
       5. Note that the combined wakeup count would be incremented due
          to the pm_wakeup_event() in the resume path.
       6. After resuming the events_check_enabled flag is still set.
      
      At this point if the user attempts to freeze again (without writing to
      /sys/power/wakeup_count), the suspend would fail even though there has
      been no wake event since the past resume.
      
      Address that by clearing the flag just before a resume is completed,
      so that it is always cleared for the corner cases mentioned above.
      Signed-off-by: NRajat Jain <rajatja@google.com>
      Acked-by: NPavel Machek <pavel@ucw.cz>
      Signed-off-by: NRafael J. Wysocki <rafael.j.wysocki@intel.com>
      95b982b4
    • L
      stop using '%pK' for /proc/kallsyms pointer values · c0f3ea15
      Linus Torvalds 提交于
      Not only is it annoying to have one single flag for all pointers, as if
      that was a global choice and all kernel pointers are the same, but %pK
      can't get the 'access' vs 'open' time check right anyway.
      
      So make the /proc/kallsyms pointer value code use logic specific to that
      particular file.  We do continue to honor kptr_restrict, but the default
      (which is unrestricted) is changed to instead take expected users into
      account, and restrict access by default.
      
      Right now the only actual expected user is kernel profiling, which has a
      separate sysctl flag for kernel profile access.  There may be others.
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      c0f3ea15
    • B
      module: export module signature enforcement status · fda784e5
      Bruno E. O. Meneguele 提交于
      A static variable sig_enforce is used as status var to indicate the real
      value of CONFIG_MODULE_SIG_FORCE, once this one is set the var will hold
      true, but if the CONFIG is not set the status var will hold whatever
      value is present in the module.sig_enforce kernel cmdline param: true
      when =1 and false when =0 or not present.
      
      Considering this cmdline param take place over the CONFIG value when
      it's not set, other places in the kernel could misbehave since they
      would have only the CONFIG_MODULE_SIG_FORCE value to rely on. Exporting
      this status var allows the kernel to rely in the effective value of
      module signature enforcement, being it from CONFIG value or cmdline
      param.
      Signed-off-by: NBruno E. O. Meneguele <brdeoliv@redhat.com>
      Signed-off-by: NMimi Zohar <zohar@linux.vnet.ibm.com>
      fda784e5
  4. 08 11月, 2017 12 次提交
  5. 07 11月, 2017 5 次提交
  6. 05 11月, 2017 1 次提交
  7. 03 11月, 2017 1 次提交
    • K
      rcu: Convert timers to use timer_setup() · fd30b717
      Kees Cook 提交于
      In preparation for unconditionally passing the struct timer_list pointer to
      all timer callbacks, switch to using the new timer_setup() and from_timer()
      to pass the timer pointer explicitly.
      
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Signed-off-by: NKees Cook <keescook@chromium.org>
      fd30b717
  8. 02 11月, 2017 7 次提交
    • J
      futex: futex_wake_op, do not fail on invalid op · e78c38f6
      Jiri Slaby 提交于
      In commit 30d6e0a4 ("futex: Remove duplicated code and fix undefined
      behaviour"), I let FUTEX_WAKE_OP to fail on invalid op.  Namely when op
      should be considered as shift and the shift is out of range (< 0 or > 31).
      
      But strace's test suite does this madness:
      
        futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee);
        futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xbadfaced);
        futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xffffffff);
      
      When I pick the first 0xa0caffee, it decodes as:
      
        0x80000000 & 0xa0caffee: oparg is shift
        0x70000000 & 0xa0caffee: op is FUTEX_OP_OR
        0x0f000000 & 0xa0caffee: cmp is FUTEX_OP_CMP_EQ
        0x00fff000 & 0xa0caffee: oparg is sign-extended 0xcaf = -849
        0x00000fff & 0xa0caffee: cmparg is sign-extended 0xfee = -18
      
      That means the op tries to do this:
      
        (futex |= (1 << (-849))) == -18
      
      which is completely bogus. The new check of op in the code is:
      
              if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) {
                      if (oparg < 0 || oparg > 31)
                              return -EINVAL;
                      oparg = 1 << oparg;
              }
      
      which results obviously in the "Invalid argument" errno:
      
        FAIL: futex
        ===========
      
        futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee) = -1: Invalid argument
        futex.test: failed test: ../futex failed with code 1
      
      So let us soften the failure to print only a (ratelimited) message, crop
      the value and continue as if it were right.  When userspace keeps up, we
      can switch this to return -EINVAL again.
      
      [v2] Do not return 0 immediatelly, proceed with the cropped value.
      
      Fixes: 30d6e0a4 ("futex: Remove duplicated code and fix undefined behaviour")
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Darren Hart <dvhart@infradead.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      e78c38f6
    • R
      kernel/time/Kconfig: Fix typo in comment · 6082a6e4
      Randy Dunlap 提交于
      Fix typo in Kconfig comment text.
      Signed-off-by: NRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: Jiri Kosina <trivial@kernel.org>
      Link: https://lkml.kernel.org/r/0e586dd4-2b27-864e-c252-bc72df52fd01@infradead.org
      6082a6e4
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
    • A
      signal: Fix name of SIGEMT in #if defined() check · c3aff086
      Andrew Clayton 提交于
      Commit cc731525 ("signal: Remove kernel interal si_code magic")
      added a check for SIGMET and NSIGEMT being defined. That SIGMET should
      in fact be SIGEMT, with SIGEMT being defined in
      arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
      
      This was actually pointed out by BenHutchings in a lwn.net comment
      here https://lwn.net/Comments/734608/
      
      Fixes: cc731525 ("signal: Remove kernel interal si_code magic")
      Signed-off-by: NAndrew Clayton <andrew@digital-domain.net>
      Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>
      c3aff086
    • D
      watchdog/hardlockup/perf: Use atomics to track in-use cpu counter · 42f930da
      Don Zickus 提交于
      Guenter reported:
        There is still a problem. When running 
          echo 6 > /proc/sys/kernel/watchdog_thresh
          echo 5 > /proc/sys/kernel/watchdog_thresh
        repeatedly, the message
       
         NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
       
        stops after a while (after ~10-30 iterations, with fluctuations).
        Maybe watchdog_cpus needs to be atomic ?
      
      That's correct as this again is affected by the asynchronous nature of the
      smpboot thread unpark mechanism.
      
      CPU 0				CPU1			CPU2
      write(watchdog_thresh, 6)	
        stop()
          park()
        update()
        start()
          unpark()
      				thread->unpark()
      				  cnt++;
      write(watchdog_thresh, 5)				thread->unpark()
        stop()
          park()			thread->park()
      				   cnt--;		  cnt++;
        update()
        start()
          unpark()
      
      That's not a functional problem, it just affects the informational message.
      
      Convert watchdog_cpus to atomic_t to prevent the problem
      Reported-and-tested-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com
      
      42f930da
    • T
      watchdog/harclockup/perf: Revert a33d4484 ("watchdog/hardlockup/perf:... · 9c388a5e
      Thomas Gleixner 提交于
      watchdog/harclockup/perf: Revert a33d4484 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
      
      Guenter reported a crash in the watchdog/perf code, which is caused by
      cleanup() and enable() running concurrently. The reason for this is:
      
      The watchdog functions are serialized via the watchdog_mutex and cpu
      hotplug locking, but the enable of the perf based watchdog happens in
      context of the unpark callback of the smpboot thread. But that unpark
      function is not synchronous inside the locking. The unparking of the thread
      just wakes it up and leaves so there is no guarantee when the thread is
      executing.
      
      If it starts running _before_ the cleanup happened then it will create a
      event and overwrite the dead event pointer. The new event is then cleaned
      up because the event is marked dead.
      
          lock(watchdog_mutex);
          lockup_detector_reconfigure();
              cpus_read_lock();
      	stop();
      	   park()
      	update();
      	start();
      	   unpark()
      	cpus_read_unlock();		thread runs()
      					  overwrite dead event ptr
      	cleanup();
      	  free new event, which is active inside perf....
          unlock(watchdog_mutex);
      
      The park side is safe as that actually waits for the thread to reach
      parked state.
      
      Commit a33d4484 removed the protection against this kind of scenario
      under the stupid assumption that the hotplug serialization and the
      watchdog_mutex cover everything. 
      
      Bring it back.
      
      Reverts: a33d4484 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
      Reported-and-tested-by: NGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: NThomas Feels-stupid Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos
      
      
      9c388a5e
    • P
      clockevents: Update clockevents device next_event on stop · 39c82caf
      Prasad Sodagudi 提交于
      clockevent_device::next_event holds the next timer event of a clock event
      device. The value is updated in clockevents_program_event(), i.e. when the
      hardware timer is armed for the next expiry.
      
      When there are no software timers armed on a CPU, the corresponding per CPU
      clockevent device is brought into ONESHOT_STOPPED state, but
      clockevent_device::next_event is not updated, because
      clockevents_program_event() is not called.
      
      So the content of clockevent_device::next_event is stale, which is not an
      issue when real hardware is used. But the hrtimer broadcast device relies
      on that information and the stale value causes spurious wakeups.
      
      Update clockevent_device::next_event to KTIME_MAX when it has been brought
      into ONESHOT_STOPPED state to avoid spurious wakeups. This reflects the
      proper expiry time of the stopped timer: infinity.
      
      [ tglx: Massaged changelog ]
      Signed-off-by: NPrasad Sodagudi <psodagud@codeaurora.org>
      Signed-off-by: NThomas Gleixner <tglx@linutronix.de>
      Cc: viresh.kumar@linaro.org
      Link: https://lkml.kernel.org/r/1509043042-32486-1-git-send-email-psodagud@codeaurora.org
      39c82caf
  9. 01 11月, 2017 1 次提交