1. 09 Jun, 2020 4 commits
    • G
      kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases · f117955a
      Guilherme G. Piccoli authored
      After a recent change introduced by Vlastimil's series [0], the kernel
      is now able to handle sysctl parameters on the kernel command line; the
      series also introduced a simple infrastructure to convert legacy boot
      parameters (that duplicate sysctls) into sysctl aliases.
      
      This patch converts the watchdog parameters softlockup_panic and
      {hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure.
      It fixes the documentation too, since the alias only accepts values 0 or
      1, not the full range of integers.
      
      We also took the opportunity here to improve the documentation of the
      previously converted hung_task_panic (see the patch series [0]) and put
      the alias table in alphabetical order.
      
      [0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz

      Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f117955a
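The alias lookup described in this commit can be pictured with a small userspace sketch. This is not the kernel code: the table layout, the dotted sysctl names on the right-hand side, and the `resolve_alias` helper are all assumptions made for illustration. Only the parameter names and the 0/1 restriction come from the commit message.

```python
# Hypothetical alias table: legacy boot parameter -> sysctl name.
# The dotted names are assumptions, not copied from the patch.
# Entries are kept in alphabetical order, as the commit message describes.
SYSCTL_ALIASES = {
    "hardlockup_all_cpu_backtrace": "kernel.hardlockup_all_cpu_backtrace",
    "hung_task_panic": "kernel.hung_task_panic",
    "softlockup_all_cpu_backtrace": "kernel.softlockup_all_cpu_backtrace",
    "softlockup_panic": "kernel.softlockup_panic",
}


def resolve_alias(param, value):
    """Map a legacy boot parameter to a (sysctl name, value) pair.

    Returns None when the parameter is not a known alias, and raises
    ValueError when the value is outside the 0/1 range the sysctl accepts.
    """
    target = SYSCTL_ALIASES.get(param)
    if target is None:
        return None
    if value not in ("0", "1"):
        raise ValueError(f"{param}={value}: only 0 or 1 is accepted")
    return target, value


print(resolve_alias("softlockup_panic", "1"))
```

Under these assumptions, a legacy `softlockup_panic=1` is treated like a write of `1` to the corresponding sysctl, while a value such as `2` is rejected rather than silently accepted.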
    • V
      kernel/hung_task convert hung_task_panic boot parameter to sysctl · b467f3ef
      Vlastimil Babka authored
      We can now handle sysctl parameters on kernel command line and have
      infrastructure to convert legacy command line options that duplicate
      sysctl to become a sysctl alias.
      
      This patch converts the hung_task_panic parameter.  Note that the sysctl
      handler is more strict and allows only 0 and 1, while the legacy
      parameter allowed any non-zero value.  But there is little reason anyone
      would not be using 1.
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200427180433.7029-4-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b467f3ef
    • V
      kernel/sysctl: support setting sysctl parameters from kernel command line · 3db978d4
      Vlastimil Babka authored
      Patch series "support setting sysctl parameters from kernel command line", v3.
      
      This series adds support for something that seems like many people
      always wanted but nobody added it yet, so here's the ability to set
      sysctl parameters via kernel command line options in the form of
      sysctl.vm.something=1
      
      The important part is Patch 1.  The second, not so important part is an
      attempt to clean up legacy one-off parameters that do the same thing as
      a sysctl.  I don't want to remove them completely for compatibility
      reasons, but with generic sysctl support the idea is to remove the
      one-off param handlers and treat the parameters as aliases for the
      sysctl variants.
      
      I have identified several parameters that mention sysctl counterparts in
      Documentation/admin-guide/kernel-parameters.txt but there might be more.
      The conversion also has varying level of success:
      
       - numa_zonelist_order is converted in Patch 2 together with adding the
         necessary infrastructure. It's easy as it doesn't really do anything
         but warn on deprecated value these days.
      
       - hung_task_panic is converted in Patch 3, but there's a downside that
         now it only accepts 0 and 1, while previously it was any integer
         value
      
       - nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic,
         so there's no straightforward conversion possible
      
       - traceoff_on_warning is a flag without value and it would be required
         to handle that somehow in the conversion infrastructure, which seems
         pointless for a single flag
      
      This patch (of 5):
      
      A recently proposed patch to add a vm_swappiness command line parameter
      in addition to the existing sysctl [1] made me wonder why we don't have
      general support for passing sysctl parameters via the command line.
      
      Googling found only somebody else wondering the same [2], but I haven't
      found any prior discussion with reasons why not to do this.
      
      Setting the vm_swappiness issue aside (the underlying issue might be
      solved in a different way), a quick search of kernel-parameters.txt shows
      there are already some that exist as both sysctl and kernel parameter -
      hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning.
      
      A general mechanism would remove the need to add more of those one-offs
      and might be handy in situations where configuration by e.g.
      /etc/sysctl.d/ is impractical.
      
      Hence, this patch adds a new parse_args() pass that looks for parameters
      prefixed by 'sysctl.' and tries to interpret them as writes to the
      corresponding sys/ files using a temporary in-kernel procfs mount.
      This mechanism was suggested by Eric W.  Biederman [3], as it handles
      all dynamically registered sysctl tables, even though we don't handle
      modular sysctls.  Errors due to e.g.  invalid parameter name or value
      are reported in the kernel log.
      
      The processing is hooked right before the init process is loaded, as
      some handlers might be more complicated than simple setters and might
      need some subsystems to be initialized.  Since at this point the init
      process can be started and eventually execute a process writing to
      /proc/sys/, it should also be fine to do that from the kernel.
      
      Sysctls registered later on module load time are not set by this
      mechanism - it's expected that in such scenarios, setting sysctl values
      from userspace is practical enough.
      
      [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
      [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
      [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/

      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
      Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3db978d4
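As a rough illustration of the `sysctl.` prefix handling described in this commit, the following userspace sketch turns command-line tokens into the `/proc/sys/` writes they stand for. It is only a model under stated assumptions: the function name `sysctl_writes` is hypothetical, tokens are split on whitespace without the quoting rules of the kernel's parse_args(), and nothing is actually written.

```python
SYSCTL_PREFIX = "sysctl."


def sysctl_writes(cmdline):
    """Return (path, value) pairs for every 'sysctl.<name>=<value>' token."""
    writes = []
    for token in cmdline.split():
        if not token.startswith(SYSCTL_PREFIX) or "=" not in token:
            continue  # not a sysctl assignment; left to other handlers
        name, value = token[len(SYSCTL_PREFIX):].split("=", 1)
        # Dots in the sysctl name separate directories under /proc/sys/.
        writes.append(("/proc/sys/" + name.replace(".", "/"), value))
    return writes


for path, value in sysctl_writes("ro quiet sysctl.vm.swappiness=10"):
    print(f"would write {value!r} to {path}")
```

The sketch makes the trade-off in the commit message visible: anything reachable as a file under `/proc/sys/` at that point in boot can be set this way, which is why sysctls registered later by modules are out of scope.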
    • R
      kernel: add panic_on_taint · db38d5c1
      Rafael Aquini authored
      Analogously to the introduction of panic_on_warn, this patch introduces
      a kernel option named panic_on_taint in order to provide a simple and
      generic way to stop execution and catch a coredump when the kernel gets
      tainted by any given flag.
      
      This is useful for debugging sessions as it avoids having to rebuild the
      kernel to explicitly add calls to panic() into the code sites that
      introduce the taint flags of interest.
      
      For instance, if one is interested in proceeding with a post-mortem
      analysis at the point a given code path is hitting a bad page (i.e.
      unaccount_page_cache_page(), or slab_bug()), a coredump can be collected
      by rebooting the kernel with 'panic_on_taint=0x20' appended to the
      command line.
      
      Another, perhaps less frequent, use for this option would be as a means
      for assuring a security policy case where only a subset of taints, or no
      single taint (in paranoid mode), is allowed for the running system.  The
      optional switch 'nousertaint' is handy in this particular scenario, as
      it will avoid userspace induced crashes by writes to sysctl interface
      /proc/sys/kernel/tainted causing false positive hits for such policies.
      
      [akpm@linux-foundation.org: tweak kernel-parameters.txt wording]
      Suggested-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Bunk <bunk@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Takashi Iwai <tiwai@suse.de>
      Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      db38d5c1
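The bitmask test behind panic_on_taint can be sketched as follows. This is a userspace model, not the kernel implementation: the helper names are hypothetical, and the comma-separated `<mask>,nousertaint` form is an assumption about how the optional switch is attached. The only values taken from the commit message are the `0x20` mask for the bad-page case and the `nousertaint` keyword.

```python
TAINT_BAD_PAGE_BIT = 5  # 1 << 5 == 0x20, the bad-page example above


def parse_panic_on_taint(arg):
    """Split '<hex mask>[,nousertaint]' into (mask, nousertaint)."""
    parts = arg.split(",")
    mask = int(parts[0], 16)  # accepts both '20' and '0x20'
    nousertaint = len(parts) > 1 and parts[1] == "nousertaint"
    return mask, nousertaint


def should_panic(mask, taint_bit):
    """True when the newly set taint bit is selected by the mask."""
    return bool(mask & (1 << taint_bit))


mask, nousertaint = parse_panic_on_taint("0x20,nousertaint")
print(should_panic(mask, TAINT_BAD_PAGE_BIT), nousertaint)
```

A mask with every bit set corresponds to the "paranoid mode" mentioned above, where no taint at all is tolerated.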
  2. 04 Jun, 2020 1 commit
    • M
      hugetlbfs: clean up command line processing · 282f4214
      Mike Kravetz authored
      With all hugetlb page processing done in a single file, clean up the code.
      
      - Make code match desired semantics
        - Update documentation with semantics
      - Make all warning and error messages start with 'HugeTLB:'.
      - Consistently name command line parsing routines.
      - Warn if !hugepages_supported() and command line parameters have
        been specified.
      - Add comments to code
        - Describe some of the subtle interactions
        - Describe semantics of command line arguments
      
      This patch also fixes issues with implicitly setting the number of
      gigantic huge pages to preallocate.  Previously on X86 command line,
      
              hugepages=2 default_hugepagesz=1G
      
      would result in zero 1G pages being preallocated and,
      
              # grep HugePages_Total /proc/meminfo
              HugePages_Total:       0
              # sysctl -a | grep nr_hugepages
              vm.nr_hugepages = 2
              vm.nr_hugepages_mempolicy = 2
              # cat /proc/sys/vm/nr_hugepages
              2
      
      After this patch 2 gigantic pages will be preallocated and all the proc,
      sysfs, sysctl and meminfo files will accurately reflect this.
      
      To address the issue with gigantic pages, a small change in behavior was
      made to command line processing.  Previously the command line,
      
              hugepages=128 default_hugepagesz=2M hugepagesz=2M hugepages=256
      
      would result in the allocation of 256 2M huge pages.  The value 128 would
      be ignored without any warning.  After this patch, 128 2M pages will be
      allocated and a warning message will be displayed indicating the value of
      256 is ignored.  This change in behavior is required because allocation of
      implicitly specified gigantic pages must be done when the
      default_hugepagesz= is encountered for gigantic pages.  Previously the
      code waited until later in the boot process (hugetlb_init), to allocate
      pages of default size.  However the bootmem allocator required for
      gigantic allocations is not available at this time.
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Sandipan Das <sandipan@linux.ibm.com>
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Longpeng <longpeng2@huawei.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200417185049.275845-5-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      282f4214
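The ordering rules described in this commit can be modelled with a toy parser. This is a sketch that covers only the two command lines quoted in the commit message; it is not the kernel's parsing code. The function name, the `arch_default_size` fallback, and the wording of the warning string are all hypothetical.

```python
def parse_hugetlb_cmdline(cmdline, arch_default_size="2M"):
    """Toy model of the post-patch ordering rules.

    Returns (counts, warnings), where counts maps a huge page size to the
    number of pages requested for it. arch_default_size is a hypothetical
    stand-in for the architecture's default huge page size.
    """
    counts, warnings = {}, []
    default_size = None   # set by default_hugepagesz=
    current_size = None   # set by the most recent hugepagesz=
    implicit = None       # hugepages= seen before any hugepagesz=

    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "default_hugepagesz":
            default_size = value
            if implicit is not None:
                # The implicit count is bound to the default size right here.
                counts[default_size] = implicit
        elif key == "hugepagesz":
            current_size = value
        elif key == "hugepages":
            if current_size is None:
                implicit = int(value)
                if default_size is not None:
                    counts[default_size] = implicit
            elif current_size in counts:
                # A count for this size already exists: warn and ignore.
                warnings.append(f"HugeTLB: hugepages={value} for {current_size} ignored")
            else:
                counts[current_size] = int(value)

    if implicit is not None and default_size is None:
        counts.setdefault(arch_default_size, implicit)
    return counts, warnings


print(parse_hugetlb_cmdline("hugepages=2 default_hugepagesz=1G"))
print(parse_hugetlb_cmdline(
    "hugepages=128 default_hugepagesz=2M hugepagesz=2M hugepages=256"))
```

In this model the first command line yields two 1G pages, and the second keeps the count of 128 while recording a warning for the ignored 256, matching the behavior the commit message describes.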
  3. 02 Jun, 2020 1 commit
  4. 22 May, 2020 1 commit
  5. 20 May, 2020 1 commit
  6. 16 May, 2020 1 commit
  7. 08 May, 2020 2 commits
    • P
      rcu: Allow rcutorture to starve grace-period kthread · 55b2dcf5
      Paul E. McKenney authored
      This commit provides an rcutorture.stall_gp_kthread module parameter
      to allow rcutorture to starve the grace-period kthread.  This allows
      testing the code that detects such starvation.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      55b2dcf5
    • P
      rcutorture: Add flag to produce non-busy-wait task stalls · 19a8ff95
      Paul E. McKenney authored
      This commit aids testing of RCU task stall warning messages by adding
      an rcutorture.stall_cpu_block module parameter that results in the
      induced stall sleeping within the RCU read-side critical section.
      Spinning with interrupts disabled is still available via the
      rcutorture.stall_cpu_irqsoff module parameter, and specifying neither
      of these two module parameters will spin with preemption disabled.
      
      Note that sleeping (as opposed to preemption) results in additional
      complaints from RCU at context-switch time, so yet more testing.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      19a8ff95
  8. 01 May, 2020 1 commit
  9. 29 Apr, 2020 3 commits
  10. 28 Apr, 2020 2 commits
  11. 27 Apr, 2020 1 commit
    • R
      x86/setup: Add an initrdmem= option to specify initrd physical address · 694cfd87
      Ronald G. Minnich authored
      Add the initrdmem option:
      
        initrdmem=ss[KMG],nn[KMG]
      
      which is used to specify the physical address of the initrd, almost
      always an address in FLASH. Also add code for x86 to use the existing
      phys_init_start and phys_init_size variables in the kernel.
      
      This is useful in cases where a kernel and an initrd are placed in FLASH,
      but there is no firmware file system structure in the FLASH.
      
      One such situation occurs when unused FLASH space on UEFI systems has
      been reclaimed by, e.g., taking it from the Management Engine. For
      example, on many systems, the ME is given half the FLASH part; not only
      is 2.75M of an 8M part unused; but 10.75M of a 16M part is unused. This
      space can be used to contain an initrd, but Linux needs to be told where
      it is.
      
      This space is "raw": due to, e.g., UEFI limitations, it cannot be added
      to UEFI firmware volumes without rebuilding UEFI from source or writing
      a UEFI device driver. It can be referenced only as a physical address
      and size.
      
      At the same time, if a kernel can be "netbooted" or loaded from GRUB or
      syslinux, the option of not using the physical address specification
      should be available.
      
      Then, it is easy to boot the kernel and provide an initrd; or boot the
      kernel and let it use the initrd in FLASH. In practice, this has
      proven to be very helpful when integrating Linux into FLASH on x86.
      
      Hence, the most flexible and convenient path is to enable the initrdmem
      command line option in a way that it is the last choice tried.
      
      For example, on the DigitalLoggers Atomic Pi, an image into FLASH can be
      burnt in with a built-in command line which includes:
      
        initrdmem=0xff968000,0x200000
      
      which specifies a location and size.
      
       [ bp: Massage commit message, make it passive. ]
      
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/CAP6exYLK11rhreX=6QPyDQmW7wPHsKNEFtXE47pjx41xS6O7-A@mail.gmail.com
      Link: https://lkml.kernel.org/r/20200426011021.1cskg0AGd%akpm@linux-foundation.org
      694cfd87
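The `initrdmem=ss[KMG],nn[KMG]` format from this commit can be decoded as in the sketch below. It is a userspace approximation: `parse_size` and `parse_initrdmem` are hypothetical names, and the binary K/M/G multipliers are an assumption modelled on how kernel size arguments are usually interpreted.

```python
def parse_size(text):
    """Parse a number with an optional K, M or G suffix (binary multiples)."""
    shifts = {"K": 10, "M": 20, "G": 30}
    shift = 0
    if text and text[-1].upper() in shifts:
        shift = shifts[text[-1].upper()]
        text = text[:-1]
    # Base 0 lets int() accept both decimal and 0x-prefixed hexadecimal.
    return int(text, 0) << shift


def parse_initrdmem(arg):
    """Split 'ss[KMG],nn[KMG]' into (physical start address, size in bytes)."""
    start, size = arg.split(",", 1)
    return parse_size(start), parse_size(size)


start, size = parse_initrdmem("0xff968000,0x200000")
print(hex(start), hex(size))
```

For the Atomic Pi example above, this decodes to a start address of 0xff968000 and a size of 0x200000 bytes (2 MiB).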
  12. 23 Apr, 2020 1 commit
  13. 20 Apr, 2020 1 commit
    • M
      x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation · 7e5b3c26
      Mark Gross authored
      SRBDS is an MDS-like speculative side channel that can leak bits from the
      random number generator (RNG) across cores and threads. New microcode
      serializes the processor access during the execution of RDRAND and
      RDSEED. This ensures that the shared buffer is overwritten before it is
      released for reuse.
      
      While it is present on all affected CPU models, the microcode mitigation
      is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
      cases where TSX is not supported or has been disabled with TSX_CTRL.
      
      The mitigation is activated by default on affected processors and it
      increases latency for RDRAND and RDSEED instructions. Among other
      effects this will reduce throughput from /dev/urandom.
      
      * Enable administrator to configure the mitigation off when desired using
        either mitigations=off or srbds=off.
      
      * Export vulnerability status via sysfs
      
      * Rename file-scoped macros to apply for non-whitelist table initializations.
      
       [ bp: Massage,
         - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
         - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
         - flip check in cpu_set_bug_bits() to save an indentation level,
         - reflow comments.
         jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
         tglx: Dropped the fused off magic for now
       ]
      Signed-off-by: Mark Gross <mgross@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
      7e5b3c26
  14. 14 Apr, 2020 1 commit
  15. 11 Apr, 2020 1 commit
    • R
      mm: hugetlb: optionally allocate gigantic hugepages using cma · cf11e85f
      Roman Gushchin authored
      Commit 944d9fec ("hugetlb: add support for gigantic page allocation
      at runtime") has added the run-time allocation of gigantic pages.
      
      However it actually works only at early stages of the system loading,
      when the majority of memory is free.  After some time the memory gets
      fragmented by non-movable pages, so the chances to find a contiguous 1GB
      block are getting close to zero.  Even dropping caches manually doesn't
      help a lot.
      
      At large scale rebooting servers in order to allocate gigantic hugepages
      is quite expensive and complex.  At the same time keeping some constant
      percentage of memory in reserved hugepages even if the workload isn't
      using it is a big waste: not all workloads can benefit from using 1 GB
      pages.
      
      The following solution can solve the problem:
      1) On boot time a dedicated cma area* is reserved. The size is passed
         as a kernel argument.
      2) Run-time allocations of gigantic hugepages are performed using the
         cma allocator and the dedicated cma area
      
      In this case gigantic hugepages can be allocated successfully with a
      high probability, however the memory isn't completely wasted if nobody
      is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
      etc.
      
      * On a multi-node machine a per-node cma area is allocated on each node.
        Subsequent gigantic hugetlb allocations use the first available
        numa node if the mask isn't specified by a user.
      
      Usage:
      1) configure the kernel to allocate a cma area for hugetlb allocations:
         pass hugetlb_cma=10G as a kernel argument
      
      2) allocate hugetlb pages as usual, e.g.
         echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
      
      If the option isn't enabled or the allocation of the cma area failed,
      the current behavior of the system is preserved.
      
      x86 and arm64 are covered by this patch; other architectures can be
      trivially added later.
      
      The patch contains clean-ups and fixes proposed and implemented by Aslan
      Bakirov and Randy Dunlap.  It also contains ideas and suggestions
      proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Cc: Aslan Bakirov <aslan@fb.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cf11e85f
  16. 08 Apr, 2020 3 commits
  17. 02 Apr, 2020 1 commit
    • C
      PM: sleep: Add pm_debug_messages kernel command line option · db96a759
      Chen Yu authored
      Debug messages from the system suspend/hibernation infrastructure
      are disabled by default, and can only be enabled after the system
      has booted up via /sys/power/pm_debug_messages.
      
      This makes the hibernation resume hard to track as it involves system
      boot up across hibernation.  There's no chance for software_resume()
      to track the resume process, for example.
      
      Add a kernel command line option to set pm_debug_messages during
      boot up.
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      [ rjw: Subject & changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      db96a759
  18. 01 Apr, 2020 1 commit
  19. 24 Mar, 2020 1 commit
  20. 14 Mar, 2020 1 commit
  21. 11 Mar, 2020 1 commit
  22. 06 Mar, 2020 1 commit
  23. 05 Mar, 2020 2 commits
  24. 03 Mar, 2020 2 commits
  25. 28 Feb, 2020 1 commit
  26. 25 Feb, 2020 1 commit
  27. 21 Feb, 2020 3 commits
    • P
      torture: Allow disabling of boottime CPU-hotplug torture operations · 8171d3e0
      Paul E. McKenney authored
      In theory, RCU-hotplug operations are supposed to work as soon as there
      is more than one CPU online.  However, in practice, in normal production
      there is no way to make them happen until userspace is up and running.
      Besides which, on smaller systems, rcutorture doesn't start doing hotplug
      operations until 30 seconds after the start of boot, which on most
      systems also means the better part of 30 seconds after the end of boot.
      This commit therefore provides a new torture.disable_onoff_at_boot kernel
      boot parameter that suppresses CPU-hotplug torture operations until
      about the time that init is spawned.
      
      Of course, if you know of a need for boottime CPU-hotplug operations,
      then you should avoid passing this argument to any of the torture tests.
      You might also want to look at the splats linked to below.
      
      Link: https://lore.kernel.org/lkml/20191206185208.GA25636@paulmck-ThinkPad-P72/
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      8171d3e0
    • P
      rcutorture: Allow boottime stall warnings to be suppressed · 58c53360
      Paul E. McKenney authored
      In normal production, an RCU CPU stall warning at boottime is often
      just as bad as at any other time.  In fact, given the desire for fast
      boot, any sort of long-term stall at boot is a bad idea.  However,
      heavy rcutorture testing on large hyperthreaded systems can generate
      boottime RCU CPU stalls as a matter of course.  This commit therefore
      provides a kernel boot parameter that suppresses reporting of boottime
      RCU CPU stall warnings and similarly of rcutorture writer stalls.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      58c53360
    • P
      rcu: React to callback overload by aggressively seeking quiescent states · b2b00ddf
      Paul E. McKenney authored
      In default configurations, RCU currently waits at least 100 milliseconds
      before asking cond_resched() and/or resched_rcu() for help seeking
      quiescent states to end a grace period.  But 100 milliseconds can be
      one good long time during an RCU callback flood, for example, as can
      happen when user processes repeatedly open and close files in a tight
      loop.  These 100-millisecond gaps in successive grace periods during a
      callback flood can result in excessive numbers of callbacks piling up,
      unnecessarily increasing memory footprint.
      
      This commit therefore asks cond_resched() and/or resched_rcu() for help
      as early as the first FQS scan when at least one of the CPUs has more
      than 20,000 callbacks queued, a number that can be changed using the new
      rcutree.qovld kernel boot parameter.  An auxiliary qovld_calc variable
      is used to avoid acquisition of locks that have not yet been initialized.
      Early tests indicate that this reduces the RCU-callback memory footprint
      during rcutorture floods by from 50% to 4x, depending on configuration.
      Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Reported-by: Tejun Heo <tj@kernel.org>
      [ paulmck: Fix bug located by Qian Cai. ]
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Tested-by: Dexuan Cui <decui@microsoft.com>
      Tested-by: Qian Cai <cai@lca.pw>
      b2b00ddf
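The overload condition in this commit reduces to a per-CPU threshold test, sketched below. This is only a model of the decision, not of RCU's data structures: `callback_overload` and its list-of-counts input are hypothetical, and the default of 20,000 is the figure quoted in the commit message.

```python
DEFAULT_QOVLD = 20_000  # default threshold named in the commit message


def callback_overload(callbacks_per_cpu, qovld=DEFAULT_QOVLD):
    """True when at least one CPU has more than qovld callbacks queued."""
    return any(count > qovld for count in callbacks_per_cpu)


# One CPU is flooded, so help would be requested at the first FQS scan.
print(callback_overload([120, 35_000, 4]))
```

When the test is false, the sketch corresponds to the pre-existing behavior of waiting before asking for help; raising or lowering `qovld` plays the role of the rcutree.qovld boot parameter.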