1. 12 Aug, 2020 1 commit
  2. 08 Aug, 2020 1 commit
  3. 29 Jul, 2020 1 commit
  4. 23 Jul, 2020 1 commit
    • debugfs: Add access restriction option · a24c6f7b
      Authored by Peter Enderborg
      Since debugfs includes sensitive information, it needs to be treated
      carefully. But it also has many very useful debug functions for userspace.
      With this option we can use the same configuration for systems that
      need debugfs and still have a way to turn it off. This gives extra protection
      against exposure on systems where user-space services with system
      access are attacked.
      
      It is controlled by a configurable default value that can be overridden
      with a kernel command line parameter (debugfs=).
      
      It can be on or off, but also internally on but not seen from user-space.
      This no-mount mode does not register debugfs as a filesystem, but clients can
      register their parts in the internal structures. This data can be read
      with a debugger or saved with a crashkernel. When it is off, clients
      get an EPERM error when accessing the functions for registering their
      components.
      Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
      Link: https://lore.kernel.org/r/20200716071511.26864-3-peter.enderborg@sony.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
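      The three modes described in this message can be modelled with a small
      Python sketch. It is only an illustration of the described behaviour, not
      the kernel implementation: the value spellings "on", "no-mount" and "off",
      the helper names, and the handling of unknown values are assumptions.

```python
import errno
from enum import Enum


class DebugfsMode(Enum):
    ON = "on"              # mounted and usable from user space
    NO_MOUNT = "no-mount"  # clients may register, but nothing is mountable
    OFF = "off"            # registration is refused


def parse_debugfs_param(cmdline, default=DebugfsMode.ON):
    """Return the mode selected by a debugfs= token, else the default."""
    mode = default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "debugfs" and sep:
            try:
                mode = DebugfsMode(value)
            except ValueError:
                pass  # assumption: unknown values leave the mode unchanged
    return mode


def can_mount(mode):
    """Only the fully enabled mode exposes a filesystem to user space."""
    return mode is DebugfsMode.ON


def register_client(mode, name, registry):
    """Hypothetical registration helper: fails with -EPERM when off."""
    if mode is DebugfsMode.OFF:
        return -errno.EPERM
    registry.append(name)
    return 0
```

      In this model a client registering under "no-mount" succeeds even though
      nothing can be mounted, which matches the "internally on" description.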
  5. 09 Jul, 2020 2 commits
    • xen: Mark "xen_nopvspin" parameter obsolete · 9a3c05e6
      Authored by Zhenzhong Duan
      Map "xen_nopvspin" to "nopvspin", fix stale description of "xen_nopvspin"
      as we use qspinlock now.
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm: Add "nopvspin" parameter to disable PV spinlocks · 05eee619
      Authored by Zhenzhong Duan
      There are cases where a guest tries to switch spinlocks to bare metal
      behavior (e.g. by setting "xen_nopvspin" on XEN platform and
      "hv_nopvspin" on HYPER_V).
      
      That feature is missing on KVM, so add a new parameter "nopvspin" to disable
      PV spinlocks for KVM guests.
      
      The new 'nopvspin' parameter will also replace Xen and Hyper-V specific
      parameters in future patches.
      
      Define the variable nopvspin as global because it will be used in future
      patches as above.
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krcmar <rkrcmar@redhat.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  6. 06 Jul, 2020 1 commit
  7. 02 Jul, 2020 1 commit
    • cpufreq: Specify default governor on command line · 8412b456
      Authored by Quentin Perret
      Currently, the only way to specify the default CPUfreq governor is
      via Kconfig options, which suits users who can build the kernel
      themselves perfectly.
      
      However, for those who use a distro-like kernel (such as Android,
      with the Generic Kernel Image project), the only way to use a
      non-default governor is to boot to userspace, and to then switch
      using the sysfs interface. Being able to specify the default governor
      on the command line, as is the case for cpuidle, would allow those
      users to specify their governor of choice earlier on, and to simplify
      the userspace boot procedure slightly.
      
      To support this use-case, add a kernel command line parameter
      allowing the default governor for CPUfreq to be specified, which
      takes precedence over the built-in default.
      
      This implementation has one notable limitation: the default governor
      must be registered before the driver. This is solved for builtin
      governors and drivers using appropriate *_initcall() functions. And
      in the modular case, this must be reflected as a constraint on the
      module loading order.
      Signed-off-by: Quentin Perret <qperret@google.com>
      [ Viresh: Converted 'default_governor' to a string and parsing it only
      	  at initcall level, and several updates to
      	  cpufreq_init_policy(). ]
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      [ rjw: Changelog ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
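      The precedence rule and the registration-order limitation can be sketched
      in Python as below. This is a model of the described behaviour only; the
      parameter spelling cpufreq.default_governor= and the fallback to the
      built-in default when the requested governor is not yet registered are
      assumptions, and the function name is hypothetical.

```python
def pick_default_governor(cmdline, registered, builtin_default):
    """Choose the governor a newly initialised policy would start with.

    cmdline         -- kernel command line as a single string
    registered      -- names of governors registered so far
    builtin_default -- governor selected at build time via Kconfig
    """
    requested = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Assumed parameter name; the message only mentions 'default_governor'.
        if key == "cpufreq.default_governor" and sep:
            requested = value
    # The command line wins, but only if that governor is already registered
    # when the driver initialises its policy.
    if requested is not None and requested in registered:
        return requested
    return builtin_default
```

      If the requested governor is a module that has not been loaded yet, this
      model returns the built-in default, which is the ordering constraint the
      message describes for the modular case.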
  8. 30 Jun, 2020 5 commits
    • torture: Dump ftrace at shutdown only if requested · 2102ad29
      Authored by Paul E. McKenney
      If there is a large number of torture tests running concurrently,
      all of which are dumping large ftrace buffers at shutdown time, the
      resulting dumping can take a very long time, particularly on systems
      with rotating-rust storage.  This commit therefore adds a default-off
      torture.ftrace_dump_at_shutdown module parameter that enables
      shutdown-time ftrace-buffer dumping.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Add races with task-exit processing · 4a5f133c
      Authored by Paul E. McKenney
      Several variants of Linux-kernel RCU interact with task-exit processing,
      including preemptible RCU, Tasks RCU, and Tasks Trace RCU.  This commit
      therefore adds testing of this interaction to rcutorture by adding
      rcutorture.read_exit_burst and rcutorture.read_exit_delay kernel-boot
      parameters.  These kernel parameters control the frequency and spacing
      of special read-then-exit kthreads that are spawned.
      
      [ paulmck: Apply feedback from Dan Carpenter's static checker. ]
      [ paulmck: Reduce latency to avoid false-positive shutdown hangs. ]
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • refperf: Rename refperf.c to refscale.c and change internal names · 1fbeb3a8
      Authored by Paul E. McKenney
      This commit further avoids conflation of refperf with the kernel's perf
      feature by renaming kernel/rcu/refperf.c to kernel/rcu/refscale.c,
      and also by similarly renaming the functions and variables inside
      this file.  This has the side effect of changing the names of the
      kernel boot parameters, so kernel-parameters.txt and ver_functions.sh
      are also updated.
      
      The rcutorture --torture type remains refperf, and this will be
      addressed in a separate commit.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • doc: Document rcuperf's module parameters · 847dd70a
      Authored by Paul E. McKenney
      This commit adds documentation for the rcuperf module parameters.
      
      Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcu/tree: cache specified number of objects · 53c72b59
      Authored by Uladzislau Rezki (Sony)
      In order to reduce the dynamic need for pages in kfree_rcu(),
      pre-allocate a configurable number of pages per CPU and link
      them in a list. When kfree_rcu() reclaims objects, the object's
      container page is cached into a list instead of being released
      to the low-level page allocator.
      
      Such an approach provides O(1) access to free pages while also
      reducing the number of requests to the page allocator. It also
      lets the kfree_rcu() code have free pages available during
      a low memory condition.
      
      A read-only sysfs parameter (rcu_min_cached_objs) reflects the
      minimum number of allowed cached pages per CPU.
      Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
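      The caching idea can be illustrated with a toy per-CPU page cache in
      Python. It is a sketch of the general technique only: the class name, the
      4096-byte page size, and the use of the configured number as a cap on
      cached pages are assumptions, not details taken from the patch.

```python
class PerCpuPageCache:
    """Toy model of a per-CPU list of spare pages with O(1) get/put."""

    def __init__(self, max_cached_pages):
        self.max_cached_pages = max_cached_pages  # assumed cap on the list
        self.pages = []                # spare pages held for reuse
        self.allocator_requests = 0    # calls that reached the "allocator"

    def get_page(self):
        # Reuse a cached page when one is available (constant time).
        if self.pages:
            return self.pages.pop()
        # Otherwise fall back to the (simulated) page allocator.
        self.allocator_requests += 1
        return bytearray(4096)

    def put_page(self, page):
        # Keep the page for later instead of releasing it, up to the cap.
        if len(self.pages) < self.max_cached_pages:
            self.pages.append(page)
            return True
        return False  # caller would hand the page back to the allocator
```

      Once a page has been returned with put_page(), the next get_page() is
      served from the list and the allocator request counter does not move.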
  9. 20 Jun, 2020 1 commit
  10. 18 Jun, 2020 2 commits
  11. 09 Jun, 2020 4 commits
    • kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases · f117955a
      Authored by Guilherme G. Piccoli
      After a recent change introduced by Vlastimil's series [0], the kernel is
      now able to handle sysctl parameters on the kernel command line; also, the
      series introduced a simple infrastructure to convert legacy boot
      parameters (that duplicate sysctls) into sysctl aliases.
      
      This patch converts the watchdog parameters softlockup_panic and
      {hard,soft}lockup_all_cpu_backtrace to use the new alias infrastructure.
      It fixes the documentation too, since the alias only accepts values 0 or
      1, not the full range of integers.
      
      We also took the opportunity here to improve the documentation of the
      previously converted hung_task_panic (see the patch series [0]) and put
      the alias table in alphabetical order.
      
      [0] http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
      
      Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Link: http://lkml.kernel.org/r/20200507214624.21911-1-gpiccoli@canonical.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
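      The alias idea can be sketched as a lookup table in Python. The legacy
      parameter names come from this message; the "kernel." sysctl paths on the
      right-hand side and the function name are assumptions made for
      illustration, and the real table lives in the kernel's C code.

```python
# Alphabetically ordered, as the message says the real table now is.
SYSCTL_ALIASES = {
    "hardlockup_all_cpu_backtrace": "kernel.hardlockup_all_cpu_backtrace",
    "hung_task_panic": "kernel.hung_task_panic",
    "softlockup_all_cpu_backtrace": "kernel.softlockup_all_cpu_backtrace",
    "softlockup_panic": "kernel.softlockup_panic",
}


def resolve_alias(param, value):
    """Map a legacy boot parameter to (sysctl name, int value).

    Returns None for parameters that are not aliases. Raises ValueError
    for anything other than "0" or "1", mirroring the stricter range
    the message documents for the aliases.
    """
    sysctl = SYSCTL_ALIASES.get(param)
    if sysctl is None:
        return None
    if value not in ("0", "1"):
        raise ValueError(f"{param} accepts only 0 or 1, got {value!r}")
    return sysctl, int(value)
```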
    • kernel/hung_task convert hung_task_panic boot parameter to sysctl · b467f3ef
      Authored by Vlastimil Babka
      We can now handle sysctl parameters on kernel command line and have
      infrastructure to convert legacy command line options that duplicate
      sysctl to become a sysctl alias.
      
      This patch converts the hung_task_panic parameter.  Note that the sysctl
      handler is more strict and allows only 0 and 1, while the legacy
      parameter allowed any non-zero value.  But there is little reason anyone
      would not be using 1.
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200427180433.7029-4-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/sysctl: support setting sysctl parameters from kernel command line · 3db978d4
      Authored by Vlastimil Babka
      Patch series "support setting sysctl parameters from kernel command line", v3.
      
      This series adds support for something that seems like many people
      always wanted but nobody added it yet, so here's the ability to set
      sysctl parameters via kernel command line options in the form of
      sysctl.vm.something=1
      
      The important part is Patch 1.  The second, not so important part is an
      attempt to clean up legacy one-off parameters that do the same thing as
      a sysctl.  I don't want to remove them completely for compatibility
      reasons, but with generic sysctl support the idea is to remove the
      one-off param handlers and treat the parameters as aliases for the
      sysctl variants.
      
      I have identified several parameters that mention sysctl counterparts in
      Documentation/admin-guide/kernel-parameters.txt but there might be more.
      The conversion also has varying level of success:
      
       - numa_zonelist_order is converted in Patch 2 together with adding the
         necessary infrastructure. It's easy as it doesn't really do anything
         but warn on deprecated value these days.
      
       - hung_task_panic is converted in Patch 3, but there's a downside that
         now it only accepts 0 and 1, while previously it was any integer
         value
      
       - nmi_watchdog maps to two sysctls nmi_watchdog and hardlockup_panic,
         so there's no straightforward conversion possible
      
       - traceoff_on_warning is a flag without value and it would be required
         to handle that somehow in the conversion infrastructure, which seems
         pointless for a single flag
      
      This patch (of 5):
      
      A recently proposed patch to add vm_swappiness command line parameter in
      addition to existing sysctl [1] made me wonder why we don't have a
      general support for passing sysctl parameters via command line.
      
      Googling found only somebody else wondering the same [2], but I haven't
      found any prior discussion with reasons why not to do this.
      
      Setting the vm_swappiness issue aside (the underlying issue might be
      solved in a different way), quick search of kernel-parameters.txt shows
      there are already some that exist as both sysctl and kernel parameter -
      hung_task_panic, nmi_watchdog, numa_zonelist_order, traceoff_on_warning.
      
      A general mechanism would remove the need to add more of those one-offs
      and might be handy in situations where configuration by e.g.
      /etc/sysctl.d/ is impractical.
      
      Hence, this patch adds a new parse_args() pass that looks for parameters
      prefixed by 'sysctl.' and tries to interpret them as writes to the
      corresponding sys/ files using a temporary in-kernel procfs mount.
      This mechanism was suggested by Eric W.  Biederman [3], as it handles
      all dynamically registered sysctl tables, even though we don't handle
      modular sysctls.  Errors due to e.g.  invalid parameter name or value
      are reported in the kernel log.
      
      The processing is hooked right before the init process is loaded, as
      some handlers might be more complicated than simple setters and might
      need some subsystems to be initialized.  If the init process can be
      started at that point and eventually execute a process writing to
      /proc/sys/, then it should also be fine to do that from the kernel.
      
      Sysctls registered later on module load time are not set by this
      mechanism - it's expected that in such scenarios, setting sysctl values
      from userspace is practical enough.
      
      [1] https://lore.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com/
      [2] https://unix.stackexchange.com/questions/558802/how-to-set-sysctl-using-kernel-command-line-parameter
      [3] https://lore.kernel.org/r/87bloj2skm.fsf@x220.int.ebiederm.org/
      
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: "Guilherme G . Piccoli" <gpiccoli@canonical.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Link: http://lkml.kernel.org/r/20200427180433.7029-1-vbabka@suse.cz
      Link: http://lkml.kernel.org/r/20200427180433.7029-2-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
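      The translation from a 'sysctl.'-prefixed parameter to a write can be
      sketched in Python as follows. This is a user-space illustration under
      assumptions: the dot-to-slash path mapping and the function name are not
      taken from the patch, and the kernel performs the write itself through
      its temporary procfs mount rather than returning a path.

```python
def sysctl_param_to_write(token, prefix="sysctl."):
    """Turn 'sysctl.vm.something=1' into ('/proc/sys/vm/something', '1').

    Returns None when the token is not a 'sysctl.'-prefixed assignment,
    so ordinary boot parameters are left for other handlers.
    """
    name, sep, value = token.partition("=")
    if not sep or not name.startswith(prefix):
        return None
    relative = name[len(prefix):].replace(".", "/")  # assumed mapping
    if not relative:
        return None
    return "/proc/sys/" + relative, value


def collect_sysctl_writes(cmdline):
    """Gather every (path, value) pair requested on a command line."""
    writes = []
    for token in cmdline.split():
        write = sysctl_param_to_write(token)
        if write is not None:
            writes.append(write)
    return writes
```

      For the form quoted in the message, sysctl.vm.something=1, this yields a
      single pending write of "1" to /proc/sys/vm/something.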
    • kernel: add panic_on_taint · db38d5c1
      Authored by Rafael Aquini
      Analogously to the introduction of panic_on_warn, this patch introduces
      a kernel option named panic_on_taint in order to provide a simple and
      generic way to stop execution and catch a coredump when the kernel gets
      tainted by any given flag.
      
      This is useful for debugging sessions as it avoids having to rebuild the
      kernel to explicitly add calls to panic() into the code sites that
      introduce the taint flags of interest.
      
      For instance, if one is interested in proceeding with a post-mortem
      analysis at the point a given code path is hitting a bad page (i.e.
      unaccount_page_cache_page(), or slab_bug()), a coredump can be collected
      by rebooting the kernel with 'panic_on_taint=0x20' appended to the
      command line.
      
      Another, perhaps less frequent, use for this option would be as a means
      for assuring a security policy case where only a subset of taints, or no
      single taint (in paranoid mode), is allowed for the running system.  The
      optional switch 'nousertaint' is handy in this particular scenario, as
      it will avoid userspace induced crashes by writes to sysctl interface
      /proc/sys/kernel/tainted causing false positive hits for such policies.
      
      [akpm@linux-foundation.org: tweak kernel-parameters.txt wording]
      Suggested-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Adrian Bunk <bunk@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Takashi Iwai <tiwai@suse.de>
      Link: http://lkml.kernel.org/r/20200515175502.146720-1-aquini@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
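      The bitmask check behind this option can be sketched in Python. The
      sketch assumes the value is a hexadecimal mask optionally followed by
      ",nousertaint" (the message names the switch but does not show the exact
      syntax), and the function names are hypothetical.

```python
def parse_panic_on_taint(arg):
    """Split a panic_on_taint value into (mask, nousertaint flag).

    Assumed format: '<hex mask>[,nousertaint]', for example '0x20'.
    """
    mask_text, _, option = arg.partition(",")
    mask = int(mask_text, 16)  # accepts an optional 0x prefix
    return mask, option == "nousertaint"


def should_panic(mask, new_taint_bits):
    """Panic when any newly set taint bit is in the configured mask."""
    return bool(mask & new_taint_bits)
```

      With the mask 0x20 from the example above, a taint event that sets only
      bit 0x01 is ignored, while one that sets 0x20 triggers the panic path in
      this model.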
  12. 04 Jun, 2020 1 commit
    • hugetlbfs: clean up command line processing · 282f4214
      Authored by Mike Kravetz
      With all hugetlb page processing done in a single file, clean up the code.
      
      - Make code match desired semantics
        - Update documentation with semantics
      - Make all warning and error messages start with 'HugeTLB:'.
      - Consistently name command line parsing routines.
      - Warn if !hugepages_supported() and command line parameters have
        been specified.
      - Add comments to code
        - Describe some of the subtle interactions
        - Describe semantics of command line arguments
      
      This patch also fixes issues with implicitly setting the number of
      gigantic huge pages to preallocate.  Previously on X86 command line,
      
              hugepages=2 default_hugepagesz=1G
      
      would result in zero 1G pages being preallocated and,
      
              # grep HugePages_Total /proc/meminfo
              HugePages_Total:       0
              # sysctl -a | grep nr_hugepages
              vm.nr_hugepages = 2
              vm.nr_hugepages_mempolicy = 2
              # cat /proc/sys/vm/nr_hugepages
              2
      
      After this patch 2 gigantic pages will be preallocated and all the proc,
      sysfs, sysctl and meminfo files will accurately reflect this.
      
      To address the issue with gigantic pages, a small change in behavior was
      made to command line processing.  Previously the command line,
      
              hugepages=128 default_hugepagesz=2M hugepagesz=2M hugepages=256
      
      would result in the allocation of 256 2M huge pages.  The value 128 would
      be ignored without any warning.  After this patch, 128 2M pages will be
      allocated and a warning message will be displayed indicating the value of
      256 is ignored.  This change in behavior is required because allocation of
      implicitly specified gigantic pages must be done when the
      default_hugepagesz= is encountered for gigantic pages.  Previously the
      code waited until later in the boot process (hugetlb_init), to allocate
      pages of default size.  However the bootmem allocator required for
      gigantic allocations is not available at this time.
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Sandipan Das <sandipan@linux.ibm.com>
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>	[s390]
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Longpeng <longpeng2@huawei.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Nitesh Narayan Lal <nitesh@redhat.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Link: http://lkml.kernel.org/r/20200417185049.275845-5-mike.kravetz@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
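      The two command lines discussed in this message can be reproduced with a
      simplified Python model. It is deliberately not how the kernel parses the
      options (the kernel handles them in order, while this sketch looks up
      default_hugepagesz= first); the function name, the "2M" architecture
      default, and the warning wording after the 'HugeTLB:' prefix are
      assumptions.

```python
def parse_hugepages(cmdline, arch_default_size="2M"):
    """Return ({size: count}, [warnings]) for a hugetlb command line."""
    tokens = [t.partition("=") for t in cmdline.split()]

    # Pass 1: find the default huge page size, if one was given.
    default_size = arch_default_size
    for key, _, value in tokens:
        if key == "default_hugepagesz":
            default_size = value

    # Pass 2: assign each hugepages= count to a size; the first count
    # given for a size wins and later ones are ignored with a warning.
    counts, warnings = {}, []
    current_size = None  # size named by the latest hugepagesz=
    for key, _, value in tokens:
        if key == "hugepagesz":
            current_size = value
        elif key == "hugepages":
            size = current_size or default_size
            if size in counts:
                warnings.append(
                    f"HugeTLB: hugepages={value} for size {size} ignored")
            else:
                counts[size] = int(value)
    return counts, warnings
```

      In this model "hugepages=2 default_hugepagesz=1G" yields two 1G pages,
      and the second example keeps 128 pages of 2M while warning about 256.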
  13. 02 Jun, 2020 1 commit
  14. 22 May, 2020 1 commit
  15. 20 May, 2020 1 commit
  16. 16 May, 2020 1 commit
  17. 08 May, 2020 2 commits
    • rcu: Allow rcutorture to starve grace-period kthread · 55b2dcf5
      Authored by Paul E. McKenney
      This commit provides an rcutorture.stall_gp_kthread module parameter
      to allow rcutorture to starve the grace-period kthread.  This allows
      testing the code that detects such starvation.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    • rcutorture: Add flag to produce non-busy-wait task stalls · 19a8ff95
      Authored by Paul E. McKenney
      This commit aids testing of RCU task stall warning messages by adding
      an rcutorture.stall_cpu_block module parameter that results in the
      induced stall sleeping within the RCU read-side critical section.
      Spinning with interrupts disabled is still available via the
      rcutorture.stall_cpu_irqsoff module parameter, and specifying neither
      of these two module parameters will spin with preemption disabled.
      
      Note that sleeping (as opposed to preemption) results in additional
      complaints from RCU at context-switch time, so yet more testing.
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
  18. 01 May, 2020 1 commit
  19. 29 Apr, 2020 3 commits
  20. 28 Apr, 2020 2 commits
  21. 27 Apr, 2020 1 commit
    • x86/setup: Add an initrdmem= option to specify initrd physical address · 694cfd87
      Authored by Ronald G. Minnich
      Add the initrdmem option:
      
        initrdmem=ss[KMG],nn[KMG]
      
      which is used to specify the physical address of the initrd, almost
      always an address in FLASH. Also add code for x86 to use the existing
      phys_init_start and phys_init_size variables in the kernel.
      
      This is useful in cases where a kernel and an initrd are placed in FLASH,
      but there is no firmware file system structure in the FLASH.
      
      One such situation occurs when unused FLASH space on UEFI systems has
      been reclaimed by, e.g., taking it from the Management Engine. For
      example, on many systems, the ME is given half the FLASH part; not only
      is 2.75M of an 8M part unused, but 10.75M of a 16M part is unused. This
      space can be used to contain an initrd, but Linux needs to be told where
      it is.
      
      This space is "raw": due to, e.g., UEFI limitations, it cannot be added
      to UEFI firmware volumes without rebuilding UEFI from source or writing
      a UEFI device driver. It can be referenced only as a physical address
      and size.
      
      At the same time, if a kernel can be "netbooted" or loaded from GRUB or
      syslinux, the option of not using the physical address specification
      should be available.
      
      Then, it is easy to boot the kernel and provide an initrd; or boot the
      kernel and let it use the initrd in FLASH. In practice, this has
      proven to be very helpful when integrating Linux into FLASH on x86.
      
      Hence, the most flexible and convenient path is to enable the initrdmem
      command line option in a way that it is the last choice tried.
      
      For example, on the DigitalLoggers Atomic Pi, an image can be burnt
      into FLASH with a built-in command line which includes:
      
        initrdmem=0xff968000,0x200000
      
      which specifies a location and size.
      
       [ bp: Massage commit message, make it passive. ]
      
      [akpm@linux-foundation.org: coding style fixes]
      Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
      Link: http://lkml.kernel.org/r/CAP6exYLK11rhreX=6QPyDQmW7wPHsKNEFtXE47pjx41xS6O7-A@mail.gmail.com
      Link: https://lkml.kernel.org/r/20200426011021.1cskg0AGd%akpm@linux-foundation.org
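      Parsing the ss[KMG],nn[KMG] format can be sketched in Python as below.
      This is an illustration only: both function names are hypothetical, and
      the suffix handling (powers of 1024, case-insensitive) and automatic
      base detection are assumptions about how such sizes are usually read.

```python
_SUFFIX_SCALE = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse a number with an optional K/M/G suffix, e.g. '2M' or '0x200000'."""
    text = text.strip()
    scale = 1
    if text and text[-1].upper() in _SUFFIX_SCALE:
        scale = _SUFFIX_SCALE[text[-1].upper()]
        text = text[:-1]
    # Base 0 lets int() accept both decimal and 0x-prefixed hex input.
    return int(text, 0) * scale


def parse_initrdmem(arg):
    """Split 'start,size' into (physical start address, size in bytes)."""
    start_text, sep, size_text = arg.partition(",")
    if not sep:
        raise ValueError("expected initrdmem=<start>,<size>")
    return parse_size(start_text), parse_size(size_text)
```

      Applied to the Atomic Pi example, the value 0xff968000,0x200000 parses to
      a start address of 0xff968000 and a size of 0x200000 bytes (2 MiB).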
  22. 23 Apr, 2020 1 commit
  23. 20 Apr, 2020 1 commit
    • x86/speculation: Add Special Register Buffer Data Sampling (SRBDS) mitigation · 7e5b3c26
      Authored by Mark Gross
      SRBDS is an MDS-like speculative side channel that can leak bits from the
      random number generator (RNG) across cores and threads. New microcode
      serializes the processor access during the execution of RDRAND and
      RDSEED. This ensures that the shared buffer is overwritten before it is
      released for reuse.
      
      While it is present on all affected CPU models, the microcode mitigation
      is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
      cases where TSX is not supported or has been disabled with TSX_CTRL.
      
      The mitigation is activated by default on affected processors and it
      increases latency for RDRAND and RDSEED instructions. Among other
      effects this will reduce throughput from /dev/urandom.
      
      * Enable the administrator to turn the mitigation off when desired using
        either mitigations=off or srbds=off.
      
      * Export vulnerability status via sysfs
      
      * Rename file-scoped macros to apply for non-whitelist table initializations.
      
       [ bp: Massage,
         - s/VULNBL_INTEL_STEPPING/VULNBL_INTEL_STEPPINGS/g,
         - do not read arch cap MSR a second time in tsx_fused_off() - just pass it in,
         - flip check in cpu_set_bug_bits() to save an indentation level,
         - reflow comments.
         jpoimboe: s/Mitigated/Mitigation/ in user-visible strings
         tglx: Dropped the fused off magic for now
       ]
      Signed-off-by: Mark Gross <mgross@linux.intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Tony Luck <tony.luck@intel.com>
      Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
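      The selection logic described in this message can be summarised as a
      small decision function in Python. This is one reading of the text, not
      the kernel code: the state names, the function signature, and the order
      of the checks are assumptions, and the sysfs strings the kernel reports
      are not reproduced here.

```python
from enum import Enum


class SrbdsState(Enum):
    NOT_AFFECTED = "not affected"
    NOT_NEEDED = "mitigation not needed"
    OFF = "mitigation disabled by administrator"
    ON = "mitigation active"


def srbds_state(affected, mds_no, tsx_available, cmdline):
    """Pick a state from CPU properties and the kernel command line."""
    if not affected:
        return SrbdsState.NOT_AFFECTED
    # MDS_NO parts only need the microcode mitigation while TSX is usable.
    if mds_no and not tsx_available:
        return SrbdsState.NOT_NEEDED
    tokens = cmdline.split()
    if "mitigations=off" in tokens or "srbds=off" in tokens:
        return SrbdsState.OFF
    # Default on affected processors.
    return SrbdsState.ON
```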
  24. 14 Apr, 2020 1 commit
  25. 11 Apr, 2020 1 commit
    • mm: hugetlb: optionally allocate gigantic hugepages using cma · cf11e85f
      Authored by Roman Gushchin
      Commit 944d9fec ("hugetlb: add support for gigantic page allocation
      at runtime") has added the run-time allocation of gigantic pages.
      
      However it actually works only at early stages of the system loading,
      when the majority of memory is free.  After some time the memory gets
      fragmented by non-movable pages, so the chances to find a contiguous 1GB
      block are getting close to zero.  Even dropping caches manually doesn't
      help a lot.
      
      At large scale rebooting servers in order to allocate gigantic hugepages
      is quite expensive and complex.  At the same time keeping some constant
      percentage of memory in reserved hugepages even if the workload isn't
      using it is a big waste: not all workloads can benefit from using 1 GB
      pages.
      
      The following solution can solve the problem:
      1) At boot time a dedicated cma area* is reserved. The size is passed
         as a kernel argument.
      2) Run-time allocations of gigantic hugepages are performed using the
         cma allocator and the dedicated cma area
      
      In this case gigantic hugepages can be allocated successfully with a
      high probability, however the memory isn't completely wasted if nobody
      is using 1GB hugepages: it can be used for pagecache, anon memory, THPs,
      etc.
      
      * On a multi-node machine a per-node cma area is allocated on each node.
        Subsequent gigantic hugetlb allocations use the first available
        numa node if the mask isn't specified by a user.
      
      Usage:
      1) configure the kernel to allocate a cma area for hugetlb allocations:
         pass hugetlb_cma=10G as a kernel argument
      
      2) allocate hugetlb pages as usual, e.g.
         echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
      
      If the option isn't enabled or the allocation of the cma area failed,
      the current behavior of the system is preserved.
      
      x86 and arm64 are covered by this patch; other architectures can be
      trivially added later.
      
      The patch contains clean-ups and fixes proposed and implemented by Aslan
      Bakirov and Randy Dunlap.  It also contains ideas and suggestions
      proposed by Rik van Riel, Michal Hocko and Mike Kravetz.  Thanks!
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de>
      Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Cc: Aslan Bakirov <aslan@fb.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Joonsoo Kim <js1304@gmail.com>
      Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  26. 08 Apr, 2020 2 commits