提交 · 2f5ab6225fa2bd1435a65e9a2c5503e5b5cf4b45 · openeuler / Kernel

19 2月, 2020 7 次提交

docs: sysctl/kernel: remove rtsig entries · 8f21f54b

由 Stephen Kitt 提交于 2月 18, 2020

These have no corresponding code in the kernel.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

8f21f54b

docs: document panic fully in sysctl/kernel.rst · 404347e6

由 Stephen Kitt 提交于 2月 18, 2020

The description of panic doesn’t cover all the supported scenarios;
this patch fixes that, describing the three possibilities (no reboot,
immediate reboot, reboot after a delay).

Based on the implementation in kernel/panic.c.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

404347e6

docs: document stop-a in sysctl/kernel.rst · a1ad4f15

由 Stephen Kitt 提交于 2月 18, 2020

This describes the SPARC-specific stop-a sysctl entry, which was
previously listed in kernel.rst but not documented.

Base on the implementation in arch/sparc/kernel/setup_{32,64}.c and
kernel/panic.c.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

a1ad4f15

docs: add missing IPC documentation in sysctl/kernel.rst · fa5b5264

由 Stephen Kitt 提交于 2月 18, 2020

This adds short descriptions of msgmax, msgmnb, msgmni, and shmmni,
which were previously listed in kernel.rst but not described.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

fa5b5264

docs: drop l2cr from sysctl/kernel.rst · a474105b

由 Stephen Kitt 提交于 2月 18, 2020

The l2cr sysctl entry was removed in commit c2f3dabe ("sysctl:
kill binary sysctl KERN_PPC_L2CR"), this removes the corresponding
documentation.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

a474105b

docs: merge debugging-modules.txt into sysctl/kernel.rst · 0317c537

由 Stephen Kitt 提交于 2月 18, 2020

This fits nicely in sysctl/kernel.rst, merge it (and rephrase it)
instead of linking to it.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

0317c537

docs: pretty up sysctl/kernel.rst · a3cb66a5

由 Stephen Kitt 提交于 2月 18, 2020

This updates sysctl/kernel.rst to use ReStructured Text more fully:
* the list of files is now the table of contents (old entries with no
  corresponding sections are added as empty sections for now);
* code references and commands are formatted as code, except for
  function names which end up linked to the appropriate documentation;
* links are used to point to other documentation and other sections;
* tables are used to make lists of values more readable (as already
  done for some sections);
* in heavily-reworked paragraphs, sentences are wrapped individually,
  to make future diffs easier to read.

The first mention of the kernel version is dropped. The second
mention, saying that the document is accurate for 2.2, is preserved
for now; I will update that once the document really is accurate for a
current kernel release.
Signed-off-by: NStephen Kitt <steve@sk2.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

a3cb66a5

14 2月, 2020 1 次提交

docs: admin-guide: Add description of %c corename format · 895f2c20

由 d.hatayama@fujitsu.com 提交于 2月 13, 2020

There is somehow no description of %c corename format specifier for
/proc/sys/kernel/core_pattern. The %c corename format specifier is
used by user-space application such as systemd-coredump, so it should
be documented.

To find where %c is handled in the kernel source code, look at
function format_corename() in fs/coredump.c.
Signed-off-by: NHATAYAMA Daisuke <d.hatayama@fujitsu.com>
Link: https://lore.kernel.org/r/TYAPR01MB4014714BB2ACE425BB6EC6B7951A0@TYAPR01MB4014.jpnprd01.prod.outlook.comSigned-off-by: NJonathan Corbet <corbet@lwn.net>

895f2c20

08 11月, 2019 2 次提交

docs: admin-guide: Remove threads-max auto-tuning · e80d8938

由 Masanari Iida 提交于 11月 01, 2019

Since following path was merged in 5.4-rc3,
auto-tuning feature in threads-max does not exist any more.
Fix the admin-guide document as is.

kernel/sysctl.c: do not override max_threads provided by userspace
b0f53dbc

Fixes: b0f53dbc ("kernel/sysctl.c: do not override max_threads provided by userspace")
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

e80d8938

docs: admin-guide: Fix min value of threads-max in kernel.rst · 73eb802a

由 Masanari Iida 提交于 11月 01, 2019

Since following patch was merged 5.4-rc3, minimum value for
threads-max changed to 1.

kernel/sysctl.c: do not override max_threads provided by userspace
b0f53dbc

Fixes: b0f53dbc ("kernel/sysctl.c: do not override max_threads provided by userspace")
Signed-off-by: NMasanari Iida <standby24x7@gmail.com>
Acked-by: NMichal Hocko <mhocko@suse.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

73eb802a

11 10月, 2019 1 次提交

docs: admin-guide: fix printk_ratelimit explanation · ca30ad85

由 Oleksandr Natalenko 提交于 10月 02, 2019

The printk_ratelimit value accepts seconds, not jiffies (though it is
converted into jiffies internally). Update documentation to reflect
this.

Also, remove the statement about allowing 1 message in 5 seconds since
bursts up to 10 messages are allowed by default.

Finally, while we are here, mention default value for
printk_ratelimit_burst too.
Signed-off-by: NOleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

ca30ad85

15 7月, 2019 3 次提交

docs: admin-guide: add a series of orphaned documents · 4f4cfa6c

由 Mauro Carvalho Chehab 提交于 6月 27, 2019

There are lots of documents that belong to the admin-guide but
are on random places (most under Documentation root dir).

Move them to the admin guide.
Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: NAlexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: NBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>

4f4cfa6c

docs: admin-guide: move sysctl directory to it · 57043247

由 Mauro Carvalho Chehab 提交于 4月 22, 2019

The stuff under sysctl describes /sys interface from userspace
point of view. So, add it to the admin-guide and remove the
:orphan: from its index file.
Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>

57043247

docs: sysctl: convert to ReST · 53b95375

由 Mauro Carvalho Chehab 提交于 4月 18, 2019

Rename the /proc/sys/ documentation files to ReST, using the
README file as a template for an index.rst, adding the other
files there via TOC markup.

Despite being written on different times with different
styles, try to make them somewhat coherent with a similar
look and feel, ensuring that they'll look nice as both
raw text file and as via the html output produced by the
Sphinx build system.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>

53b95375

19 6月, 2019 1 次提交

s390/sclp: remove call home support · 191fa92b

由 Heiko Carstens 提交于 6月 17, 2019

This feature has never been used, so remove it.
Acked-by: NVasily Gorbik <gor@linux.ibm.com>
Acked-by: NHendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NVasily Gorbik <gor@linux.ibm.com>

191fa92b

09 6月, 2019 1 次提交

docs: fix broken documentation links · cb1aaebe

由 Mauro Carvalho Chehab 提交于 6月 07, 2019

Mostly due to x86 and acpi conversion, several documentation
links are still pointing to the old file. Fix them.
Signed-off-by: NMauro Carvalho Chehab <mchehab+samsung@kernel.org>
Reviewed-by: NWolfram Sang <wsa@the-dreams.de>
Reviewed-by: NSven Van Asbroeck <TheSven73@gmail.com>
Reviewed-by: NBhupesh Sharma <bhsharma@redhat.com>
Acked-by: NMark Brown <broonie@kernel.org>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

cb1aaebe

25 3月, 2019 1 次提交

Documentation: fix core_pattern max length · cc809ed8

由 Jakub Wilk 提交于 3月 18, 2019

The buffer size for core_pattern is 128, but one character is used for
terminating null byte, so the actual limit is 127:

    # printf '%0999d' > /proc/sys/kernel/core_pattern
    # tr -d '\n' < /proc/sys/kernel/core_pattern | wc -c
    127
Signed-off-by: NJakub Wilk <jwilk@jwilk.net>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

cc809ed8

27 1月, 2019 1 次提交

sched/topology: Introduce a sysctl for Energy Aware Scheduling · 8d5d0cfb

由 Quentin Perret 提交于 12月 03, 2018

In its current state, Energy Aware Scheduling (EAS) starts automatically
on asymmetric platforms having an Energy Model (EM). However, there are
users who want to have an EM (for thermal management for example), but
don't want EAS with it.

In order to let users disable EAS explicitly, introduce a new sysctl
called 'sched_energy_aware'. It is enabled by default so that EAS can
start automatically on platforms where it makes sense. Flipping it to 0
rebuilds the scheduling domains and disables EAS.
Signed-off-by: NQuentin Perret <quentin.perret@arm.com>
Signed-off-by: NPeter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: adharmap@codeaurora.org
Cc: chris.redpath@arm.com
Cc: currojerez@riseup.net
Cc: dietmar.eggemann@arm.com
Cc: edubezval@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: javi.merino@kernel.org
Cc: joel@joelfernandes.org
Cc: juri.lelli@redhat.com
Cc: morten.rasmussen@arm.com
Cc: patrick.bellasi@arm.com
Cc: pkondeti@codeaurora.org
Cc: rjw@rjwysocki.net
Cc: skannan@codeaurora.org
Cc: smuckle@google.com
Cc: srinivas.pandruvada@linux.intel.com
Cc: thara.gopinath@linaro.org
Cc: tkjos@google.com
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Cc: viresh.kumar@linaro.org
Link: https://lkml.kernel.org/r/20181203095628.11858-11-quentin.perret@arm.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

8d5d0cfb

09 1月, 2019 1 次提交

docs: Revamp tainted-kernels.rst to make it more comprehensible · 896dd323

由 Thorsten Leemhuis 提交于 1月 08, 2019

Add a section about decoding /proc/sys/kernel/tainted, create a more
understandable intro and a hopefully explain better the tainted flags in
bugs, oops or panics messages. Only thing missing then is a table that
quickly describes the various bits and taint flags before going into more
detail, so add that as well.

That table is partly based on a section from Documentation/sysctl/kernel.txt,
but a bit more compact. To avoid confusion I added the shortened version to
kernel.txt; the same table is used in three different places now:
./tools/debugging/kernel-chktaint,
Documentation/admin-guide/tainted-kernels.rst and
Documentation/sysctl/kernel.txt

During review of v1 (see above) a number of existing issues with the text
were raised, like outdated usages as well as incomplete or missing
descriptions.  Address most of those as well.
Signed-off-by: NThorsten Leemhuis <linux@leemhuis.info>
[jc: tightened up changelog]
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

896dd323

05 1月, 2019 1 次提交

kernel/sysctl: add panic_print into sysctl · 81c9d43f

由 Feng Tang 提交于 1月 03, 2019

So that we can also runtime chose to print out the needed system info
for panic, other than setting the kernel cmdline.

Link: http://lkml.kernel.org/r/1543398842-19295-3-git-send-email-feng.tang@intel.comSigned-off-by: NFeng Tang <feng.tang@intel.com>
Suggested-by: NSteven Rostedt <rostedt@goodmis.org>
Acked-by: NSteven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

81c9d43f

05 9月, 2018 1 次提交

stackleak: Allow runtime disabling of kernel stack erasing · 964c9dff

由 Alexander Popov 提交于 8月 17, 2018

Introduce CONFIG_STACKLEAK_RUNTIME_DISABLE option, which provides
'stack_erasing' sysctl. It can be used in runtime to control kernel
stack erasing for kernels built with CONFIG_GCC_PLUGIN_STACKLEAK.
Suggested-by: NIngo Molnar <mingo@kernel.org>
Signed-off-by: NAlexander Popov <alex.popov@linux.com>
Tested-by: NLaura Abbott <labbott@redhat.com>
Signed-off-by: NKees Cook <keescook@chromium.org>

964c9dff

23 8月, 2018 2 次提交

ipc: reorganize initialization of kern_ipc_perm.seq · e2652ae6

由 Manfred Spraul 提交于 8月 21, 2018

ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
(within ipc_idr_alloc()).

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
may see an uninitialized value.

The patch moves the initialization of kern_ipc_perm.seq before the calls
of idr_alloc().

Notes:
1) This patch has a user space visible side effect:
If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
if semget()/msgget()/shmget() fails in the final step of adding the id
to the rhash tree, then .._next_id is cleared. Before the patch, is
remained unmodified.

There is no change of the behavior after a successful ..get() call: It
always clears .._next_id, there is no impact to non checkpoint/restore
code as that code does not use .._next_id.

2) The patch correctly documents that after a call to ipc_idr_alloc(),
the full tear-down sequence must be used. The callers of ipc_addid()
do not fullfill that, i.e. more bugfixes are required.

The patch is a squash of a patch from Dmitry and my own changes.

Link: http://lkml.kernel.org/r/20180712185241.4017-3-manfred@colorfullife.com
Reported-by: syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com
Signed-off-by: NManfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e2652ae6

kernel/hung_task.c: allow to set checking interval separately from timeout · a2e51445

由 Dmitry Vyukov 提交于 8月 21, 2018

Currently task hung checking interval is equal to timeout, as the result
hung is detected anywhere between timeout and 2*timeout.  This is fine for
most interactive environments, but this hurts automated testing setups
(syzbot).  In an automated setup we need to strictly order CPU lockup <
RCU stall < workqueue lockup < task hung < silent loss, so that RCU stall
is not detected as task hung and task hung is not detected as silent
machine loss.  The large variance in task hung detection timeout requires
setting silent machine loss timeout to a very large value (e.g.  if task
hung is 3 mins, then silent loss need to be set to ~7 mins).  The
additional 3 minutes significantly reduce testing efficiency because
usually we crash kernel within a minute, and this can add hours to bug
localization process as it needs to do dozens of tests.

Allow setting checking interval separately from timeout.  This allows to
set timeout to, say, 3 minutes, but checking interval to 10 secs.

The interval is controlled via a new hung_task_check_interval_secs sysctl,
similar to the existing hung_task_timeout_secs sysctl.  The default value
of 0 results in the current behavior: checking interval is equal to
timeout.

[akpm@linux-foundation.org: update hung_task_timeout_max's comment]
Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.comSigned-off-by: NDmitry Vyukov <dvyukov@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a2e51445

08 7月, 2018 1 次提交

Drivers: HV: Send one page worth of kmsg dump over Hyper-V during panic · 81b18bce

由 Sunil Muthuswamy 提交于 7月 08, 2018

In the VM mode on Hyper-V, currently, when the kernel panics, an error
code and few register values are populated in an MSR and the Hypervisor
notified. This information is collected on the host. The amount of
information currently collected is found to be limited and not very
actionable. To gather more actionable data, such as stack trace, the
proposal is to write one page worth of kmsg data on an allocated page
and the Hypervisor notified of the page address through the MSR.

- Sysctl option to control the behavior, with ON by default.

Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: NSunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: NK. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

81b18bce

12 4月, 2018 2 次提交

taint: add taint for randstruct · bc4f2f54

由 Kees Cook 提交于 4月 10, 2018

Since the randstruct plugin can intentionally produce extremely unusual
kernel structure layouts (even performance pathological ones), some
maintainers want to be able to trivially determine if an Oops is coming
from a randstruct-built kernel, so as to keep their sanity when
debugging.  This adds the new flag and initializes taint_mask
immediately when built with randstruct.

Link: http://lkml.kernel.org/r/1519084390-43867-4-git-send-email-keescook@chromium.orgSigned-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bc4f2f54

taint: consolidate documentation · 9c4560e5

由 Kees Cook 提交于 4月 10, 2018

This consolidates the taint bit documentation into a single place with
both numeric and letter values.  Additionally adds the missing TAINT_AUX
documentation.

Link: http://lkml.kernel.org/r/1519084390-43867-3-git-send-email-keescook@chromium.orgSigned-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NAndrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

9c4560e5

22 12月, 2017 1 次提交

doc: update kptr_restrict documentation · da271403

由 Tobin C. Harding 提交于 12月 20, 2017

Recently the behaviour of printk specifier %pK was changed. The
documentation does not currently mirror this.

Update documentation for sysctl kptr_restrict.
Signed-off-by: NTobin C. Harding <me@tobin.cc>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

da271403

12 12月, 2017 1 次提交

Documentation: Better document the hardlockup_panic sysctl · d22881dc

由 Scott Wood 提交于 12月 10, 2017

Commit ac1f5912 ("kernel/watchdog.c: add sysctl knob
hardlockup_panic") added the hardlockup_panic sysctl, but did not add it
to Documentation/sysctl/kernel.txt.  Add this, and reference it from the
corresponding entry in Documentation/admin-guide/kernel-parameters.txt.
Signed-off-by: NScott Wood <swood@redhat.com>
Acked-by: NChristoph von Recklinghausen <crecklin@redhat.com>
Acked-by: NDon Zickus <dzickus@redhat.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

d22881dc

23 8月, 2017 1 次提交

perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb · ac0bb6b7

由 Konstantin Khlebnikov 提交于 8月 20, 2017

Fix misprint CAP_IOC_LOCK -> CAP_IPC_LOCK. This capability have nothing
to do with raw tracepoints. This part is about bypassing mlock limits.

Sysctl kernel.perf_event_paranoid = -1 allows raw and ftrace function
tracepoints without CAP_SYS_ADMIN.
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/150322916080.129746.11285255474738558340.stgit@buzzSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

ac0bb6b7

15 8月, 2017 1 次提交

seccomp: Sysctl to display available actions · 8e5f1ad1

由 Tyler Hicks 提交于 8月 11, 2017

This patch creates a read-only sysctl containing an ordered list of
seccomp actions that the kernel supports. The ordering, from left to
right, is the lowest action value (kill) to the highest action value
(allow). Currently, a read of the sysctl file would return "kill trap
errno trace allow". The contents of this sysctl file can be useful for
userspace code as well as the system administrator.

The path to the sysctl is:

  /proc/sys/kernel/seccomp/actions_avail

libseccomp and other userspace code can easily determine which actions
the current kernel supports. The set of actions supported by the current
kernel may be different than the set of action macros found in kernel
headers that were installed where the userspace code was built.

In addition, this sysctl will allow system administrators to know which
actions are supported by the kernel and make it easier to configure
exactly what seccomp logs through the audit subsystem. Support for this
level of logging configuration will come in a future patch.
Signed-off-by: NTyler Hicks <tyhicks@canonical.com>
Signed-off-by: NKees Cook <keescook@chromium.org>

8e5f1ad1

04 3月, 2017 1 次提交

Documentation: Update path to sysrq.txt · d3c1a297

由 Krzysztof Kozlowski 提交于 2月 24, 2017

Commit 9d85025b ("docs-rst: create an user's manual book") moved the
sysrq.txt leaving old paths in the kernel docs.
Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: NJonathan Corbet <corbet@lwn.net>

d3c1a297

26 10月, 2016 1 次提交

x86/dumpstack: Remove raw stack dump · 0ee1dd9f

由 Josh Poimboeuf 提交于 10月 25, 2016

For mostly historical reasons, the x86 oops dump shows the raw stack
values:

  ...
  [registers]
  Stack:
   ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
   ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
   0000000000000002 0000000000000000 0000000000000000 0000000000000000
  Call Trace:
  ...

This seems to be an artifact from long ago, and probably isn't needed
anymore.  It generally just adds noise to the dump, and it can be
actively harmful because it leaks kernel addresses.

Linus says:

  "The stack dump actually goes back to forever, and it used to be
   useful back in 1992 or so. But it used to be useful mainly because
   stacks were simpler and we didn't have very good call traces anyway. I
   definitely remember having used them - I just do not remember having
   used them in the last ten+ years.

   Of course, it's still true that if you can trigger an oops, you've
   likely already lost the security game, but since the stack dump is so
   useless, let's aim to just remove it and make games like the above
   harder."

This also removes the related 'kstack=' cmdline option and the
'kstack_depth_to_print' sysctl.
Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>

0ee1dd9f

24 10月, 2016 1 次提交

docs: fix locations of several documents that got moved · 8c27ceff

由 Mauro Carvalho Chehab 提交于 10月 18, 2016

The previous patch renamed several files that are cross-referenced
along the Kernel documentation. Adjust the links to point to
the right places.
Signed-off-by: NMauro Carvalho Chehab <mchehab@s-opensource.com>

8c27ceff

03 8月, 2016 1 次提交

printk: add kernel parameter to control writes to /dev/kmsg · 750afe7b

由 Borislav Petkov 提交于 8月 02, 2016

Add a "printk.devkmsg" kernel command line parameter which controls how
userspace writes into /dev/kmsg.  It has three options:

 * ratelimit - ratelimit logging from userspace.
 * on  - unlimited logging from userspace
 * off - logging from userspace gets ignored

The default setting is to ratelimit the messages written to it.

This changes the kernel default setting of "on" to "ratelimit" and we do
that because we want to keep userspace spamming /dev/kmsg to sane
levels.  This is especially moot when a small kernel log buffer wraps
around and messages get lost.  So the ratelimiting setting should be a
sane setting where kernel messages should have a bit higher chance of
survival from all the spamming.

It additionally does not limit logging to /dev/kmsg while the system is
booting if we haven't disabled it on the command line.

Furthermore, we can control the logging from a lower priority sysctl
interface - kernel.printk_devkmsg.

That interface will succeed only if printk.devkmsg *hasn't* been
supplied on the command line.  If it has, then printk.devkmsg is a
one-time setting which remains for the duration of the system lifetime.
This "locking" of the setting is to prevent userspace from changing the
logging on us through sysctl(2).

This patch is based on previous patches from Linus and Steven.

[bp@suse.de: fixes]
  Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.deSigned-off-by: NBorislav Petkov <bp@suse.de>
Cc: Dave Young <dyoung@redhat.com>
Cc: Franck Bui <fbui@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

750afe7b

16 6月, 2016 1 次提交

rcu: sysctl: Panic on RCU Stall · 088e9d25

由 Daniel Bristot de Oliveira 提交于 6月 02, 2016

It is not always easy to determine the cause of an RCU stall just by
analysing the RCU stall messages, mainly when the problem is caused
by the indirect starvation of rcu threads. For example, when preempt_rcu
is not awakened due to the starvation of a timer softirq.

We have been hard coding panic() in the RCU stall functions for
some time while testing the kernel-rt. But this is not possible in
some scenarios, like when supporting customers.

This patch implements the sysctl kernel.panic_on_rcu_stall. If
set to 1, the system will panic() when an RCU stall takes place,
enabling the capture of a vmcore. The vmcore provides a way to analyze
all kernel/tasks states, helping out to point to the culprit and the
solution for the stall.

The kernel.panic_on_rcu_stall sysctl is disabled by default.

Changes from v1:
- Fixed a typo in the git log
- The if(sysctl_panic_on_rcu_stall) panic() is in a static function
- Fixed the CONFIG_TINY_RCU compilation issue
- The var sysctl_panic_on_rcu_stall is now __read_mostly

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Acked-by: NChristian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Reviewed-by: NArnaldo Carvalho de Melo <acme@kernel.org>
Tested-by: N"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Signed-off-by: NDaniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>

088e9d25

17 5月, 2016 1 次提交

perf core: Separate accounting of contexts and real addresses in a stack trace · c85b0334

由 Arnaldo Carvalho de Melo 提交于 5月 12, 2016

The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.

So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.

A new sysctl, kernel.perf_event_max_contexts_per_stack is also
introduced for investigating possible bugs in the callchain
implementation by some arch.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

c85b0334

10 5月, 2016 1 次提交

perf/core: Change the default paranoia level to 2 · 0161028b

由 Andy Lutomirski 提交于 5月 09, 2016

Allowing unprivileged kernel profiling lets any user dump follow kernel
control flow and dump kernel registers.  This most likely allows trivial
kASLR bypassing, and it may allow other mischief as well.  (Off the top
of my head, the PERF_SAMPLE_REGS_INTR output during /dev/urandom reads
could be quite interesting.)
Signed-off-by: NAndy Lutomirski <luto@kernel.org>
Acked-by: NKees Cook <keescook@chromium.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0161028b

27 4月, 2016 1 次提交

perf core: Allow setting up max frame stack depth via sysctl · c5dfd78e

由 Arnaldo Carvalho de Melo 提交于 4月 21, 2016

The default remains 127, which is good for most cases, and not even hit
most of the time, but then for some cases, as reported by Brendan, 1024+
deep frames are appearing on the radar for things like groovy, ruby.

And in some workloads putting a _lower_ cap on this may make sense. One
that is per event still needs to be put in place tho.

The new file is:

  # cat /proc/sys/kernel/perf_event_max_stack
  127

Chaging it:

  # echo 256 > /proc/sys/kernel/perf_event_max_stack
  # cat /proc/sys/kernel/perf_event_max_stack
  256

But as soon as there is some event using callchains we get:

  # echo 512 > /proc/sys/kernel/perf_event_max_stack
  -bash: echo: write error: Device or resource busy
  #

Because we only allocate the callchain percpu data structures when there
is a user, which allows for changing the max easily, its just a matter
of having no callchain users at that point.
Reported-and-Tested-by: NBrendan Gregg <brendan.d.gregg@gmail.com>
Reviewed-by: NFrederic Weisbecker <fweisbec@gmail.com>
Acked-by: NAlexei Starovoitov <ast@kernel.org>
Acked-by: NDavid Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.orgSigned-off-by: NArnaldo Carvalho de Melo <acme@redhat.com>

c5dfd78e

08 3月, 2016 1 次提交

TTY, devpts: document pty count limiting · 8b253b07

由 Konstantin Khlebnikov 提交于 2月 21, 2016

Logic has been changed in kernel 3.4 by commit e9aba515
("tty: rework pty count limiting") but still not documented.

Sysctl kernel.pty.max works as global limit, kernel.pty.reserve ptys
are reserved for initial devpts instance (mounted without "newinstance").
Per-instance limit also could be set by mount option "max=%d".
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

8b253b07

09 2月, 2016 1 次提交

sched/debug: Make schedstats a runtime tunable that is disabled by default · cb251765

由 Mel Gorman 提交于 2月 05, 2016

schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.

This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.

The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.

These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.

netperf-tcp
                           4.5.0-rc1             4.5.0-rc1
                             vanilla          nostats-v3r1
Hmean    64         560.45 (  0.00%)      575.98 (  2.77%)
Hmean    128        766.66 (  0.00%)      795.79 (  3.80%)
Hmean    256        950.51 (  0.00%)      981.50 (  3.26%)
Hmean    1024      1433.25 (  0.00%)     1466.51 (  2.32%)
Hmean    2048      2810.54 (  0.00%)     2879.75 (  2.46%)
Hmean    3312      4618.18 (  0.00%)     4682.09 (  1.38%)
Hmean    4096      5306.42 (  0.00%)     5346.39 (  0.75%)
Hmean    8192     10581.44 (  0.00%)    10698.15 (  1.10%)
Hmean    16384    18857.70 (  0.00%)    18937.61 (  0.42%)

Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.

tbench4
                                 4.5.0-rc1             4.5.0-rc1
                                   vanilla          nostats-v3r1
Hmean    mb/sec-1         500.85 (  0.00%)      522.43 (  4.31%)
Hmean    mb/sec-2         984.66 (  0.00%)     1018.19 (  3.41%)
Hmean    mb/sec-4        1827.91 (  0.00%)     1847.78 (  1.09%)
Hmean    mb/sec-8        3561.36 (  0.00%)     3611.28 (  1.40%)
Hmean    mb/sec-16       5824.52 (  0.00%)     5929.03 (  1.79%)
Hmean    mb/sec-32      10943.10 (  0.00%)    10802.83 ( -1.28%)
Hmean    mb/sec-64      15950.81 (  0.00%)    16211.31 (  1.63%)
Hmean    mb/sec-128     15302.17 (  0.00%)    15445.11 (  0.93%)
Hmean    mb/sec-256     14866.18 (  0.00%)    15088.73 (  1.50%)
Hmean    mb/sec-512     15223.31 (  0.00%)    15373.69 (  0.99%)
Hmean    mb/sec-1024    14574.25 (  0.00%)    14598.02 (  0.16%)
Hmean    mb/sec-2048    13569.02 (  0.00%)    13733.86 (  1.21%)
Hmean    mb/sec-3072    12865.98 (  0.00%)    13209.23 (  2.67%)

Small gains of 2-4% at low thread counts and otherwise flat.  The
gains on the 8-core machine were slightly different

tbench4 on 8-core i7-3770 single socket machine
Hmean    mb/sec-1        442.59 (  0.00%)      448.73 (  1.39%)
Hmean    mb/sec-2        796.68 (  0.00%)      794.39 ( -0.29%)
Hmean    mb/sec-4       1322.52 (  0.00%)     1343.66 (  1.60%)
Hmean    mb/sec-8       2611.65 (  0.00%)     2694.86 (  3.19%)
Hmean    mb/sec-16      2537.07 (  0.00%)     2609.34 (  2.85%)
Hmean    mb/sec-32      2506.02 (  0.00%)     2578.18 (  2.88%)
Hmean    mb/sec-64      2511.06 (  0.00%)     2569.16 (  2.31%)
Hmean    mb/sec-128     2313.38 (  0.00%)     2395.50 (  3.55%)
Hmean    mb/sec-256     2110.04 (  0.00%)     2177.45 (  3.19%)
Hmean    mb/sec-512     2072.51 (  0.00%)     2053.97 ( -0.89%)

In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.

hackbench-pipes
                         4.5.0-rc1             4.5.0-rc1
                           vanilla          nostats-v3r1
Amean    1        0.0637 (  0.00%)      0.0660 ( -3.59%)
Amean    4        0.1229 (  0.00%)      0.1181 (  3.84%)
Amean    7        0.1921 (  0.00%)      0.1911 (  0.52%)
Amean    12       0.3117 (  0.00%)      0.2923 (  6.23%)
Amean    21       0.4050 (  0.00%)      0.3899 (  3.74%)
Amean    30       0.4586 (  0.00%)      0.4433 (  3.33%)
Amean    48       0.5910 (  0.00%)      0.5694 (  3.65%)
Amean    79       0.8663 (  0.00%)      0.8626 (  0.43%)
Amean    110      1.1543 (  0.00%)      1.1517 (  0.22%)
Amean    141      1.4457 (  0.00%)      1.4290 (  1.16%)
Amean    172      1.7090 (  0.00%)      1.6924 (  0.97%)
Amean    192      1.9126 (  0.00%)      1.9089 (  0.19%)

Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different

pipetest
                             4.5.0-rc1             4.5.0-rc1
                               vanilla          nostats-v2r2
Min         Time        4.13 (  0.00%)        3.99 (  3.39%)
1st-qrtle   Time        4.38 (  0.00%)        4.27 (  2.51%)
2nd-qrtle   Time        4.46 (  0.00%)        4.39 (  1.57%)
3rd-qrtle   Time        4.56 (  0.00%)        4.51 (  1.10%)
Max-90%     Time        4.67 (  0.00%)        4.60 (  1.50%)
Max-93%     Time        4.71 (  0.00%)        4.65 (  1.27%)
Max-95%     Time        4.74 (  0.00%)        4.71 (  0.63%)
Max-99%     Time        4.88 (  0.00%)        4.79 (  1.84%)
Max         Time        4.93 (  0.00%)        4.83 (  2.03%)
Mean        Time        4.48 (  0.00%)        4.39 (  1.91%)
Best99%Mean Time        4.47 (  0.00%)        4.39 (  1.91%)
Best95%Mean Time        4.46 (  0.00%)        4.38 (  1.93%)
Best90%Mean Time        4.45 (  0.00%)        4.36 (  1.98%)
Best50%Mean Time        4.36 (  0.00%)        4.25 (  2.49%)
Best10%Mean Time        4.23 (  0.00%)        4.10 (  3.13%)
Best5%Mean  Time        4.19 (  0.00%)        4.06 (  3.20%)
Best1%Mean  Time        4.13 (  0.00%)        4.00 (  3.39%)

Small improvement and similar gains were seen on the UMA machine.

The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.
Signed-off-by: NMel Gorman <mgorman@techsingularity.net>
Reviewed-by: NMatt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: NSrikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.netSigned-off-by: NIngo Molnar <mingo@kernel.org>

cb251765

openeuler / Kernel 接近 2 年 前同步成功

openeuler / Kernel
接近 2 年前同步成功