Documentation for /proc/sys/kernel/*	kernel version 2.2.10
	(c) 1998, 1999,  Rik van Riel <riel@nl.linux.org>
	(c) 2009,        Shen Feng<shen@cn.fujitsu.com>

For general info and legal blurb, please look in README.

==============================================================

This file contains documentation for the sysctl files in
/proc/sys/kernel/ and is valid for Linux kernel version 2.2.

The files in this directory can be used to tune and monitor
miscellaneous and general things in the operation of the Linux
kernel. Since some of the files _can_ be used to screw up your
system, it is advisable to read both documentation and source
before actually making adjustments.

Currently, these files might (depending on your configuration)
show up in /proc/sys/kernel:
- acpi_video_flags
- acct
- callhome                    [ S390 only ]
- auto_msgmni
- core_pattern
- core_uses_pid
- ctrl-alt-del
- dentry-state
- domainname
- hostname
- hotplug
- java-appletviewer           [ binfmt_java, obsolete ]
- java-interpreter            [ binfmt_java, obsolete ]
- kstack_depth_to_print       [ X86 only ]
- l2cr                        [ PPC only ]
- modprobe                    ==> Documentation/debugging-modules.txt
- modules_disabled
- msgmax
- msgmnb
- msgmni
- nmi_watchdog
- osrelease
- ostype
- overflowgid
- overflowuid
- panic
- pid_max
- powersave-nap               [ PPC only ]
- panic_on_unrecovered_nmi
- printk
- randomize_va_space
- real-root-dev               ==> Documentation/initrd.txt
- reboot-cmd                  [ SPARC only ]
- rtsig-max
- rtsig-nr
- sem
- sg-big-buff                 [ generic SCSI device (sg) ]
- shmall
- shmmax                      [ sysv ipc ]
- shmmni
- stop-a                      [ SPARC only ]
- sysrq                       ==> Documentation/sysrq.txt
- tainted
- threads-max
- unknown_nmi_panic
- version

==============================================================

acpi_video_flags:

flags

See Doc*/kernel/power/video.txt. This flag allows the video boot
mode to be set at run time.

==============================================================

acct:

highwater lowwater frequency

If BSD-style process accounting is enabled, these values control
its behaviour. If free space on the filesystem where the log lives
goes below <lowwater>%, accounting suspends. If free space gets
above <highwater>%, accounting resumes. <frequency> determines
how often we check the amount of free space (value is in
seconds). Default:
4 2 30
That is, suspend accounting if <= 2% of the space is free; resume it
if >= 4% is free; consider information about the amount of free
space valid for 30 seconds.
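
The suspend/resume rule above can be sketched in a few lines of
Python. This is only an illustration of the hysteresis, not the
kernel's code; the function name accounting_active and its parameter
names are made up for this example.

```python
def accounting_active(currently_active, free_percent, highwater=4, lowwater=2):
    """Return whether accounting is active after one free-space check.

    Defaults mirror the documented "4 2 30" setting (the frequency value
    only controls how often this check runs, so it is not modelled here).
    """
    if currently_active and free_percent <= lowwater:
        return False  # too little free space: suspend accounting
    if not currently_active and free_percent >= highwater:
        return True   # enough free space again: resume accounting
    return currently_active  # between the two marks nothing changes
```

Between the two marks the state is left alone, so with the defaults
a filesystem sitting at 3% free neither suspends running accounting
nor resumes suspended accounting.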

==============================================================

callhome:

Controls the kernel's callhome behavior in case of a kernel panic.

The s390 hardware allows an operating system to send a notification
to a service organization (callhome) in case of an operating system panic.

When the value in this file is 0 (which is the default behavior),
nothing happens in case of a kernel panic. If this value is set to "1",
the complete kernel oops message is sent to the IBM customer service
organization, provided the mainframe the Linux operating system is
running on has a service contract with IBM.

==============================================================

core_pattern:

core_pattern is used to specify a core dumpfile pattern name.
. max length 128 characters; default value is "core"
. core_pattern is used as a pattern template for the output filename;
  certain string patterns (beginning with '%') are substituted with
  their actual values.
. backward compatibility with core_uses_pid:
	If core_pattern does not include "%p" (default does not)
	and core_uses_pid is set, then .PID will be appended to
	the filename.
. corename format specifiers:
	%<NUL>	'%' is dropped
	%%	output one '%'
	%p	pid
	%u	uid
	%g	gid
	%s	signal number
	%t	UNIX time of dump
	%h	hostname
	%e	executable filename
	%<OTHER> both are dropped
. If the first character of the pattern is a '|', the kernel will treat
  the rest of the pattern as a command to run.  The core dump will be
  written to the standard input of that program instead of to a file.
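
As a rough illustration of the substitution rules above, here is a
minimal Python sketch. It is not the kernel's code: the function name
expand_core_pattern, the info dictionary and its keys are made up for
this example, the 128 character limit is ignored, and skipping the
.PID suffix for piped patterns is an assumption.

```python
# Maps each documented specifier to a key of the (hypothetical) info dict.
SPECIFIERS = {
    "p": "pid", "u": "uid", "g": "gid", "s": "signal",
    "t": "time", "h": "hostname", "e": "executable",
}

def expand_core_pattern(pattern, info, core_uses_pid=False):
    """Return (is_pipe, expanded_string) for a core_pattern template."""
    is_pipe = pattern.startswith("|")
    if is_pipe:
        pattern = pattern[1:]  # the rest is a command line, not a filename
    out = []
    pid_in_pattern = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(pattern):
            break  # %<NUL>: a trailing '%' is dropped
        spec = pattern[i + 1]
        if spec == "%":
            out.append("%")
        elif spec in SPECIFIERS:
            out.append(str(info[SPECIFIERS[spec]]))
            if spec == "p":
                pid_in_pattern = True
        # %<OTHER>: both characters are dropped
        i += 2
    name = "".join(out)
    # Backward compatibility with core_uses_pid (assumed to apply to files only).
    if not is_pipe and core_uses_pid and not pid_in_pattern:
        name += "." + str(info["pid"])
    return is_pipe, name
```

For example, with info["pid"] set to 1234, the default pattern "core"
expands to "core", or to "core.1234" when core_uses_pid is set.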

==============================================================

core_uses_pid:

The default coredump filename is "core".  By setting
core_uses_pid to 1, the coredump filename becomes core.PID.
If core_pattern does not include "%p" (default does not)
and core_uses_pid is set, then .PID will be appended to
the filename.

==============================================================

ctrl-alt-del:

When the value in this file is 0, ctrl-alt-del is trapped and
sent to the init(1) program to handle a graceful restart.
When, however, the value is > 0, Linux's reaction to a Vulcan
Nerve Pinch (tm) will be an immediate reboot, without even
syncing its dirty buffers.

Note: when a program (like dosemu) has the keyboard in 'raw'
mode, the ctrl-alt-del is intercepted by the program before it
ever reaches the kernel tty layer, and it's up to the program
to decide what to do with it.

==============================================================

domainname & hostname:

These files can be used to set the NIS/YP domainname and the
hostname of your box in exactly the same way as the commands
domainname and hostname, i.e.:
# echo "darkstar" > /proc/sys/kernel/hostname
# echo "mydomain" > /proc/sys/kernel/domainname
has the same effect as
# hostname "darkstar"
# domainname "mydomain"

Note, however, that the classic darkstar.frop.org has the
hostname "darkstar" and DNS (Internet Domain Name Server)
domainname "frop.org", not to be confused with the NIS (Network
Information Service) or YP (Yellow Pages) domainname. These two
domain names are in general different. For a detailed discussion
see the hostname(1) man page.

==============================================================

hotplug:

Path for the hotplug policy agent.
Default value is "/sbin/hotplug".

==============================================================

l2cr: (PPC only)

This flag controls the L2 cache of G3 processor boards. If
0, the cache is disabled. Enabled if nonzero.

==============================================================

kstack_depth_to_print: (X86 only)

Controls the number of words to print when dumping the raw
kernel stack.

==============================================================

modules_disabled:

A toggle value indicating if modules are allowed to be loaded
in an otherwise modular kernel.  This toggle defaults to off
(0), but can be set true (1).  Once true, modules can be
neither loaded nor unloaded, and the toggle cannot be set back
to false.

==============================================================

osrelease, ostype & version:

# cat osrelease
2.1.88
# cat ostype
Linux
# cat version
#5 Wed Feb 25 21:49:24 MET 1998

The files osrelease and ostype should be clear enough. Version
needs a little more clarification however. The '#5' means that
this is the fifth kernel built from this source base and the
date behind it indicates the time the kernel was built.
The only way to tune these values is to rebuild the kernel :-)
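
If a script needs the build number and the build date separately, the
version string can be split as in the following Python sketch. The
function name parse_version is made up for this example, and the
sketch only assumes the "#<number> <rest>" layout shown above; the
rest of the line is returned verbatim.

```python
import re

def parse_version(text):
    """Split a version string into (build_number, remainder)."""
    match = re.match(r"#(\d+)\s+(.*)", text.strip())
    if match is None:
        raise ValueError("unexpected version format: %r" % text)
    return int(match.group(1)), match.group(2)
```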

==============================================================

overflowgid & overflowuid:

If your architecture did not always support 32-bit UIDs (i.e. arm, i386,
m68k, sh, and sparc32), a fixed UID and GID will be returned to
applications that use the old 16-bit UID/GID system calls, if the actual
UID or GID would exceed 65535.

These sysctls allow you to change the value of the fixed UID and GID.
The default is 65534.
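
The mapping seen by the old 16-bit system calls can be sketched in
Python as follows. This is an illustration only; the function name
legacy_id and its parameters are made up for this example.

```python
def legacy_id(real_id, overflow_id=65534):
    """Return the UID or GID an old 16-bit system call would report.

    overflow_id stands for the value of overflowuid or overflowgid.
    """
    if real_id > 65535:
        return overflow_id  # does not fit in 16 bits: report the fixed ID
    return real_id
```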

==============================================================

panic:

The value in this file represents the number of seconds the
kernel waits before rebooting on a panic. When you use the
software watchdog, the recommended setting is 60.

==============================================================

panic_on_oops:

Controls the kernel's behaviour when an oops or BUG is encountered.

0: try to continue operation

1: panic immediately.  If the `panic' sysctl is also non-zero then the
   machine will be rebooted.

==============================================================

pid_max:

PID allocation wrap value.  When the kernel's next PID value
reaches this value, it wraps back to a minimum PID value.
PIDs of value pid_max or larger are not allocated.
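
The wrap-around can be pictured with the Python sketch below. It is a
simplification: it ignores PIDs that are still in use, and both the
function name next_pid and the min_pid default of 300 are assumptions
made for this example, since the text above only speaks of "a minimum
PID value".

```python
def next_pid(last_pid, pid_max, min_pid=300):
    """Return the next PID candidate after last_pid."""
    candidate = last_pid + 1
    if candidate >= pid_max:
        candidate = min_pid  # pid_max itself is never handed out
    return candidate
```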

==============================================================

powersave-nap: (PPC only)

If set, Linux-PPC will use the 'nap' mode of powersaving,
otherwise the 'doze' mode will be used.

==============================================================

printk:

The four values in printk denote: console_loglevel,
default_message_loglevel, minimum_console_loglevel and
default_console_loglevel respectively.

These values influence printk() behavior when printing or
logging error messages. See 'man 2 syslog' for more info on
the different loglevels.

- console_loglevel: messages with a higher priority than
  this will be printed to the console
- default_message_loglevel: messages without an explicit priority
  will be printed with this priority
- minimum_console_loglevel: minimum (highest) value to which
  console_loglevel can be set
- default_console_loglevel: default value for console_loglevel
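
As an illustration, the four values could be read and interpreted
with the Python sketch below. The names PrintkLevels, parse_printk
and reaches_console are made up for this example. Recall that a
numerically lower loglevel means a higher priority.

```python
from collections import namedtuple

PrintkLevels = namedtuple("PrintkLevels", [
    "console_loglevel",
    "default_message_loglevel",
    "minimum_console_loglevel",
    "default_console_loglevel",
])

def parse_printk(text):
    """Parse the contents of the printk file into four named integers."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError("expected four values, got %d" % len(fields))
    return PrintkLevels(*(int(field) for field in fields))

def reaches_console(message_level, levels):
    """Tell whether a message of the given loglevel is printed to the console.

    Pass None for a message without an explicit priority.
    """
    if message_level is None:
        message_level = levels.default_message_loglevel
    # Lower numbers mean higher priority.
    return message_level < levels.console_loglevel
```

With the contents "7 4 1 7", a message at loglevel 3 reaches the
console, while a message at loglevel 7 does not.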

==============================================================

printk_ratelimit:

Some warning messages are rate limited. printk_ratelimit specifies
the minimum length of time between these messages (in seconds). By
default we allow one every 5 seconds.

A value of 0 will disable rate limiting.

==============================================================

printk_ratelimit_burst:

While long term we enforce one message per printk_ratelimit
seconds, we do allow a burst of messages to pass through.
printk_ratelimit_burst specifies the number of messages we can
send before ratelimiting kicks in.
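
One simple way to picture the interplay of the two settings is a
window-based limiter, sketched in Python below. This is not
necessarily how the kernel implements it; the class name RateLimiter,
its method names and the burst default of 10 are assumptions made for
this example.

```python
class RateLimiter:
    """Allow at most `burst` messages per `interval` seconds."""

    def __init__(self, interval=5.0, burst=10):
        self.interval = interval   # plays the role of printk_ratelimit
        self.burst = burst         # plays the role of printk_ratelimit_burst
        self.window_start = None
        self.printed = 0

    def allow(self, now):
        """Return True if a message arriving at time `now` may be printed."""
        if self.interval == 0:
            return True  # a value of 0 disables rate limiting
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a new window and forget the old count.
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        return False
```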

==============================================================

printk_delay:

Delay each printk message by printk_delay milliseconds.

Values from 0 to 10000 are allowed.

==============================================================

randomize_va_space:

This option can be used to select the type of process address
space randomization that is used in the system, for architectures
that support this feature.

0 - Turn the process address space randomization off.  This is the
    default for architectures that do not support this feature anyways,
    and kernels that are booted with the "norandmaps" parameter.

1 - Make the addresses of mmap base, stack and VDSO page randomized.
    This, among other things, implies that shared libraries will be
    loaded to random addresses.  Also for PIE-linked binaries, the
    location of code start is randomized.  This is the default if the
    CONFIG_COMPAT_BRK option is enabled.

2 - Additionally enable heap randomization.  This is the default if
    CONFIG_COMPAT_BRK is disabled.

    There are a few legacy applications out there (such as some ancient
    versions of libc.so.5 from 1996) that assume that the brk area starts
    just after the end of the code+bss.  These applications break when
    the start of the brk area is randomized.  There are however no known
    non-legacy applications that would be broken this way, so for most
    systems it is safe to choose full randomization.

    Systems with ancient and/or broken binaries should be configured
    with CONFIG_COMPAT_BRK enabled, which excludes the heap from process
    address space randomization.

==============================================================

reboot-cmd: (Sparc only)

??? This seems to be a way to give an argument to the Sparc
ROM/Flash boot loader. Maybe to tell it what to do after
rebooting. ???

==============================================================

rtsig-max & rtsig-nr:

The file rtsig-max can be used to tune the maximum number
of POSIX realtime (queued) signals that can be outstanding
in the system.

rtsig-nr shows the number of RT signals currently queued.

==============================================================

sg-big-buff:

This file shows the size of the generic SCSI (sg) buffer.
You can't tune it just yet, but you could change it at
compile time by editing include/scsi/sg.h and changing
the value of SG_BIG_BUFF.

There shouldn't be any reason to change this value. If
you can come up with one, you probably know what you
are doing anyway :)

==============================================================

shmmax: 

This value can be used to query and set the run time limit
on the maximum shared memory segment size that can be created.
Shared memory segments up to 1GB are now supported in the
kernel.  This value defaults to SHMMAX.

==============================================================

softlockup_thresh:

This value can be used to lower the softlockup tolerance threshold.  The
default threshold is 60 seconds.  If a cpu is locked up for 60 seconds,
the kernel complains.  Valid values are 1-60 seconds.  Setting this
tunable to zero will disable the softlockup detection altogether.

==============================================================

tainted: 

Non-zero if the kernel has been tainted.  Numeric values, which
can be ORed together:

   1 - A module with a non-GPL license has been loaded, this
       includes modules with no license.
       Set by modutils >= 2.4.9 and module-init-tools.
   2 - A module was force loaded by insmod -f.
       Set by modutils >= 2.4.9 and module-init-tools.
   4 - Unsafe SMP processors: SMP with CPUs not designed for SMP.
   8 - A module was forcibly unloaded from the system by rmmod -f.
  16 - A hardware machine check error occurred on the system.
  32 - A bad page was discovered on the system.
  64 - The user has asked that the system be marked "tainted".  This
       could be because they are running software that directly modifies
       the hardware, or for other reasons.
 128 - The system has died.
 256 - The ACPI DSDT has been overridden with one supplied by the user
        instead of using the one provided by the hardware.
 512 - A kernel warning has occurred.
1024 - A module from drivers/staging was loaded.
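
Since the values are ORed together, a reading of the tainted file can
be decoded bit by bit. The Python sketch below does this for the
values listed above; the names TAINT_FLAGS and decode_tainted are
made up for this example, and the descriptions are shortened.

```python
# Bit values and (shortened) meanings as listed above.
TAINT_FLAGS = {
    1: "non-GPL or unlicensed module loaded",
    2: "module force loaded (insmod -f)",
    4: "unsafe SMP processors",
    8: "module forcibly unloaded (rmmod -f)",
    16: "hardware machine check error",
    32: "bad page discovered",
    64: "marked tainted on user request",
    128: "system has died",
    256: "ACPI DSDT overridden by the user",
    512: "kernel warning occurred",
    1024: "module from drivers/staging loaded",
}

def decode_tainted(value):
    """Return the descriptions of all taint bits set in value."""
    return [text for bit, text in sorted(TAINT_FLAGS.items()) if value & bit]
```

For example, a value of 513 (512 + 1) decodes to a non-GPL module
having been loaded and a kernel warning having occurred.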

==============================================================

auto_msgmni:

Enables/Disables automatic recomputing of msgmni upon memory add/remove or
upon ipc namespace creation/removal (see the msgmni description above).
Echoing "1" into this file enables msgmni automatic recomputing.
Echoing "0" turns it off.
auto_msgmni default value is 1.

==============================================================

nmi_watchdog:

Enables/Disables the NMI watchdog on x86 systems.  When the value is non-zero
the NMI watchdog is enabled and will continuously test all online cpus to
determine whether or not they are still functioning properly. Currently,
passing "nmi_watchdog=" parameter at boot time is required for this function
to work.

If LAPIC NMI watchdog method is in use (nmi_watchdog=2 kernel parameter), the
NMI watchdog shares registers with oprofile. By disabling the NMI watchdog,
oprofile may have more registers to utilize.

==============================================================

unknown_nmi_panic:

The value in this file affects the handling of NMIs. When the value is
non-zero, an unknown NMI is trapped and a panic occurs. At that time,
kernel debugging information is displayed on the console.

For example, the NMI switch that most IA32 servers have raises an
unknown NMI. If a system hangs, try pressing the NMI switch.

==============================================================

panic_on_unrecovered_nmi:

The default Linux behaviour on either a memory-error NMI or an unknown NMI
is to continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
that an uncorrected parity/ECC error gets propagated.

A small number of systems do generate NMIs for bizarre random reasons such as
power management, so the default is off. This sysctl works like the existing
panic controls already in that directory.