提交 3a20ac2c 编写于 作者: T Takashi Iwai

Merge branch 'fix/pcm-jiffies-check' into fix/asoc

要显示的变更太多。

To preserve performance only 1000 of 1000+ files are displayed.
......@@ -49,6 +49,7 @@ include/linux/compile.h
include/linux/version.h
include/linux/utsrelease.h
include/linux/bounds.h
include/generated
# stgit generated dirs
patches-*
......
......@@ -92,6 +92,7 @@ Rudolf Marek <R.Marek@sh.cvut.cz>
Rui Saraiva <rmps@joel.ist.utl.pt>
Sachin P Sant <ssant@in.ibm.com>
Sam Ravnborg <sam@mars.ravnborg.org>
Sascha Hauer <s.hauer@pengutronix.de>
S.Çağlar Onur <caglar@pardus.org.tr>
Simon Kelley <simon@thekelleys.org.uk>
Stéphane Witzmann <stephane.witzmann@ubpmes.univ-bpclermont.fr>
......@@ -100,6 +101,7 @@ Tejun Heo <htejun@gmail.com>
Thomas Graf <tgraf@suug.ch>
Tony Luck <tony.luck@intel.com>
Tsuneo Yoshioka <Tsuneo.Yoshioka@f-secure.com>
Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Uwe Kleine-König <ukl@pengutronix.de>
Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
......@@ -495,6 +495,11 @@ S: Kopmansg 2
S: 411 13 Goteborg
S: Sweden
N: Paul Bristow
E: paul@paulbristow.net
W: http://paulbristow.net/linux/idefloppy.html
D: Maintainer of IDE/ATAPI floppy driver
N: Dominik Brodowski
E: linux@brodo.de
W: http://www.brodo.de/
......@@ -1407,8 +1412,8 @@ P: 1024D/77D4FC9B F5C5 1C20 1DFC DEC3 3107 54A4 2332 ADFC 77D4 FC9B
D: National Language Support
D: Linux Internationalization Project
D: German Localization for Linux and GNU software
S: Kriemhildring 12a
S: 65795 Hattersheim am Main
S: Auf der Fittel 18
S: 53347 Alfter
S: Germany
N: Christoph Hellwig
......@@ -2166,7 +2171,6 @@ D: Initial implementation of VC's, pty's and select()
N: Pavel Machek
E: pavel@ucw.cz
E: pavel@suse.cz
D: Softcursor for vga, hypertech cdrom support, vcsa bugfix, nbd
D: sun4/330 port, capabilities for elf, speedup for rm on ext2, USB,
D: work on suspend-to-ram/disk, killing duplicates from ioctl32
......@@ -2643,6 +2647,10 @@ S: C/ Mieses 20, 9-B
S: Valladolid 47009
S: Spain
N: Gadi Oxman
E: gadio@netvision.net.il
D: Original author and maintainer of IDE/ATAPI floppy/tape drivers
N: Greg Page
E: gpage@sovereign.org
D: IPX development and support
......@@ -3572,6 +3580,12 @@ N: Dirk Verworner
D: Co-author of German book ``Linux-Kernel-Programmierung''
D: Co-founder of Berlin Linux User Group
N: Riku Voipio
E: riku.voipio@iki.fi
D: Author of PCA9532 LED and Fintek f75375s hwmon driver
D: Some random ARM board patches
S: Finland
N: Patrick Volkerding
E: volkerdi@ftp.cdrom.com
D: Produced the Slackware distribution, updated the SVGAlib
......@@ -3739,7 +3753,7 @@ S: 93149 Nittenau
S: Germany
N: Gertjan van Wingerde
E: gwingerde@home.nl
E: gwingerde@gmail.com
D: Ralink rt2x00 WLAN driver
D: Minix V2 file-system
D: Misc fixes
......@@ -3786,14 +3800,11 @@ S: The Netherlands
N: David Woodhouse
E: dwmw2@infradead.org
D: ARCnet stuff, Applicom board driver, SO_BINDTODEVICE,
D: some Alpha platform porting from 2.0, Memory Technology Devices,
D: Acquire watchdog timer, PC speaker driver maintenance,
D: JFFS2 file system, Memory Technology Device subsystem,
D: various other stuff that annoyed me by not working.
S: c/o Red Hat Engineering
S: Rustat House
S: 60 Clifton Road
S: Cambridge. CB1 7EG
S: c/o Intel Corporation
S: Pipers Way
S: Swindon. SN3 1RJ
S: England
N: Chris Wright
......
......@@ -86,6 +86,8 @@ cachetlb.txt
- describes the cache/TLB flushing interfaces Linux uses.
cdrom/
- directory with information on the CD-ROM drivers that Linux has.
cgroups/
- cgroups features, including cpusets and memory controller.
connector/
- docs on the netlink based userspace<->kernel space communication mod.
console/
......@@ -98,8 +100,6 @@ cpu-load.txt
- document describing how CPU load statistics are collected.
cpuidle/
- info on CPU_IDLE, CPU idle state management subsystem.
cpusets.txt
- documents the cpusets feature; assign CPUs and Mem to a set of tasks.
cputopology.txt
- documentation on how CPU topology info is exported via sysfs.
cris/
......
What: /sys/kernel/debug/kmemtrace/
Date: July 2008
Contact: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Description:
In kmemtrace-enabled kernels, the following files are created:
/sys/kernel/debug/kmemtrace/
cpu<n> (0400) Per-CPU tracing data, see below. (binary)
total_overruns (0400) Total number of bytes which were dropped from
cpu<n> files because of full buffer condition,
non-binary. (text)
abi_version (0400) Kernel's kmemtrace ABI version. (text)
Each per-CPU file should be read according to the relay interface. That is,
the reader should set affinity to that specific CPU and, as currently done by
the userspace application (though there are other methods), use poll() with
an infinite timeout before every read(). Otherwise, erroneous data may be
read. The binary data has the following _core_ format:
Event ID (1 byte) Unsigned integer, one of:
0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
1 - represents a freeing of previously allocated memory
(KMEMTRACE_EVENT_FREE)
Type ID (1 byte) Unsigned integer, one of:
0 - this is a kmalloc() / kfree()
1 - this is a kmem_cache_alloc() / kmem_cache_free()
2 - this is a __get_free_pages() et al.
Event size (2 bytes) Unsigned integer representing the
size of this event. Used to extend
kmemtrace. Discard the bytes you
don't know about.
Sequence number (4 bytes) Signed integer used to reorder data
logged on SMP machines. Wraparound
must be taken into account, although
it is unlikely.
Caller address (8 bytes) Return address to the caller.
Pointer to mem (8 bytes) Pointer to target memory area. Can be
NULL, but not all such calls might be
recorded.
In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:
Requested bytes (8 bytes) Total number of requested bytes,
unsigned, must not be zero.
Allocated bytes (8 bytes) Total number of actually allocated
bytes, unsigned, must not be lower
than requested bytes.
Requested flags (4 bytes) GFP flags supplied by the caller.
Target CPU (4 bytes) Signed integer, valid for event id 1.
If equal to -1, target CPU is the same
as origin CPU, but the reverse might
not be true.
The data is made available in the same endianness the machine has.
Other event ids and type ids may be defined and added. Other fields may be
added by increasing event size, but see below for details.
Every modification to the ABI, including new id definitions, are followed
by bumping the ABI version by one.
Adding new data to the packet (features) is done at the end of the mandatory
data:
Feature size (2 byte)
Feature ID (1 byte)
Feature data (Feature size - 3 bytes)
Users:
kmemtrace-user - git://repo.or.cz/kmemtrace-user.git
What: /debug/pktcdvd/pktcdvd[0-7]
What: /sys/kernel/debug/pktcdvd/pktcdvd[0-7]
Date: Oct. 2006
KernelVersion: 2.6.20
Contact: Thomas Maier <balagi@justmail.de>
......@@ -10,10 +10,10 @@ debugfs interface
The pktcdvd module (packet writing driver) creates
these files in debugfs:
/debug/pktcdvd/pktcdvd[0-7]/
/sys/kernel/debug/pktcdvd/pktcdvd[0-7]/
info (0444) Lots of driver statistics and infos.
Example:
-------
cat /debug/pktcdvd/pktcdvd0/info
cat /sys/kernel/debug/pktcdvd/pktcdvd0/info
What: security/ima/policy
Date: May 2008
Contact: Mimi Zohar <zohar@us.ibm.com>
Description:
The Trusted Computing Group(TCG) runtime Integrity
Measurement Architecture(IMA) maintains a list of hash
values of executables and other sensitive system files
loaded into the run-time of this system. At runtime,
the policy can be constrained based on LSM specific data.
Policies are loaded into the securityfs file ima/policy
by opening the file, writing the rules one at a time and
then closing the file. The new policy takes effect after
the file ima/policy is closed.
rule format: action [condition ...]
action: measure | dont_measure
condition:= base | lsm
base: [[func=] [mask=] [fsmagic=] [uid=]]
lsm: [[subj_user=] [subj_role=] [subj_type=]
[obj_user=] [obj_role=] [obj_type=]]
base: func:= [BPRM_CHECK][FILE_MMAP][INODE_PERMISSION]
mask:= [MAY_READ] [MAY_WRITE] [MAY_APPEND] [MAY_EXEC]
fsmagic:= hex value
uid:= decimal value
lsm: are LSM specific
default policy:
# PROC_SUPER_MAGIC
dont_measure fsmagic=0x9fa0
# SYSFS_MAGIC
dont_measure fsmagic=0x62656572
# DEBUGFS_MAGIC
dont_measure fsmagic=0x64626720
# TMPFS_MAGIC
dont_measure fsmagic=0x01021994
# SECURITYFS_MAGIC
dont_measure fsmagic=0x73636673
measure func=BPRM_CHECK
measure func=FILE_MMAP mask=MAY_EXEC
measure func=INODE_PERM mask=MAY_READ uid=0
The default policy measures all executables in bprm_check,
all files mmapped executable in file_mmap, and all files
open for read by root in inode_permission.
Examples of LSM specific definitions:
SELinux:
# SELINUX_MAGIC
dont_measure fsmagic=0xF97CFF8C
dont_measure obj_type=var_log_t
dont_measure obj_type=auditd_log_t
measure subj_user=system_u func=INODE_PERM mask=MAY_READ
measure subj_role=system_r func=INODE_PERM mask=MAY_READ
Smack:
measure subj_user=_ func=INODE_PERM mask=MAY_READ
What: /sys/bus/pci/drivers/.../bind
Date: December 2003
Contact: linux-pci@vger.kernel.org
Description:
Writing a device location to this file will cause
the driver to attempt to bind to the device found at
this location. This is useful for overriding default
bindings. The format for the location is: DDDD:BB:DD.F.
That is Domain:Bus:Device.Function and is the same as
found in /sys/bus/pci/devices/. For example:
# echo 0000:00:19.0 > /sys/bus/pci/drivers/foo/bind
(Note: kernels before 2.6.28 may require echo -n).
What: /sys/bus/pci/drivers/.../unbind
Date: December 2003
Contact: linux-pci@vger.kernel.org
Description:
Writing a device location to this file will cause the
driver to attempt to unbind from the device found at
this location. This may be useful when overriding default
bindings. The format for the location is: DDDD:BB:DD.F.
That is Domain:Bus:Device.Function and is the same as
found in /sys/bus/pci/devices/. For example:
# echo 0000:00:19.0 > /sys/bus/pci/drivers/foo/unbind
(Note: kernels before 2.6.28 may require echo -n).
What: /sys/bus/pci/drivers/.../new_id
Date: December 2003
Contact: linux-pci@vger.kernel.org
Description:
Writing a device ID to this file will attempt to
dynamically add a new device ID to a PCI device driver.
This may allow the driver to support more hardware than
was included in the driver's static device ID support
table at compile time. The format for the device ID is:
VVVV DDDD SVVV SDDD CCCC MMMM PPPP. That is Vendor ID,
Device ID, Subsystem Vendor ID, Subsystem Device ID,
Class, Class Mask, and Private Driver Data. The Vendor ID
and Device ID fields are required, the rest are optional.
Upon successfully adding an ID, the driver will probe
for the device and attempt to bind to it. For example:
# echo "8086 10f5" > /sys/bus/pci/drivers/foo/new_id
What: /sys/bus/pci/drivers/.../remove_id
Date: February 2009
Contact: Chris Wright <chrisw@sous-sol.org>
Description:
Writing a device ID to this file will remove an ID
that was dynamically added via the new_id sysfs entry.
The format for the device ID is:
VVVV DDDD SVVV SDDD CCCC MMMM. That is Vendor ID, Device
ID, Subsystem Vendor ID, Subsystem Device ID, Class,
and Class Mask. The Vendor ID and Device ID fields are
required, the rest are optional. After successfully
removing an ID, the driver will no longer support the
device. This is useful to ensure auto probing won't
match the driver to the device. For example:
# echo "8086 10f5" > /sys/bus/pci/drivers/foo/remove_id
What: /sys/bus/pci/rescan
Date: January 2009
Contact: Linux PCI developers <linux-pci@vger.kernel.org>
Description:
Writing a non-zero value to this attribute will
force a rescan of all PCI buses in the system, and
re-discover previously removed devices.
Depends on CONFIG_HOTPLUG.
What: /sys/bus/pci/devices/.../remove
Date: January 2009
Contact: Linux PCI developers <linux-pci@vger.kernel.org>
Description:
Writing a non-zero value to this attribute will
hot-remove the PCI device and any of its children.
Depends on CONFIG_HOTPLUG.
What: /sys/bus/pci/devices/.../rescan
Date: January 2009
Contact: Linux PCI developers <linux-pci@vger.kernel.org>
Description:
Writing a non-zero value to this attribute will
force a rescan of the device's parent bus and all
child buses, and re-discover devices removed earlier
from this part of the device tree.
Depends on CONFIG_HOTPLUG.
What: /sys/bus/pci/devices/.../vpd
Date: February 2008
Contact: Ben Hutchings <bhutchings@solarflare.com>
......@@ -9,3 +95,30 @@ Description:
that some devices may have malformatted data. If the
underlying VPD has a writable section then the
corresponding section of this file will be writable.
What: /sys/bus/pci/devices/.../virtfnN
Date: March 2009
Contact: Yu Zhao <yu.zhao@intel.com>
Description:
This symbolic link appears when hardware supports the SR-IOV
capability and the Physical Function driver has enabled it.
The symbolic link points to the PCI device sysfs entry of the
Virtual Function whose index is N (0...MaxVFs-1).
What: /sys/bus/pci/devices/.../dep_link
Date: March 2009
Contact: Yu Zhao <yu.zhao@intel.com>
Description:
This symbolic link appears when hardware supports the SR-IOV
capability and the Physical Function driver has enabled it,
and this device has vendor specific dependencies with others.
The symbolic link points to the PCI device sysfs entry of
Physical Function this device depends on.
What: /sys/bus/pci/devices/.../physfn
Date: March 2009
Contact: Yu Zhao <yu.zhao@intel.com>
Description:
This symbolic link appears when a device is a Virtual Function.
The symbolic link points to the PCI device sysfs entry of the
Physical Function this device associates with.
......@@ -4,8 +4,8 @@ KernelVersion: 2.6.26
Contact: Liam Girdwood <lrg@slimlogic.co.uk>
Description:
Some regulator directories will contain a field called
state. This reports the regulator enable status, for
regulators which can report that value.
state. This reports the regulator enable control, for
regulators which can report that input value.
This will be one of the following strings:
......@@ -14,16 +14,54 @@ Description:
'unknown'
'enabled' means the regulator output is ON and is supplying
power to the system.
power to the system (assuming no error prevents it).
'disabled' means the regulator output is OFF and is not
supplying power to the system..
supplying power to the system (unless some non-Linux
control has enabled it).
'unknown' means software cannot determine the state, or
the reported state is invalid.
NOTE: this field can be used in conjunction with microvolts
and microamps to determine regulator output levels.
or microamps to determine configured regulator output levels.
What: /sys/class/regulator/.../status
Description:
Some regulator directories will contain a field called
"status". This reports the current regulator status, for
regulators which can report that output value.
This will be one of the following strings:
off
on
error
fast
normal
idle
standby
"off" means the regulator is not supplying power to the
system.
"on" means the regulator is supplying power to the system,
and the regulator can't report a detailed operation mode.
"error" indicates an out-of-regulation status such as being
disabled due to thermal shutdown, or voltage being unstable
because of problems with the input power supply.
"fast", "normal", "idle", and "standby" are all detailed
regulator operation modes (described elsewhere). They
imply "on", but provide more detail.
Note that regulator status is a function of many inputs,
not limited to control inputs from Linux. For example,
the actual load presented may trigger "error" status; or
a regulator may be enabled by another user, even though
Linux did not enable it.
What: /sys/class/regulator/.../type
......@@ -58,7 +96,7 @@ Description:
Some regulator directories will contain a field called
microvolts. This holds the regulator output voltage setting
measured in microvolts (i.e. E-6 Volts), for regulators
which can report that voltage.
which can report the control input for voltage.
NOTE: This value should not be used to determine the regulator
output voltage level as this value is the same regardless of
......@@ -73,7 +111,7 @@ Description:
Some regulator directories will contain a field called
microamps. This holds the regulator output current limit
setting measured in microamps (i.e. E-6 Amps), for regulators
which can report that current.
which can report the control input for a current limit.
NOTE: This value should not be used to determine the regulator
output current level as this value is the same regardless of
......@@ -87,7 +125,7 @@ Contact: Liam Girdwood <lrg@slimlogic.co.uk>
Description:
Some regulator directories will contain a field called
opmode. This holds the current regulator operating mode,
for regulators which can report it.
for regulators which can report that control input value.
The opmode value can be one of the following strings:
......@@ -101,7 +139,8 @@ Description:
NOTE: This value should not be used to determine the regulator
output operating mode as this value is the same regardless of
whether the regulator is enabled or disabled.
whether the regulator is enabled or disabled. A "status"
attribute may be available to determine the actual mode.
What: /sys/class/regulator/.../min_microvolts
......
......@@ -69,9 +69,13 @@ Description:
gpe1F: 0 invalid
gpe_all: 1192
sci: 1194
sci_not: 0
sci - The total number of times the ACPI SCI
has claimed an interrupt.
sci - The number of times the ACPI SCI
has been called and claimed an interrupt.
sci_not - The number of times the ACPI SCI
has been called and NOT claimed an interrupt.
gpe_all - count of SCI caused by GPEs.
......
What: /sys/firmware/memmap/
Date: June 2008
Contact: Bernhard Walle <bwalle@suse.de>
Contact: Bernhard Walle <bernhard.walle@gmx.de>
Description:
On all platforms, the firmware provides a memory map which the
kernel reads. The resources from that memory map are registered
......
What: /sys/fs/ext4/<disk>/mb_stats
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Controls whether the multiblock allocator should
collect statistics, which are shown during the unmount.
1 means to collect statistics, 0 means not to collect
statistics
What: /sys/fs/ext4/<disk>/mb_group_prealloc
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The multiblock allocator will round up allocation
requests to a multiple of this tuning parameter if the
stripe size is not set in the ext4 superblock
What: /sys/fs/ext4/<disk>/mb_max_to_scan
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The maximum number of extents the multiblock allocator
will search to find the best extent
What: /sys/fs/ext4/<disk>/mb_min_to_scan
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The minimum number of extents the multiblock allocator
will search to find the best extent
What: /sys/fs/ext4/<disk>/mb_order2_req
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which controls the minimum size for
requests (as a power of 2) where the buddy cache is
used
What: /sys/fs/ext4/<disk>/mb_stream_req
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Files which have fewer blocks than this tunable
parameter will have their blocks allocated out of a
block group specific preallocation pool, so that small
files are packed closely together. Each large file
will have its blocks allocated out of its own unique
preallocation pool.
What: /sys/fs/ext4/<disk>/inode_readahead
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which controls the maximum number of
inode table blocks that ext4's inode table readahead
algorithm will pre-read into the buffer cache
What: /sys/fs/ext4/<disk>/delayed_allocation_blocks
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of blocks
that are dirty in the page cache, but which do not
have their location in the filesystem allocated yet.
What: /sys/fs/ext4/<disk>/lifetime_write_kbytes
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of kilobytes
of data that have been written to this filesystem since it was
created.
What: /sys/fs/ext4/<disk>/session_write_kbytes
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of
kilobytes of data that have been written to this
filesystem since it was mounted.
......@@ -33,10 +33,12 @@ o Gnu make 3.79.1 # make --version
o binutils 2.12 # ld -v
o util-linux 2.10o # fdformat --version
o module-init-tools 0.9.10 # depmod -V
o e2fsprogs 1.29 # tune2fs
o e2fsprogs 1.41.4 # e2fsck -V
o jfsutils 1.1.3 # fsck.jfs -V
o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs
o xfsprogs 2.6.0 # xfs_db -V
o squashfs-tools 4.0 # mksquashfs -version
o btrfs-progs 0.18 # btrfsck
o pcmciautils 004 # pccardctl -V
o quota-tools 3.09 # quota -V
o PPP 2.4.0 # pppd --version
......
......@@ -483,17 +483,25 @@ values. To do the latter, you can stick the following in your .emacs file:
(* (max steps 1)
c-basic-offset)))
(add-hook 'c-mode-common-hook
(lambda ()
;; Add kernel style
(c-add-style
"linux-tabs-only"
'("linux" (c-offsets-alist
(arglist-cont-nonempty
c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))
(add-hook 'c-mode-hook
(lambda ()
(let ((filename (buffer-file-name)))
;; Enable kernel mode for the appropriate files
(when (and filename
(string-match "~/src/linux-trees" filename))
(string-match (expand-file-name "~/src/linux-trees")
filename))
(setq indent-tabs-mode t)
(c-set-style "linux")
(c-set-offset 'arglist-cont-nonempty
'(c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))
(c-set-style "linux-tabs-only")))))
This will make emacs go better with the kernel coding style for C
files below ~/src/linux-trees.
......
......@@ -5,7 +5,7 @@
This document describes the DMA API. For a more gentle introduction
phrased in terms of the pci_ equivalents (and actual examples) see
DMA-mapping.txt
Documentation/PCI/PCI-DMA-mapping.txt.
This API is split into two pieces. Part I describes the API and the
corresponding pci_ API. Part II describes the extensions to the API
......@@ -170,16 +170,15 @@ Returns: 0 if successful and a negative error if not.
u64
dma_get_required_mask(struct device *dev)
After setting the mask with dma_set_mask(), this API returns the
actual mask (within that already set) that the platform actually
requires to operate efficiently. Usually this means the returned mask
This API returns the mask that the platform requires to
operate efficiently. Usually this means the returned mask
is the minimum required to cover all of memory. Examining the
required mask gives drivers with variable descriptor sizes the
opportunity to use smaller descriptors as necessary.
Requesting the required mask does not alter the current mask. If you
wish to take advantage of it, you should issue another dma_set_mask()
call to lower the mask again.
wish to take advantage of it, you should issue a dma_set_mask()
call to set the mask to the value returned.
Part Id - Streaming DMA mappings
......@@ -610,3 +609,109 @@ size is the size (and should be a page-sized multiple).
The return value will be either a pointer to the processor virtual
address of the memory, or an error (via PTR_ERR()) if any part of the
region is occupied.
Part III - Debug drivers use of the DMA-API
-------------------------------------------
The DMA-API as described above as some constraints. DMA addresses must be
released with the corresponding function with the same size for example. With
the advent of hardware IOMMUs it becomes more and more important that drivers
do not violate those constraints. In the worst case such a violation can
result in data corruption up to destroyed filesystems.
To debug drivers and find bugs in the usage of the DMA-API checking code can
be compiled into the kernel which will tell the developer about those
violations. If your architecture supports it you can select the "Enable
debugging of DMA-API usage" option in your kernel configuration. Enabling this
option has a performance impact. Do not enable it in production kernels.
If you boot the resulting kernel will contain code which does some bookkeeping
about what DMA memory was allocated for which device. If this code detects an
error it prints a warning message with some details into your kernel log. An
example warning message may look like this:
------------[ cut here ]------------
WARNING: at /data2/repos/linux-2.6-iommu/lib/dma-debug.c:448
check_unmap+0x203/0x490()
Hardware name:
forcedeth 0000:00:08.0: DMA-API: device driver frees DMA memory with wrong
function [device address=0x00000000640444be] [size=66 bytes] [mapped as
single] [unmapped as page]
Modules linked in: nfsd exportfs bridge stp llc r8169
Pid: 0, comm: swapper Tainted: G W 2.6.28-dmatest-09289-g8bb99c0 #1
Call Trace:
<IRQ> [<ffffffff80240b22>] warn_slowpath+0xf2/0x130
[<ffffffff80647b70>] _spin_unlock+0x10/0x30
[<ffffffff80537e75>] usb_hcd_link_urb_to_ep+0x75/0xc0
[<ffffffff80647c22>] _spin_unlock_irqrestore+0x12/0x40
[<ffffffff8055347f>] ohci_urb_enqueue+0x19f/0x7c0
[<ffffffff80252f96>] queue_work+0x56/0x60
[<ffffffff80237e10>] enqueue_task_fair+0x20/0x50
[<ffffffff80539279>] usb_hcd_submit_urb+0x379/0xbc0
[<ffffffff803b78c3>] cpumask_next_and+0x23/0x40
[<ffffffff80235177>] find_busiest_group+0x207/0x8a0
[<ffffffff8064784f>] _spin_lock_irqsave+0x1f/0x50
[<ffffffff803c7ea3>] check_unmap+0x203/0x490
[<ffffffff803c8259>] debug_dma_unmap_page+0x49/0x50
[<ffffffff80485f26>] nv_tx_done_optimized+0xc6/0x2c0
[<ffffffff80486c13>] nv_nic_irq_optimized+0x73/0x2b0
[<ffffffff8026df84>] handle_IRQ_event+0x34/0x70
[<ffffffff8026ffe9>] handle_edge_irq+0xc9/0x150
[<ffffffff8020e3ab>] do_IRQ+0xcb/0x1c0
[<ffffffff8020c093>] ret_from_intr+0x0/0xa
<EOI> <4>---[ end trace f6435a98e2a38c0e ]---
The driver developer can find the driver and the device including a stacktrace
of the DMA-API call which caused this warning.
Per default only the first error will result in a warning message. All other
errors will only silently counted. This limitation exist to prevent the code
from flooding your kernel log. To support debugging a device driver this can
be disabled via debugfs. See the debugfs interface documentation below for
details.
The debugfs directory for the DMA-API debugging code is called dma-api/. In
this directory the following files can currently be found:
dma-api/all_errors This file contains a numeric value. If this
value is not equal to zero the debugging code
will print a warning for every error it finds
into the kernel log. Be carefull with this
option. It can easily flood your logs.
dma-api/disabled This read-only file contains the character 'Y'
if the debugging code is disabled. This can
happen when it runs out of memory or if it was
disabled at boot time
dma-api/error_count This file is read-only and shows the total
numbers of errors found.
dma-api/num_errors The number in this file shows how many
warnings will be printed to the kernel log
before it stops. This number is initialized to
one at system boot and be set by writing into
this file
dma-api/min_free_entries
This read-only file can be read to get the
minimum number of free dma_debug_entries the
allocator has ever seen. If this value goes
down to zero the code will disable itself
because it is not longer reliable.
dma-api/num_free_entries
The current number of free dma_debug_entries
in the allocator.
If you have this code compiled into your kernel it will be enabled by default.
If you want to boot without the bookkeeping anyway you can provide
'dma_debug=off' as a boot parameter. This will disable DMA-API debugging.
Notice that you can not enable it again at runtime. You have to reboot to do
so.
When the code disables itself at runtime this is most likely because it ran
out of dma_debug_entries. These entries are preallocated at boot. The number
of preallocated entries is defined per architecture. If it is too low for you
boot with 'dma_debug_entries=<your_desired_number>' to overwrite the
architectural default.
......@@ -136,7 +136,7 @@ exactly why.
The standard 32-bit addressing PCI device would do something like
this:
if (pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
if (pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
printk(KERN_WARNING
"mydev: No suitable DMA available.\n");
goto ignore_this_device;
......@@ -155,9 +155,9 @@ all 64-bits when accessing streaming DMA:
int using_dac;
if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
using_dac = 1;
} else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
} else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
using_dac = 0;
} else {
printk(KERN_WARNING
......@@ -170,14 +170,14 @@ the case would look like this:
int using_dac, consistent_using_dac;
if (!pci_set_dma_mask(pdev, DMA_64BIT_MASK)) {
if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
using_dac = 1;
consistent_using_dac = 1;
pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK);
} else if (!pci_set_dma_mask(pdev, DMA_32BIT_MASK)) {
pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
} else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
using_dac = 0;
consistent_using_dac = 0;
pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK);
pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
} else {
printk(KERN_WARNING
"mydev: No suitable DMA available.\n");
......@@ -192,7 +192,7 @@ check the return value from pci_set_consistent_dma_mask().
Finally, if your device can only drive the low 24-bits of
address during PCI bus mastering you might do something like:
if (pci_set_dma_mask(pdev, DMA_24BIT_MASK)) {
if (pci_set_dma_mask(pdev, DMA_BIT_MASK(24))) {
printk(KERN_WARNING
"mydev: 24-bit DMA addressing not available.\n");
goto ignore_this_device;
......@@ -213,7 +213,7 @@ most specific mask.
Here is pseudo-code showing how this might be done:
#define PLAYBACK_ADDRESS_BITS DMA_32BIT_MASK
#define PLAYBACK_ADDRESS_BITS DMA_BIT_MASK(32)
#define RECORD_ADDRESS_BITS 0x00ffffff
struct my_sound_card *card;
......
......@@ -4,3 +4,7 @@
*.html
*.9.gz
*.9
*.aux
*.dvi
*.log
*.out
......@@ -6,13 +6,14 @@
# To add a new book the only step required is to add the book to the
# list of DOCBOOKS.
DOCBOOKS := z8530book.xml mcabook.xml \
DOCBOOKS := z8530book.xml mcabook.xml device-drivers.xml \
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
procfs-guide.xml writing_usb_driver.xml networking.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
mac80211.xml debugobjects.xml sh.xml regulator.xml
mac80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml
###
# The build process is as follows (targets):
......@@ -30,7 +31,7 @@ PS_METHOD = $(prefer-db2x)
###
# The targets that may be used.
PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs
PHONY += xmldocs sgmldocs psdocs pdfdocs htmldocs mandocs installmandocs cleandocs
BOOKS := $(addprefix $(obj)/,$(DOCBOOKS))
xmldocs: $(BOOKS)
......@@ -212,11 +213,12 @@ silent_gen_xml = :
dochelp:
@echo ' Linux kernel internal documentation in different formats:'
@echo ' htmldocs - HTML'
@echo ' installmandocs - install man pages generated by mandocs'
@echo ' mandocs - man pages'
@echo ' pdfdocs - PDF'
@echo ' psdocs - Postscript'
@echo ' xmldocs - XML DocBook'
@echo ' mandocs - man pages'
@echo ' installmandocs - install man pages generated by mandocs'
@echo ' cleandocs - clean all generated DocBook files'
###
# Temporary files left by various tools
......@@ -234,6 +236,10 @@ clean-files := $(DOCBOOKS) \
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
cleandocs:
$(Q)rm -f $(call objectify, $(clean-files))
$(Q)rm -rf $(call objectify, $(clean-dirs))
# Declare the contents of the .PHONY variable as phony. We keep that
# information in a variable se we can use it in if_changed and friends.
......
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<!-- ****************************************************** -->
<!-- Header -->
<!-- ****************************************************** -->
<book id="ALSA-Driver-API">
<bookinfo>
<title>The ALSA Driver API</title>
<legalnotice>
<para>
This document is free; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
</para>
<para>
This document is distributed in the hope that it will be useful,
but <emphasis>WITHOUT ANY WARRANTY</emphasis>; without even the
implied warranty of <emphasis>MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE</emphasis>. See the GNU General Public License
for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
</legalnotice>
</bookinfo>
<toc></toc>
<chapter><title>Management of Cards and Devices</title>
<sect1><title>Card Management</title>
!Esound/core/init.c
</sect1>
<sect1><title>Device Components</title>
!Esound/core/device.c
</sect1>
<sect1><title>Module requests and Device File Entries</title>
!Esound/core/sound.c
</sect1>
<sect1><title>Memory Management Helpers</title>
!Esound/core/memory.c
!Esound/core/memalloc.c
</sect1>
</chapter>
<chapter><title>PCM API</title>
<sect1><title>PCM Core</title>
!Esound/core/pcm.c
!Esound/core/pcm_lib.c
!Esound/core/pcm_native.c
</sect1>
<sect1><title>PCM Format Helpers</title>
!Esound/core/pcm_misc.c
</sect1>
<sect1><title>PCM Memory Management</title>
!Esound/core/pcm_memory.c
</sect1>
</chapter>
<chapter><title>Control/Mixer API</title>
<sect1><title>General Control Interface</title>
!Esound/core/control.c
</sect1>
<sect1><title>AC97 Codec API</title>
!Esound/pci/ac97/ac97_codec.c
!Esound/pci/ac97/ac97_pcm.c
</sect1>
<sect1><title>Virtual Master Control API</title>
!Esound/core/vmaster.c
!Iinclude/sound/control.h
</sect1>
</chapter>
<chapter><title>MIDI API</title>
<sect1><title>Raw MIDI API</title>
!Esound/core/rawmidi.c
</sect1>
<sect1><title>MPU401-UART API</title>
!Esound/drivers/mpu401/mpu401_uart.c
</sect1>
</chapter>
<chapter><title>Proc Info API</title>
<sect1><title>Proc Info Interface</title>
!Esound/core/info.c
</sect1>
</chapter>
<chapter><title>Miscellaneous Functions</title>
<sect1><title>Hardware-Dependent Devices API</title>
!Esound/core/hwdep.c
</sect1>
<sect1><title>Jack Abstraction Layer API</title>
!Esound/core/jack.c
</sect1>
<sect1><title>ISA DMA Helpers</title>
!Esound/core/isadma.c
</sect1>
<sect1><title>Other Helper Macros</title>
!Iinclude/sound/core.h
</sect1>
</chapter>
</book>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<book id="LinuxDriversAPI">
<bookinfo>
<title>Linux Device Drivers</title>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation; either
version 2 of the License, or (at your option) any later
version.
</para>
<para>
This program is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
</bookinfo>
<toc></toc>
<chapter id="Basics">
<title>Driver Basics</title>
<sect1><title>Driver Entry and Exit points</title>
!Iinclude/linux/init.h
</sect1>
<sect1><title>Atomic and pointer manipulation</title>
!Iarch/x86/include/asm/atomic_32.h
!Iarch/x86/include/asm/unaligned.h
</sect1>
<sect1><title>Delaying, scheduling, and timer routines</title>
!Iinclude/linux/sched.h
!Ekernel/sched.c
!Ekernel/timer.c
</sect1>
<sect1><title>High-resolution timers</title>
!Iinclude/linux/ktime.h
!Iinclude/linux/hrtimer.h
!Ekernel/hrtimer.c
</sect1>
<sect1><title>Workqueues and Kevents</title>
!Ekernel/workqueue.c
</sect1>
<sect1><title>Internal Functions</title>
!Ikernel/exit.c
!Ikernel/signal.c
!Iinclude/linux/kthread.h
!Ekernel/kthread.c
</sect1>
<sect1><title>Kernel objects manipulation</title>
<!--
X!Iinclude/linux/kobject.h
-->
!Elib/kobject.c
</sect1>
<sect1><title>Kernel utility functions</title>
!Iinclude/linux/kernel.h
!Ekernel/printk.c
!Ekernel/panic.c
!Ekernel/sys.c
!Ekernel/rcupdate.c
</sect1>
<sect1><title>Device Resource Management</title>
!Edrivers/base/devres.c
</sect1>
</chapter>
<chapter id="devdrivers">
<title>Device drivers infrastructure</title>
<sect1><title>Device Drivers Base</title>
<!--
X!Iinclude/linux/device.h
-->
!Edrivers/base/driver.c
!Edrivers/base/core.c
!Edrivers/base/class.c
!Edrivers/base/firmware_class.c
!Edrivers/base/transport_class.c
<!-- Cannot be included, because
attribute_container_add_class_device_adapter
and attribute_container_classdev_to_container
exceed allowed 44 characters maximum
X!Edrivers/base/attribute_container.c
-->
!Edrivers/base/sys.c
<!--
X!Edrivers/base/interface.c
-->
!Edrivers/base/platform.c
!Edrivers/base/bus.c
</sect1>
<sect1><title>Device Drivers Power Management</title>
!Edrivers/base/power/main.c
</sect1>
<sect1><title>Device Drivers ACPI Support</title>
<!-- Internal functions only
X!Edrivers/acpi/sleep/main.c
X!Edrivers/acpi/sleep/wakeup.c
X!Edrivers/acpi/motherboard.c
X!Edrivers/acpi/bus.c
-->
!Edrivers/acpi/scan.c
!Idrivers/acpi/scan.c
<!-- No correct structured comments
X!Edrivers/acpi/pci_bind.c
-->
</sect1>
<sect1><title>Device drivers PnP support</title>
!Idrivers/pnp/core.c
<!-- No correct structured comments
X!Edrivers/pnp/system.c
-->
!Edrivers/pnp/card.c
!Idrivers/pnp/driver.c
!Edrivers/pnp/manager.c
!Edrivers/pnp/support.c
</sect1>
<sect1><title>Userspace IO devices</title>
!Edrivers/uio/uio.c
!Iinclude/linux/uio_driver.h
</sect1>
</chapter>
<chapter id="parportdev">
<title>Parallel Port Devices</title>
!Iinclude/linux/parport.h
!Edrivers/parport/ieee1284.c
!Edrivers/parport/share.c
!Idrivers/parport/daisy.c
</chapter>
<chapter id="message_devices">
<title>Message-based devices</title>
<sect1><title>Fusion message devices</title>
!Edrivers/message/fusion/mptbase.c
!Idrivers/message/fusion/mptbase.c
!Edrivers/message/fusion/mptscsih.c
!Idrivers/message/fusion/mptscsih.c
!Idrivers/message/fusion/mptctl.c
!Idrivers/message/fusion/mptspi.c
!Idrivers/message/fusion/mptfc.c
!Idrivers/message/fusion/mptlan.c
</sect1>
<sect1><title>I2O message devices</title>
!Iinclude/linux/i2o.h
!Idrivers/message/i2o/core.h
!Edrivers/message/i2o/iop.c
!Idrivers/message/i2o/iop.c
!Idrivers/message/i2o/config-osm.c
!Edrivers/message/i2o/exec-osm.c
!Idrivers/message/i2o/exec-osm.c
!Idrivers/message/i2o/bus-osm.c
!Edrivers/message/i2o/device.c
!Idrivers/message/i2o/device.c
!Idrivers/message/i2o/driver.c
!Idrivers/message/i2o/pci.c
!Idrivers/message/i2o/i2o_block.c
!Idrivers/message/i2o/i2o_scsi.c
!Idrivers/message/i2o/i2o_proc.c
</sect1>
</chapter>
<chapter id="snddev">
<title>Sound Devices</title>
!Iinclude/sound/core.h
!Esound/sound_core.c
!Iinclude/sound/pcm.h
!Esound/core/pcm.c
!Esound/core/device.c
!Esound/core/info.c
!Esound/core/rawmidi.c
!Esound/core/sound.c
!Esound/core/memory.c
!Esound/core/pcm_memory.c
!Esound/core/init.c
!Esound/core/isadma.c
!Esound/core/control.c
!Esound/core/pcm_lib.c
!Esound/core/hwdep.c
!Esound/core/pcm_native.c
!Esound/core/memalloc.c
<!-- FIXME: Removed for now since no structured comments in source
X!Isound/sound_firmware.c
-->
</chapter>
<chapter id="uart16x50">
<title>16x50 UART Driver</title>
!Iinclude/linux/serial_core.h
!Edrivers/serial/serial_core.c
!Edrivers/serial/8250.c
</chapter>
<chapter id="fbdev">
<title>Frame Buffer Library</title>
<para>
The frame buffer drivers depend heavily on four data structures.
These structures are declared in include/linux/fb.h. They are
fb_info, fb_var_screeninfo, fb_fix_screeninfo and fb_monospecs.
The last three can be made available to and from userland.
</para>
<para>
fb_info defines the current state of a particular video card.
Inside fb_info, there exists a fb_ops structure which is a
collection of needed functions to make fbdev and fbcon work.
fb_info is only visible to the kernel.
</para>
<para>
fb_var_screeninfo is used to describe the features of a video card
that are user defined. With fb_var_screeninfo, things such as
depth and the resolution may be defined.
</para>
<para>
The next structure is fb_fix_screeninfo. This defines the
properties of a card that are created when a mode is set and can't
be changed otherwise. A good example of this is the start of the
frame buffer memory. This "locks" the address of the frame buffer
memory, so that it cannot be changed or moved.
</para>
<para>
The last structure is fb_monospecs. In the old API, there was
little importance for fb_monospecs. This allowed for forbidden things
such as setting a mode of 800x600 on a fix frequency monitor. With
the new API, fb_monospecs prevents such things, and if used
correctly, can prevent a monitor from being cooked. fb_monospecs
will not be useful until kernels 2.5.x.
</para>
<sect1><title>Frame Buffer Memory</title>
!Edrivers/video/fbmem.c
</sect1>
<!--
<sect1><title>Frame Buffer Console</title>
X!Edrivers/video/console/fbcon.c
</sect1>
-->
<sect1><title>Frame Buffer Colormap</title>
!Edrivers/video/fbcmap.c
</sect1>
<!-- FIXME:
drivers/video/fbgen.c has no docs, which stuffs up the sgml. Comment
out until somebody adds docs. KAO
<sect1><title>Frame Buffer Generic Functions</title>
X!Idrivers/video/fbgen.c
</sect1>
KAO -->
<sect1><title>Frame Buffer Video Mode Database</title>
!Idrivers/video/modedb.c
!Edrivers/video/modedb.c
</sect1>
<sect1><title>Frame Buffer Macintosh Video Mode Database</title>
!Edrivers/video/macmodes.c
</sect1>
<sect1><title>Frame Buffer Fonts</title>
<para>
Refer to the file drivers/video/console/fonts.c for more information.
</para>
<!-- FIXME: Removed for now since no structured comments in source
X!Idrivers/video/console/fonts.c
-->
</sect1>
</chapter>
<chapter id="input_subsystem">
<title>Input Subsystem</title>
!Iinclude/linux/input.h
!Edrivers/input/input.c
!Edrivers/input/ff-core.c
!Edrivers/input/ff-memless.c
</chapter>
<chapter id="spi">
<title>Serial Peripheral Interface (SPI)</title>
<para>
SPI is the "Serial Peripheral Interface", widely used with
embedded systems because it is a simple and efficient
interface: basically a multiplexed shift register.
Its three signal wires hold a clock (SCK, often in the range
of 1-20 MHz), a "Master Out, Slave In" (MOSI) data line, and
a "Master In, Slave Out" (MISO) data line.
SPI is a full duplex protocol; for each bit shifted out the
MOSI line (one per clock) another is shifted in on the MISO line.
Those bits are assembled into words of various sizes on the
way to and from system memory.
An additional chipselect line is usually active-low (nCS);
four signals are normally used for each peripheral, plus
sometimes an interrupt.
</para>
<para>
The SPI bus facilities listed here provide a generalized
interface to declare SPI busses and devices, manage them
according to the standard Linux driver model, and perform
input/output operations.
At this time, only "master" side interfaces are supported,
where Linux talks to SPI peripherals and does not implement
such a peripheral itself.
(Interfaces to support implementing SPI slaves would
necessarily look different.)
</para>
<para>
The programming interface is structured around two kinds of driver,
and two kinds of device.
A "Controller Driver" abstracts the controller hardware, which may
be as simple as a set of GPIO pins or as complex as a pair of FIFOs
connected to dual DMA engines on the other side of the SPI shift
register (maximizing throughput). Such drivers bridge between
whatever bus they sit on (often the platform bus) and SPI, and
expose the SPI side of their device as a
<structname>struct spi_master</structname>.
SPI devices are children of that master, represented as a
<structname>struct spi_device</structname> and manufactured from
<structname>struct spi_board_info</structname> descriptors which
are usually provided by board-specific initialization code.
A <structname>struct spi_driver</structname> is called a
"Protocol Driver", and is bound to a spi_device using normal
driver model calls.
</para>
<para>
The I/O model is a set of queued messages. Protocol drivers
submit one or more <structname>struct spi_message</structname>
objects, which are processed and completed asynchronously.
(There are synchronous wrappers, however.) Messages are
built from one or more <structname>struct spi_transfer</structname>
objects, each of which wraps a full duplex SPI transfer.
A variety of protocol tweaking options are needed, because
different chips adopt very different policies for how they
use the bits transferred with SPI.
</para>
!Iinclude/linux/spi/spi.h
!Fdrivers/spi/spi.c spi_register_board_info
!Edrivers/spi/spi.c
</chapter>
<chapter id="i2c">
<title>I<superscript>2</superscript>C and SMBus Subsystem</title>
<para>
I<superscript>2</superscript>C (or without fancy typography, "I2C")
is an acronym for the "Inter-IC" bus, a simple bus protocol which is
widely used where low data rate communications suffice.
Since it's also a licensed trademark, some vendors use another
name (such as "Two-Wire Interface", TWI) for the same bus.
I2C only needs two signals (SCL for clock, SDA for data), conserving
board real estate and minimizing signal quality issues.
Most I2C devices use seven bit addresses, and bus speeds of up
to 400 kHz; there's a high speed extension (3.4 MHz) that's not yet
found wide use.
I2C is a multi-master bus; open drain signaling is used to
arbitrate between masters, as well as to handshake and to
synchronize clocks from slower clients.
</para>
<para>
The Linux I2C programming interfaces support only the master
side of bus interactions, not the slave side.
The programming interface is structured around two kinds of driver,
and two kinds of device.
An I2C "Adapter Driver" abstracts the controller hardware; it binds
to a physical device (perhaps a PCI device or platform_device) and
exposes a <structname>struct i2c_adapter</structname> representing
each I2C bus segment it manages.
On each I2C bus segment will be I2C devices represented by a
<structname>struct i2c_client</structname>. Those devices will
be bound to a <structname>struct i2c_driver</structname>,
which should follow the standard Linux driver model.
(At this writing, a legacy model is more widely used.)
There are functions to perform various I2C protocol operations; at
this writing all such functions are usable only from task context.
</para>
<para>
The System Management Bus (SMBus) is a sibling protocol. Most SMBus
systems are also I2C conformant. The electrical constraints are
tighter for SMBus, and it standardizes particular protocol messages
and idioms. Controllers that support I2C can also support most
SMBus operations, but SMBus controllers don't support all the protocol
options that an I2C controller will.
There are functions to perform various SMBus protocol operations,
either using I2C primitives or by issuing SMBus commands to
i2c_adapter devices which don't support those I2C operations.
</para>
!Iinclude/linux/i2c.h
!Fdrivers/i2c/i2c-boardinfo.c i2c_register_board_info
!Edrivers/i2c/i2c-core.c
</chapter>
</book>
......@@ -440,6 +440,7 @@ desc->chip->end();
used in the generic IRQ layer.
</para>
!Iinclude/linux/irq.h
!Iinclude/linux/interrupt.h
</chapter>
<chapter id="pubfunctions">
......
......@@ -38,58 +38,6 @@
<toc></toc>
<chapter id="Basics">
<title>Driver Basics</title>
<sect1><title>Driver Entry and Exit points</title>
!Iinclude/linux/init.h
</sect1>
<sect1><title>Atomic and pointer manipulation</title>
!Iarch/x86/include/asm/atomic_32.h
!Iarch/x86/include/asm/unaligned.h
</sect1>
<sect1><title>Delaying, scheduling, and timer routines</title>
!Iinclude/linux/sched.h
!Ekernel/sched.c
!Ekernel/timer.c
</sect1>
<sect1><title>High-resolution timers</title>
!Iinclude/linux/ktime.h
!Iinclude/linux/hrtimer.h
!Ekernel/hrtimer.c
</sect1>
<sect1><title>Workqueues and Kevents</title>
!Ekernel/workqueue.c
</sect1>
<sect1><title>Internal Functions</title>
!Ikernel/exit.c
!Ikernel/signal.c
!Iinclude/linux/kthread.h
!Ekernel/kthread.c
</sect1>
<sect1><title>Kernel objects manipulation</title>
<!--
X!Iinclude/linux/kobject.h
-->
!Elib/kobject.c
</sect1>
<sect1><title>Kernel utility functions</title>
!Iinclude/linux/kernel.h
!Ekernel/printk.c
!Ekernel/panic.c
!Ekernel/sys.c
!Ekernel/rcupdate.c
</sect1>
<sect1><title>Device Resource Management</title>
!Edrivers/base/devres.c
</sect1>
</chapter>
<chapter id="adt">
<title>Data Types</title>
<sect1><title>Doubly Linked Lists</title>
......@@ -242,15 +190,20 @@ X!Ekernel/module.c
!Edrivers/pci/pci.c
!Edrivers/pci/pci-driver.c
!Edrivers/pci/remove.c
!Edrivers/pci/pci-acpi.c
!Edrivers/pci/search.c
!Edrivers/pci/msi.c
!Edrivers/pci/bus.c
!Edrivers/pci/access.c
!Edrivers/pci/irq.c
!Edrivers/pci/htirq.c
<!-- FIXME: Removed for now since no structured comments in source
X!Edrivers/pci/hotplug.c
-->
!Edrivers/pci/probe.c
!Edrivers/pci/slot.c
!Edrivers/pci/rom.c
!Edrivers/pci/iov.c
!Idrivers/pci/pci-sysfs.c
</sect1>
<sect1><title>PCI Hotplug Support Library</title>
!Edrivers/pci/hotplug/pci_hotplug_core.c
......@@ -298,62 +251,6 @@ X!Earch/x86/kernel/mca_32.c
!Ikernel/acct.c
</chapter>
<chapter id="devdrivers">
<title>Device drivers infrastructure</title>
<sect1><title>Device Drivers Base</title>
<!--
X!Iinclude/linux/device.h
-->
!Edrivers/base/driver.c
!Edrivers/base/core.c
!Edrivers/base/class.c
!Edrivers/base/firmware_class.c
!Edrivers/base/transport_class.c
<!-- Cannot be included, because
attribute_container_add_class_device_adapter
and attribute_container_classdev_to_container
exceed allowed 44 characters maximum
X!Edrivers/base/attribute_container.c
-->
!Edrivers/base/sys.c
<!--
X!Edrivers/base/interface.c
-->
!Edrivers/base/platform.c
!Edrivers/base/bus.c
</sect1>
<sect1><title>Device Drivers Power Management</title>
!Edrivers/base/power/main.c
</sect1>
<sect1><title>Device Drivers ACPI Support</title>
<!-- Internal functions only
X!Edrivers/acpi/sleep/main.c
X!Edrivers/acpi/sleep/wakeup.c
X!Edrivers/acpi/motherboard.c
X!Edrivers/acpi/bus.c
-->
!Edrivers/acpi/scan.c
!Idrivers/acpi/scan.c
<!-- No correct structured comments
X!Edrivers/acpi/pci_bind.c
-->
</sect1>
<sect1><title>Device drivers PnP support</title>
!Idrivers/pnp/core.c
<!-- No correct structured comments
X!Edrivers/pnp/system.c
-->
!Edrivers/pnp/card.c
!Idrivers/pnp/driver.c
!Edrivers/pnp/manager.c
!Edrivers/pnp/support.c
</sect1>
<sect1><title>Userspace IO devices</title>
!Edrivers/uio/uio.c
!Iinclude/linux/uio_driver.h
</sect1>
</chapter>
<chapter id="blkdev">
<title>Block Devices</title>
!Eblock/blk-core.c
......@@ -366,7 +263,7 @@ X!Edrivers/pnp/system.c
!Eblock/blk-tag.c
!Iblock/blk-tag.c
!Eblock/blk-integrity.c
!Iblock/blktrace.c
!Ikernel/trace/blktrace.c
!Iblock/genhd.c
!Eblock/genhd.c
</chapter>
......@@ -381,275 +278,6 @@ X!Edrivers/pnp/system.c
!Edrivers/char/misc.c
</chapter>
<chapter id="parportdev">
<title>Parallel Port Devices</title>
!Iinclude/linux/parport.h
!Edrivers/parport/ieee1284.c
!Edrivers/parport/share.c
!Idrivers/parport/daisy.c
</chapter>
<chapter id="message_devices">
<title>Message-based devices</title>
<sect1><title>Fusion message devices</title>
!Edrivers/message/fusion/mptbase.c
!Idrivers/message/fusion/mptbase.c
!Edrivers/message/fusion/mptscsih.c
!Idrivers/message/fusion/mptscsih.c
!Idrivers/message/fusion/mptctl.c
!Idrivers/message/fusion/mptspi.c
!Idrivers/message/fusion/mptfc.c
!Idrivers/message/fusion/mptlan.c
</sect1>
<sect1><title>I2O message devices</title>
!Iinclude/linux/i2o.h
!Idrivers/message/i2o/core.h
!Edrivers/message/i2o/iop.c
!Idrivers/message/i2o/iop.c
!Idrivers/message/i2o/config-osm.c
!Edrivers/message/i2o/exec-osm.c
!Idrivers/message/i2o/exec-osm.c
!Idrivers/message/i2o/bus-osm.c
!Edrivers/message/i2o/device.c
!Idrivers/message/i2o/device.c
!Idrivers/message/i2o/driver.c
!Idrivers/message/i2o/pci.c
!Idrivers/message/i2o/i2o_block.c
!Idrivers/message/i2o/i2o_scsi.c
!Idrivers/message/i2o/i2o_proc.c
</sect1>
</chapter>
<chapter id="snddev">
<title>Sound Devices</title>
!Iinclude/sound/core.h
!Esound/sound_core.c
!Iinclude/sound/pcm.h
!Esound/core/pcm.c
!Esound/core/device.c
!Esound/core/info.c
!Esound/core/rawmidi.c
!Esound/core/sound.c
!Esound/core/memory.c
!Esound/core/pcm_memory.c
!Esound/core/init.c
!Esound/core/isadma.c
!Esound/core/control.c
!Esound/core/pcm_lib.c
!Esound/core/hwdep.c
!Esound/core/pcm_native.c
!Esound/core/memalloc.c
<!-- FIXME: Removed for now since no structured comments in source
X!Isound/sound_firmware.c
-->
</chapter>
<chapter id="uart16x50">
<title>16x50 UART Driver</title>
!Iinclude/linux/serial_core.h
!Edrivers/serial/serial_core.c
!Edrivers/serial/8250.c
</chapter>
<chapter id="fbdev">
<title>Frame Buffer Library</title>
<para>
The frame buffer drivers depend heavily on four data structures.
These structures are declared in include/linux/fb.h. They are
fb_info, fb_var_screeninfo, fb_fix_screeninfo and fb_monospecs.
The last three can be made available to and from userland.
</para>
<para>
fb_info defines the current state of a particular video card.
Inside fb_info, there exists a fb_ops structure which is a
collection of needed functions to make fbdev and fbcon work.
fb_info is only visible to the kernel.
</para>
<para>
fb_var_screeninfo is used to describe the features of a video card
that are user defined. With fb_var_screeninfo, things such as
depth and the resolution may be defined.
</para>
<para>
The next structure is fb_fix_screeninfo. This defines the
properties of a card that are created when a mode is set and can't
be changed otherwise. A good example of this is the start of the
frame buffer memory. This "locks" the address of the frame buffer
memory, so that it cannot be changed or moved.
</para>
<para>
The last structure is fb_monospecs. In the old API, there was
little importance for fb_monospecs. This allowed for forbidden things
such as setting a mode of 800x600 on a fix frequency monitor. With
the new API, fb_monospecs prevents such things, and if used
correctly, can prevent a monitor from being cooked. fb_monospecs
will not be useful until kernels 2.5.x.
</para>
<sect1><title>Frame Buffer Memory</title>
!Edrivers/video/fbmem.c
</sect1>
<!--
<sect1><title>Frame Buffer Console</title>
X!Edrivers/video/console/fbcon.c
</sect1>
-->
<sect1><title>Frame Buffer Colormap</title>
!Edrivers/video/fbcmap.c
</sect1>
<!-- FIXME:
drivers/video/fbgen.c has no docs, which stuffs up the sgml. Comment
out until somebody adds docs. KAO
<sect1><title>Frame Buffer Generic Functions</title>
X!Idrivers/video/fbgen.c
</sect1>
KAO -->
<sect1><title>Frame Buffer Video Mode Database</title>
!Idrivers/video/modedb.c
!Edrivers/video/modedb.c
</sect1>
<sect1><title>Frame Buffer Macintosh Video Mode Database</title>
!Edrivers/video/macmodes.c
</sect1>
<sect1><title>Frame Buffer Fonts</title>
<para>
Refer to the file drivers/video/console/fonts.c for more information.
</para>
<!-- FIXME: Removed for now since no structured comments in source
X!Idrivers/video/console/fonts.c
-->
</sect1>
</chapter>
<chapter id="input_subsystem">
<title>Input Subsystem</title>
!Iinclude/linux/input.h
!Edrivers/input/input.c
!Edrivers/input/ff-core.c
!Edrivers/input/ff-memless.c
</chapter>
<chapter id="spi">
<title>Serial Peripheral Interface (SPI)</title>
<para>
SPI is the "Serial Peripheral Interface", widely used with
embedded systems because it is a simple and efficient
interface: basically a multiplexed shift register.
Its three signal wires hold a clock (SCK, often in the range
of 1-20 MHz), a "Master Out, Slave In" (MOSI) data line, and
a "Master In, Slave Out" (MISO) data line.
SPI is a full duplex protocol; for each bit shifted out the
MOSI line (one per clock) another is shifted in on the MISO line.
Those bits are assembled into words of various sizes on the
way to and from system memory.
An additional chipselect line is usually active-low (nCS);
four signals are normally used for each peripheral, plus
sometimes an interrupt.
</para>
<para>
The SPI bus facilities listed here provide a generalized
interface to declare SPI busses and devices, manage them
according to the standard Linux driver model, and perform
input/output operations.
At this time, only "master" side interfaces are supported,
where Linux talks to SPI peripherals and does not implement
such a peripheral itself.
(Interfaces to support implementing SPI slaves would
necessarily look different.)
</para>
<para>
The programming interface is structured around two kinds of driver,
and two kinds of device.
A "Controller Driver" abstracts the controller hardware, which may
be as simple as a set of GPIO pins or as complex as a pair of FIFOs
connected to dual DMA engines on the other side of the SPI shift
register (maximizing throughput). Such drivers bridge between
whatever bus they sit on (often the platform bus) and SPI, and
expose the SPI side of their device as a
<structname>struct spi_master</structname>.
SPI devices are children of that master, represented as a
<structname>struct spi_device</structname> and manufactured from
<structname>struct spi_board_info</structname> descriptors which
are usually provided by board-specific initialization code.
A <structname>struct spi_driver</structname> is called a
"Protocol Driver", and is bound to a spi_device using normal
driver model calls.
</para>
<para>
The I/O model is a set of queued messages. Protocol drivers
submit one or more <structname>struct spi_message</structname>
objects, which are processed and completed asynchronously.
(There are synchronous wrappers, however.) Messages are
built from one or more <structname>struct spi_transfer</structname>
objects, each of which wraps a full duplex SPI transfer.
A variety of protocol tweaking options are needed, because
different chips adopt very different policies for how they
use the bits transferred with SPI.
</para>
!Iinclude/linux/spi/spi.h
!Fdrivers/spi/spi.c spi_register_board_info
!Edrivers/spi/spi.c
</chapter>
<chapter id="i2c">
<title>I<superscript>2</superscript>C and SMBus Subsystem</title>
<para>
I<superscript>2</superscript>C (or without fancy typography, "I2C")
is an acronym for the "Inter-IC" bus, a simple bus protocol which is
widely used where low data rate communications suffice.
Since it's also a licensed trademark, some vendors use another
name (such as "Two-Wire Interface", TWI) for the same bus.
I2C only needs two signals (SCL for clock, SDA for data), conserving
board real estate and minimizing signal quality issues.
Most I2C devices use seven bit addresses, and bus speeds of up
to 400 kHz; there's a high speed extension (3.4 MHz) that's not yet
found wide use.
I2C is a multi-master bus; open drain signaling is used to
arbitrate between masters, as well as to handshake and to
synchronize clocks from slower clients.
</para>
<para>
The Linux I2C programming interfaces support only the master
side of bus interactions, not the slave side.
The programming interface is structured around two kinds of driver,
and two kinds of device.
An I2C "Adapter Driver" abstracts the controller hardware; it binds
to a physical device (perhaps a PCI device or platform_device) and
exposes a <structname>struct i2c_adapter</structname> representing
each I2C bus segment it manages.
On each I2C bus segment will be I2C devices represented by a
<structname>struct i2c_client</structname>. Those devices will
be bound to a <structname>struct i2c_driver</structname>,
which should follow the standard Linux driver model.
(At this writing, a legacy model is more widely used.)
There are functions to perform various I2C protocol operations; at
this writing all such functions are usable only from task context.
</para>
<para>
The System Management Bus (SMBus) is a sibling protocol. Most SMBus
systems are also I2C conformant. The electrical constraints are
tighter for SMBus, and it standardizes particular protocol messages
and idioms. Controllers that support I2C can also support most
SMBus operations, but SMBus controllers don't support all the protocol
options that an I2C controller will.
There are functions to perform various SMBus protocol operations,
either using I2C primitives or by issuing SMBus commands to
i2c_adapter devices which don't support those I2C operations.
</para>
!Iinclude/linux/i2c.h
!Fdrivers/i2c/i2c-boardinfo.c i2c_register_board_info
!Edrivers/i2c/i2c-core.c
</chapter>
<chapter id="clk">
<title>Clock Framework</title>
......
......@@ -17,8 +17,7 @@
</authorgroup>
<copyright>
<year>2007</year>
<year>2008</year>
<year>2007-2009</year>
<holder>Johannes Berg</holder>
</copyright>
......@@ -165,8 +164,8 @@ usage should require reading the full document.
!Pinclude/net/mac80211.h Frame format
</sect1>
<sect1>
<title>Alignment issues</title>
<para>TBD</para>
<title>Packet alignment</title>
!Pnet/mac80211/rx.c Packet alignment
</sect1>
<sect1>
<title>Calling into mac80211 from interrupts</title>
......@@ -223,6 +222,17 @@ usage should require reading the full document.
!Finclude/net/mac80211.h ieee80211_key_flags
</chapter>
<chapter id="powersave">
<title>Powersave support</title>
!Pinclude/net/mac80211.h Powersave support
</chapter>
<chapter id="beacon-filter">
<title>Beacon filter support</title>
!Pinclude/net/mac80211.h Beacon filter support
!Finclude/net/mac80211.h ieee80211_beacon_loss
</chapter>
<chapter id="qos">
<title>Multiple queues and QoS support</title>
<para>TBD</para>
......
......@@ -117,9 +117,6 @@ static int __init init_procfs_example(void)
rv = -ENOMEM;
goto out;
}
example_dir->owner = THIS_MODULE;
/* create jiffies using convenience function */
jiffies_file = create_proc_read_entry("jiffies",
0444, example_dir,
......@@ -130,8 +127,6 @@ static int __init init_procfs_example(void)
goto no_jiffies;
}
jiffies_file->owner = THIS_MODULE;
/* create foo and bar files using same callback
* functions
*/
......@@ -146,7 +141,6 @@ static int __init init_procfs_example(void)
foo_file->data = &foo_data;
foo_file->read_proc = proc_read_foobar;
foo_file->write_proc = proc_write_foobar;
foo_file->owner = THIS_MODULE;
bar_file = create_proc_entry("bar", 0644, example_dir);
if(bar_file == NULL) {
......@@ -159,7 +153,6 @@ static int __init init_procfs_example(void)
bar_file->data = &bar_data;
bar_file->read_proc = proc_read_foobar;
bar_file->write_proc = proc_write_foobar;
bar_file->owner = THIS_MODULE;
/* create symlink */
symlink = proc_symlink("jiffies_too", example_dir,
......@@ -169,8 +162,6 @@ static int __init init_procfs_example(void)
goto no_symlink;
}
symlink->owner = THIS_MODULE;
/* everything OK */
printk(KERN_INFO "%s %s initialised\n",
MODULE_NAME, MODULE_VERS);
......
......@@ -41,6 +41,19 @@ GPL version 2.
</abstract>
<revhistory>
<revision>
<revnumber>0.8</revnumber>
<date>2008-12-24</date>
<authorinitials>hjk</authorinitials>
<revremark>Added name attributes in mem and portio sysfs directories.
</revremark>
</revision>
<revision>
<revnumber>0.7</revnumber>
<date>2008-12-23</date>
<authorinitials>hjk</authorinitials>
<revremark>Added generic platform drivers and offset attribute.</revremark>
</revision>
<revision>
<revnumber>0.6</revnumber>
<date>2008-12-05</date>
......@@ -297,10 +310,17 @@ interested in translating it, please email me
appear if the size of the mapping is not 0.
</para>
<para>
Each <filename>mapX/</filename> directory contains two read-only files
that show start address and size of the memory:
Each <filename>mapX/</filename> directory contains four read-only files
that show attributes of the memory:
</para>
<itemizedlist>
<listitem>
<para>
<filename>name</filename>: A string identifier for this mapping. This
is optional, the string can be empty. Drivers can set this to make it
easier for userspace to find the correct mapping.
</para>
</listitem>
<listitem>
<para>
<filename>addr</filename>: The address of memory that can be mapped.
......@@ -312,6 +332,16 @@ interested in translating it, please email me
pointed to by addr.
</para>
</listitem>
<listitem>
<para>
<filename>offset</filename>: The offset, in bytes, that has to be
added to the pointer returned by <function>mmap()</function> to get
to the actual device memory. This is important if the device's memory
is not page aligned. Remember that pointers returned by
<function>mmap()</function> are always page aligned, so it is good
style to always add this offset.
</para>
</listitem>
</itemizedlist>
<para>
......@@ -350,10 +380,17 @@ offset = N * getpagesize();
<filename>/sys/class/uio/uioX/portio/</filename>.
</para>
<para>
Each <filename>portX/</filename> directory contains three read-only
files that show start, size, and type of the port region:
Each <filename>portX/</filename> directory contains four read-only
files that show name, start, size, and type of the port region:
</para>
<itemizedlist>
<listitem>
<para>
<filename>name</filename>: A string identifier for this port region.
The string is optional and can be empty. Drivers can set it to make it
easier for userspace to find a certain port region.
</para>
</listitem>
<listitem>
<para>
<filename>start</filename>: The first port of this region.
......@@ -594,6 +631,78 @@ framework to set up sysfs files for this region. Simply leave it alone.
</para>
</sect1>
<sect1 id="using_uio_pdrv">
<title>Using uio_pdrv for platform devices</title>
<para>
In many cases, UIO drivers for platform devices can be handled in a
generic way. In the same place where you define your
<varname>struct platform_device</varname>, you simply also implement
your interrupt handler and fill your
<varname>struct uio_info</varname>. A pointer to this
<varname>struct uio_info</varname> is then used as
<varname>platform_data</varname> for your platform device.
</para>
<para>
You also need to set up an array of <varname>struct resource</varname>
containing addresses and sizes of your memory mappings. This
information is passed to the driver using the
<varname>.resource</varname> and <varname>.num_resources</varname>
elements of <varname>struct platform_device</varname>.
</para>
<para>
You now have to set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv"</varname> to use the generic UIO platform device
driver. This driver will fill the <varname>mem[]</varname> array
according to the resources given, and register the device.
</para>
<para>
The advantage of this approach is that you only have to edit a file
you need to edit anyway. You do not have to create an extra driver.
</para>
</sect1>
<sect1 id="using_uio_pdrv_genirq">
<title>Using uio_pdrv_genirq for platform devices</title>
<para>
Especially in embedded devices, you frequently find chips where the
irq pin is tied to its own dedicated interrupt line. In such cases,
where you can be really sure the interrupt is not shared, we can take
the concept of <varname>uio_pdrv</varname> one step further and use a
generic interrupt handler. That's what
<varname>uio_pdrv_genirq</varname> does.
</para>
<para>
The setup for this driver is the same as described above for
<varname>uio_pdrv</varname>, except that you do not implement an
interrupt handler. The <varname>.handler</varname> element of
<varname>struct uio_info</varname> must remain
<varname>NULL</varname>. The <varname>.irq_flags</varname> element
must not contain <varname>IRQF_SHARED</varname>.
</para>
<para>
You will set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv_genirq"</varname> to use this driver.
</para>
<para>
The generic interrupt handler of <varname>uio_pdrv_genirq</varname>
will simply disable the interrupt line using
<function>disable_irq_nosync()</function>. After doing its work,
userspace can reenable the interrupt by writing 0x00000001 to the UIO
device file. The driver already implements an
<function>irq_control()</function> to make this possible, you must not
implement your own.
</para>
<para>
Using <varname>uio_pdrv_genirq</varname> not only saves a few lines of
interrupt handler code. You also do not need to know anything about
the chip's internal registers to create the kernel part of the driver.
All you need to know is the irq number of the pin the chip is
connected to.
</para>
</sect1>
</chapter>
<chapter id="userspace_driver" xreflabel="Writing a driver in user space">
......
此差异已折叠。
[ NOTE: The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA
interface (see Documentation/DMA-mapping.txt). They continue
superseded by the functionality provided by the PCI DMA interface
(see Documentation/PCI/PCI-DMA-mapping.txt). They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12 ]
......
此差异已折叠。
......@@ -93,7 +93,7 @@ the PCI Express Port Bus driver from loading a service driver.
int pcie_port_service_register(struct pcie_port_service_driver *new)
This API replaces the Linux Driver Model's pci_module_init API. A
This API replaces the Linux Driver Model's pci_register_driver API. A
service driver should always calls pcie_port_service_register at
module init. Note that after service driver being loaded, calls
such as pci_enable_device(dev) and pci_set_master(dev) are no longer
......
PCI Express I/O Virtualization Howto
Copyright (C) 2009 Intel Corporation
Yu Zhao <yu.zhao@intel.com>
1. Overview
1.1 What is SR-IOV
Single Root I/O Virtualization (SR-IOV) is a PCI Express Extended
capability which makes one physical device appear as multiple virtual
devices. The physical device is referred to as Physical Function (PF)
while the virtual devices are referred to as Virtual Functions (VF).
Allocation of the VF can be dynamically controlled by the PF via
registers encapsulated in the capability. By default, this feature is
not enabled and the PF behaves as traditional PCIe device. Once it's
turned on, each VF's PCI configuration space can be accessed by its own
Bus, Device and Function Number (Routing ID). And each VF also has PCI
Memory Space, which is used to map its register set. VF device driver
operates on the register set so it can be functional and appear as a
real existing PCI device.
2. User Guide
2.1 How can I enable SR-IOV capability
The device driver (PF driver) will control the enabling and disabling
of the capability via API provided by SR-IOV core. If the hardware
has SR-IOV capability, loading its PF driver would enable it and all
VFs associated with the PF.
2.2 How can I use the Virtual Functions
The VF is treated as hot-plugged PCI devices in the kernel, so they
should be able to work in the same way as real PCI devices. The VF
requires device driver that is same as a normal PCI device's.
3. Developer Guide
3.1 SR-IOV API
To enable SR-IOV capability:
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
'nr_virtfn' is number of VFs to be enabled.
To disable SR-IOV capability:
void pci_disable_sriov(struct pci_dev *dev);
To notify SR-IOV core of Virtual Function Migration:
irqreturn_t pci_sriov_migration(struct pci_dev *dev);
3.2 Usage example
Following piece of code illustrates the usage of the SR-IOV API.
static int __devinit dev_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
pci_enable_sriov(dev, NR_VIRTFN);
...
return 0;
}
static void __devexit dev_remove(struct pci_dev *dev)
{
pci_disable_sriov(dev);
...
}
static int dev_suspend(struct pci_dev *dev, pm_message_t state)
{
...
return 0;
}
static int dev_resume(struct pci_dev *dev)
{
...
return 0;
}
static void dev_shutdown(struct pci_dev *dev)
{
...
}
static struct pci_driver dev_driver = {
.name = "SR-IOV Physical Function driver",
.id_table = dev_id_table,
.probe = dev_probe,
.remove = __devexit_p(dev_remove),
.suspend = dev_suspend,
.resume = dev_resume,
.shutdown = dev_shutdown,
};
......@@ -298,3 +298,15 @@ over a rather long period of time, but improvements are always welcome!
Note that, rcu_assign_pointer() and rcu_dereference() relate to
SRCU just as they do to other forms of RCU.
15. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation. It is
therefore critically important to -first- remove any path
that readers can follow that could be affected by the
destructive operation, and -only- -then- invoke call_rcu(),
synchronize_rcu(), or friends.
Because these primitives only wait for pre-existing readers,
it is the caller's responsibility to guarantee safety to
any subsequent readers.
......@@ -118,7 +118,7 @@ Following are the RCU equivalents for these two functions:
list_for_each_entry(e, list, list) {
if (!audit_compare_rule(rule, &e->rule)) {
list_del_rcu(&e->list);
call_rcu(&e->rcu, audit_free_rule, e);
call_rcu(&e->rcu, audit_free_rule);
return 0;
}
}
......@@ -206,7 +206,7 @@ RCU ("read-copy update") its name. The RCU code is as follows:
ne->rule.action = newaction;
ne->rule.file_count = newfield_count;
list_replace_rcu(e, ne);
call_rcu(&e->rcu, audit_free_rule, e);
call_rcu(&e->rcu, audit_free_rule);
return 0;
}
}
......@@ -283,7 +283,7 @@ flag under the spinlock as follows:
list_del_rcu(&e->list);
e->deleted = 1;
spin_unlock(&e->lock);
call_rcu(&e->rcu, audit_free_rule, e);
call_rcu(&e->rcu, audit_free_rule);
return 0;
}
}
......
......@@ -81,7 +81,7 @@ o I hear that RCU needs work in order to support realtime kernels?
This work is largely completed. Realtime-friendly RCU can be
enabled via the CONFIG_PREEMPT_RCU kernel configuration parameter.
However, work is in progress for enabling priority boosting of
preempted RCU read-side critical sections.This is needed if you
preempted RCU read-side critical sections. This is needed if you
have CPU-bound realtime threads.
o Where can I find more information on RCU?
......
......@@ -21,7 +21,7 @@ if (obj) {
/*
* Because a writer could delete object, and a writer could
* reuse these object before the RCU grace period, we
* must check key after geting the reference on object
* must check key after getting the reference on object
*/
if (obj->key != key) { // not the object we expected
put_ref(obj);
......@@ -117,7 +117,7 @@ a race (some writer did a delete and/or a move of an object
to another chain) checking the final 'nulls' value if
the lookup met the end of chain. If final 'nulls' value
is not the slot number, then we must restart the lookup at
the begining. If the object was moved to same chain,
the beginning. If the object was moved to the same chain,
then the reader doesnt care : It might eventually
scan the list again without harm.
......
......@@ -184,14 +184,16 @@ length. Single character labels using special characters, that being anything
other than a letter or digit, are reserved for use by the Smack development
team. Smack labels are unstructured, case sensitive, and the only operation
ever performed on them is comparison for equality. Smack labels cannot
contain unprintable characters or the "/" (slash) character.
contain unprintable characters or the "/" (slash) character. Smack labels
cannot begin with a '-', which is reserved for special options.
There are some predefined labels:
_ Pronounced "floor", a single underscore character.
^ Pronounced "hat", a single circumflex character.
* Pronounced "star", a single asterisk character.
? Pronounced "huh", a single question mark character.
_ Pronounced "floor", a single underscore character.
^ Pronounced "hat", a single circumflex character.
* Pronounced "star", a single asterisk character.
? Pronounced "huh", a single question mark character.
@ Pronounced "Internet", a single at sign character.
Every task on a Smack system is assigned a label. System tasks, such as
init(8) and systems daemons, are run with the floor ("_") label. User tasks
......@@ -412,6 +414,36 @@ sockets.
A privileged program may set this to match the label of another
task with which it hopes to communicate.
Smack Netlabel Exceptions
You will often find that your labeled application has to talk to the outside,
unlabeled world. To do this there's a special file /smack/netlabel where you can
add some exceptions in the form of :
@IP1 LABEL1 or
@IP2/MASK LABEL2
It means that your application will have unlabeled access to @IP1 if it has
write access on LABEL1, and access to the subnet @IP2/MASK if it has write
access on LABEL2.
Entries in the /smack/netlabel file are matched by longest mask first, like in
classless IPv4 routing.
A special label '@' and an option '-CIPSO' can be used there :
@ means Internet, any application with any label has access to it
-CIPSO means standard CIPSO networking
If you don't know what CIPSO is and don't plan to use it, you can just do :
echo 127.0.0.1 -CIPSO > /smack/netlabel
echo 0.0.0.0/0 @ > /smack/netlabel
If you use CIPSO on your 192.168.0.0/16 local network and need also unlabeled
Internet access, you can have :
echo 127.0.0.1 -CIPSO > /smack/netlabel
echo 192.168.0.0/16 -CIPSO > /smack/netlabel
echo 0.0.0.0/0 @ > /smack/netlabel
Writing Applications for Smack
There are three sorts of applications that will run on a Smack system. How an
......
......@@ -392,6 +392,10 @@ int main(int argc, char *argv[])
goto err;
}
}
if (!maskset && !tid && !containerset) {
usage();
goto err;
}
do {
int i;
......
......@@ -40,13 +40,13 @@ Resuming
Machine Support
---------------
The machine specific functions must call the s3c2410_pm_init() function
The machine specific functions must call the s3c_pm_init() function
to say that its bootloader is capable of resuming. This can be as
simple as adding the following to the machine's definition:
INITMACHINE(s3c2410_pm_init)
INITMACHINE(s3c_pm_init)
A board can do its own setup before calling s3c2410_pm_init, if it
A board can do its own setup before calling s3c_pm_init, if it
needs to setup anything else for power management support.
There is currently no support for over-riding the default method of
......@@ -74,7 +74,7 @@ statuc void __init machine_init(void)
enable_irq_wake(IRQ_EINT0);
s3c2410_pm_init();
s3c_pm_init();
}
......
......@@ -29,7 +29,14 @@ ffff0000 ffff0fff CPU vector page.
CPU supports vector relocation (control
register V bit.)
ffc00000 fffeffff DMA memory mapping region. Memory returned
fffe0000 fffeffff XScale cache flush area. This is used
in proc-xscale.S to flush the whole data
cache. Free for other usage on non-XScale.
fff00000 fffdffff Fixmap mapping region. Addresses provided
by fix_to_virt() will be located here.
ffc00000 ffefffff DMA memory mapping region. Memory returned
by the dma_alloc_xxx functions will be
dynamically mapped here.
......
......@@ -186,8 +186,9 @@ a virtual address mapping (unlike the earlier scheme of virtual address
do not have a corresponding kernel virtual address space mapping) and
low-memory pages.
Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
aspects and mapping of scatter gather lists, and support for 64 bit PCI.
Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion
on PCI high mem DMA aspects and mapping of scatter gather lists, and support
for 64 bit PCI.
Special handling is required only for cases where i/o needs to happen on
pages at physical memory addresses beyond what the device can support. In these
......@@ -953,14 +954,14 @@ elevator_allow_merge_fn called whenever the block layer determines
results in some sort of conflict internally,
this hook allows it to do that.
elevator_dispatch_fn fills the dispatch queue with ready requests.
elevator_dispatch_fn* fills the dispatch queue with ready requests.
I/O schedulers are free to postpone requests by
not filling the dispatch queue unless @force
is non-zero. Once dispatched, I/O schedulers
are not allowed to manipulate the requests -
they belong to generic dispatch queue.
elevator_add_req_fn called to add a new request into the scheduler
elevator_add_req_fn* called to add a new request into the scheduler
elevator_queue_empty_fn returns true if the merge queue is empty.
Drivers shouldn't use this, but rather check
......@@ -990,7 +991,7 @@ elevator_activate_req_fn Called when device driver first sees a request.
elevator_deactivate_req_fn Called when device driver decides to delay
a request by requeueing it.
elevator_init_fn
elevator_init_fn*
elevator_exit_fn Allocate and free any elevator specific storage
for a queue.
......@@ -1039,23 +1040,21 @@ Front merges are handled by the binary trees in AS and deadline schedulers.
iii. Plugging the queue to batch requests in anticipation of opportunities for
merge/sort optimizations
This is just the same as in 2.4 so far, though per-device unplugging
support is anticipated for 2.5. Also with a priority-based i/o scheduler,
such decisions could be based on request priorities.
Plugging is an approach that the current i/o scheduling algorithm resorts to so
that it collects up enough requests in the queue to be able to take
advantage of the sorting/merging logic in the elevator. If the
queue is empty when a request comes in, then it plugs the request queue
(sort of like plugging the bottom of a vessel to get fluid to build up)
(sort of like plugging the bath tub of a vessel to get fluid to build up)
till it fills up with a few more requests, before starting to service
the requests. This provides an opportunity to merge/sort the requests before
passing them down to the device. There are various conditions when the queue is
unplugged (to open up the flow again), either through a scheduled task or
could be on demand. For example wait_on_buffer sets the unplugging going
(by running tq_disk) so the read gets satisfied soon. So in the read case,
the queue gets explicitly unplugged as part of waiting for completion,
in fact all queues get unplugged as a side-effect.
through sync_buffer() running blk_run_address_space(mapping). Or the caller
can do it explicity through blk_unplug(bdev). So in the read case,
the queue gets explicitly unplugged as part of waiting for completion on that
buffer. For page driven IO, the address space ->sync_page() takes care of
doing the blk_run_address_space().
Aside:
This is kind of controversial territory, as it's not clear if plugging is
......@@ -1066,11 +1065,6 @@ Aside:
multi-page bios being queued in one shot, we may not need to wait to merge
a big request from the broken up pieces coming by.
Per-queue granularity unplugging (still a Todo) may help reduce some of the
concerns with just a single tq_disk flush approach. Something like
blk_kick_queue() to unplug a specific queue (right away ?)
or optionally, all queues, is in the plan.
4.4 I/O contexts
I/O contexts provide a dynamically allocated per process data area. They may
be used in I/O schedulers, and in the block layer (could be used for IO statis,
......
Queue sysfs files
=================
This text file will detail the queue files that are located in the sysfs tree
for each block device. Note that stacked devices typically do not export
any settings, since their queue merely functions are a remapping target.
These files are the ones found in the /sys/block/xxx/queue/ directory.
Files denoted with a RO postfix are readonly and the RW postfix means
read-write.
hw_sector_size (RO)
-------------------
This is the hardware sector size of the device, in bytes.
max_hw_sectors_kb (RO)
----------------------
This is the maximum number of kilobytes supported in a single data transfer.
max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow
for a filesystem request. Must be smaller than or equal to the maximum
size allowed by the hardware.
nomerges (RW)
-------------
This enables the user to disable the lookup logic involved with IO merging
requests in the block layer. Merging may still occur through a direct
1-hit cache, since that comes for (almost) free. The IO scheduler will not
waste cycles doing tree/hash lookups for merges if nomerges is 1. Defaults
to 0, enabling all merges.
nr_requests (RW)
----------------
This controls how many requests may be allocated in the block layer for
read or write requests. Note that the total allocated number may be twice
this amount, since it applies only to reads or writes (not the accumulated
sum).
read_ahead_kb (RW)
------------------
Maximum number of kilobytes to read-ahead for filesystems on this block
device.
rq_affinity (RW)
----------------
If this option is enabled, the block layer will migrate request completions
to the CPU that originally submitted the request. For some workloads
this provides a significant reduction in CPU cycles due to caching effects.
scheduler (RW)
--------------
When read, this file will display the current and available IO schedulers
for this block device. The currently active IO scheduler will be enclosed
in [] brackets. Writing an IO scheduler name to this file will switch
control of this block device to that new IO scheduler. Note that writing
an IO scheduler name to this file will attempt to load that IO scheduler
module, if it isn't already present in the system.
Jens Axboe <jens.axboe@oracle.com>, February 2009
......@@ -35,9 +35,3 @@ noop anticipatory deadline [cfq]
# echo anticipatory > /sys/block/hda/queue/scheduler
# cat /sys/block/hda/queue/scheduler
noop [anticipatory] deadline cfq
Each io queue has a set of io scheduler tunables associated with it. These
tunables control how the io scheduler works. You can find these entries
in:
/sys/block/<device>/queue/iosched
......@@ -8,6 +8,8 @@ cpqarray.txt
- info on using Compaq's SMART2 Intelligent Disk Array Controllers.
floppy.txt
- notes and driver options for the floppy disk driver.
mflash.txt
- info on mGine m(g)flash driver for linux.
nbd.txt
- info on a TCP implementation of a network block device.
paride.txt
......
This document describes m[g]flash support in linux.
Contents
1. Overview
2. Reserved area configuration
3. Example of mflash platform driver registration
1. Overview
Mflash and gflash are embedded flash drive. The only difference is mflash is
MCP(Multi Chip Package) device. These two device operate exactly same way.
So the rest mflash repersents mflash and gflash altogether.
Internally, mflash has nand flash and other hardware logics and supports
2 different operation (ATA, IO) modes. ATA mode doesn't need any new
driver and currently works well under standard IDE subsystem. Actually it's
one chip SSD. IO mode is ATA-like custom mode for the host that doesn't have
IDE interface.
Followings are brief descriptions about IO mode.
A. IO mode based on ATA protocol and uses some custom command. (read confirm,
write confirm)
B. IO mode uses SRAM bus interface.
C. IO mode supports 4kB boot area, so host can boot from mflash.
2. Reserved area configuration
If host boot from mflash, usually needs raw area for boot loader image. All of
the mflash's block device operation will be taken this value as start offset.
Note that boot loader's size of reserved area and kernel configuration value
must be same.
3. Example of mflash platform driver registration
Working mflash is very straight forward. Adding platform device stuff to board
configuration file is all. Here is some pseudo example.
static struct mg_drv_data mflash_drv_data = {
/* If you want to polling driver set to 1 */
.use_polling = 0,
/* device attribution */
.dev_attr = MG_BOOT_DEV
};
static struct resource mg_mflash_rsc[] = {
/* Base address of mflash */
[0] = {
.start = 0x08000000,
.end = 0x08000000 + SZ_64K - 1,
.flags = IORESOURCE_MEM
},
/* mflash interrupt pin */
[1] = {
.start = IRQ_GPIO(84),
.end = IRQ_GPIO(84),
.flags = IORESOURCE_IRQ
},
/* mflash reset pin */
[2] = {
.start = 43,
.end = 43,
.name = MG_RST_PIN,
.flags = IORESOURCE_IO
},
/* mflash reset-out pin
* If you use mflash as storage device (i.e. other than MG_BOOT_DEV),
* should assign this */
[3] = {
.start = 51,
.end = 51,
.name = MG_RSTOUT_PIN,
.flags = IORESOURCE_IO
}
};
static struct platform_device mflash_dev = {
.name = MG_DEV_NAME,
.id = -1,
.dev = {
.platform_data = &mflash_drv_data,
},
.num_resources = ARRAY_SIZE(mg_mflash_rsc),
.resource = mg_mflash_rsc
};
platform_device_register(&mflash_dev);
00-INDEX
- this file
cgroups.txt
- Control Groups definition, implementation details, examples and API.
cpuacct.txt
- CPU Accounting Controller; account CPU usage for groups of tasks.
cpusets.txt
- documents the cpusets feature; assign CPUs and Mem to a set of tasks.
devices.txt
- Device Whitelist Controller; description, interface and security.
freezer-subsystem.txt
- checkpointing; rationale to not use signals, interface.
memcg_test.txt
- Memory Resource Controller; implementation details.
memory.txt
- Memory Resource Controller; design, accounting, interface, testing.
resource_counter.txt
- Resource Counter API.
CGROUPS
-------
Written by Paul Menage <menage@google.com> based on Documentation/cpusets.txt
Written by Paul Menage <menage@google.com> based on
Documentation/cgroups/cpusets.txt
Original copyright statements from cpusets.txt:
Portions Copyright (C) 2004 BULL SA.
......@@ -55,7 +56,7 @@ hierarchy, and a set of subsystems; each subsystem has system-specific
state attached to each cgroup in the hierarchy. Each hierarchy has
an instance of the cgroup virtual filesystem associated with it.
At any one time there may be multiple active hierachies of task
At any one time there may be multiple active hierarchies of task
cgroups. Each hierarchy is a partition of all tasks in the system.
User level code may create and destroy cgroups by name in an
......@@ -68,7 +69,7 @@ On their own, the only use for cgroups is for simple job
tracking. The intention is that other subsystems hook into the generic
cgroup support to provide new attributes for cgroups, such as
accounting/limiting the resources which processes in a cgroup can
access. For example, cpusets (see Documentation/cpusets.txt) allows
access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allows
you to associate a set of CPUs and a set of memory nodes with the
tasks in each cgroup.
......@@ -123,10 +124,10 @@ following lines:
/ \
Prof (15%) students (5%)
Browsers like firefox/lynx go into the WWW network class, while (k)nfsd go
Browsers like Firefox/Lynx go into the WWW network class, while (k)nfsd go
into NFS network class.
At the same time firefox/lynx will share an appropriate CPU/Memory class
At the same time Firefox/Lynx will share an appropriate CPU/Memory class
depending on who launched it (prof/student).
With the ability to classify tasks differently for different resources
......@@ -251,10 +252,8 @@ cgroup file system directories.
When a task is moved from one cgroup to another, it gets a new
css_set pointer - if there's an already existing css_set with the
desired collection of cgroups then that group is reused, else a new
css_set is allocated. Note that the current implementation uses a
linear search to locate an appropriate existing css_set, so isn't
very efficient. A future version will use a hash table for better
performance.
css_set is allocated. The appropriate existing css_set is located by
looking into a hash table.
To allow access from a cgroup to the css_sets (and hence tasks)
that comprise it, a set of cg_cgroup_link objects form a lattice;
......@@ -326,7 +325,7 @@ and then start a subshell 'sh' in that cgroup:
Creating, modifying, using the cgroups can be done through the cgroup
virtual filesystem.
To mount a cgroup hierarchy will all available subsystems, type:
To mount a cgroup hierarchy with all available subsystems, type:
# mount -t cgroup xxx /dev/cgroup
The "xxx" is not interpreted by the cgroup code, but will appear in
......@@ -334,12 +333,23 @@ The "xxx" is not interpreted by the cgroup code, but will appear in
To mount a cgroup hierarchy with just the cpuset and numtasks
subsystems, type:
# mount -t cgroup -o cpuset,numtasks hier1 /dev/cgroup
# mount -t cgroup -o cpuset,memory hier1 /dev/cgroup
To change the set of subsystems bound to a mounted hierarchy, just
remount with different options:
# mount -o remount,cpuset,ns hier1 /dev/cgroup
# mount -o remount,cpuset,ns /dev/cgroup
Now memory is removed from the hierarchy and ns is added.
Note this will add ns to the hierarchy but won't remove memory or
cpuset, because the new options are appended to the old ones:
# mount -o remount,ns /dev/cgroup
To Specify a hierarchy's release_agent:
# mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \
xxx /dev/cgroup
Note that specifying 'release_agent' more than once will return failure.
Note that changing the set of subsystems is currently only supported
when the hierarchy consists of a single (root) cgroup. Supporting
......@@ -350,6 +360,11 @@ Then under /dev/cgroup you can find a tree that corresponds to the
tree of the cgroups in the system. For instance, /dev/cgroup
is the cgroup that holds the whole system.
If you want to change the value of release_agent:
# echo "/sbin/new_release_agent" > /dev/cgroup/release_agent
It can also be changed via remount.
If you want to create a new cgroup under /dev/cgroup:
# cd /dev/cgroup
# mkdir my_cgroup
......@@ -477,11 +492,13 @@ cgroup->parent is still valid. (Note - can also be called for a
newly-created cgroup if an error occurs after this subsystem's
create() method has been called for the new cgroup).
void pre_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp);
int pre_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp);
Called before checking the reference count on each subsystem. This may
be useful for subsystems which have some extra references even if
there are not tasks in the cgroup.
there are not tasks in the cgroup. If pre_destroy() returns error code,
rmdir() will fail with it. From this behavior, pre_destroy() can be
called multiple times against a cgroup.
int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
struct task_struct *task)
......@@ -522,7 +539,7 @@ always handled well.
void post_clone(struct cgroup_subsys *ss, struct cgroup *cgrp)
(cgroup_mutex held by caller)
Called at the end of cgroup_clone() to do any paramater
Called at the end of cgroup_clone() to do any parameter
initialization which might be required before a task could attach. For
example in cpusets, no task may attach before 'cpus' and 'mems' are set
up.
......
CPU Accounting Controller
-------------------------
The CPU accounting controller is used to group tasks using cgroups and
account the CPU usage of these groups of tasks.
The CPU accounting controller supports multi-hierarchy groups. An accounting
group accumulates the CPU usage of all of its child groups and the tasks
directly present in its group.
Accounting groups can be created by first mounting the cgroup filesystem.
# mkdir /cgroups
# mount -t cgroup -ocpuacct none /cgroups
With the above step, the initial or the parent accounting group
becomes visible at /cgroups. At bootup, this group includes all the
tasks in the system. /cgroups/tasks lists the tasks in this cgroup.
/cgroups/cpuacct.usage gives the CPU time (in nanoseconds) obtained by
this group which is essentially the CPU time obtained by all the tasks
in the system.
New accounting groups can be created under the parent group /cgroups.
# cd /cgroups
# mkdir g1
# echo $$ > g1
The above steps create a new group g1 and move the current shell
process (bash) into it. CPU time consumed by this bash and its children
can be obtained from g1/cpuacct.usage and the same is accumulated in
/cgroups/cpuacct.usage also.
cpuacct.stat file lists a few statistics which further divide the
CPU time obtained by the cgroup into user and system times. Currently
the following statistics are supported:
user: Time spent by tasks of the cgroup in user mode.
system: Time spent by tasks of the cgroup in kernel mode.
user and system are in USER_HZ unit.
cpuacct controller uses percpu_counter interface to collect user and
system times. This has two side effects:
- It is theoretically possible to see wrong values for user and system times.
This is because percpu_counter_read() on 32bit systems isn't safe
against concurrent writes.
- It is possible to see slightly outdated values for user and system times
due to the batch processing nature of percpu_counter.
此差异已折叠。
Device Whitelist Controller
1. Description:
Implement a cgroup to track and enforce open and mknod restrictions
on device files. A device cgroup associates a device access
whitelist with each cgroup. A whitelist entry has 4 fields.
'type' is a (all), c (char), or b (block). 'all' means it applies
to all types and all major and minor numbers. Major and minor are
either an integer or * for all. Access is a composition of r
(read), w (write), and m (mknod).
The root device cgroup starts with rwm to 'all'. A child device
cgroup gets a copy of the parent. Administrators can then remove
devices from the whitelist or add new entries. A child cgroup can
never receive a device access which is denied by its parent. However
when a device access is removed from a parent it will not also be
removed from the child(ren).
2. User Interface
An entry is added using devices.allow, and removed using
devices.deny. For instance
echo 'c 1:3 mr' > /cgroups/1/devices.allow
allows cgroup 1 to read and mknod the device usually known as
/dev/null. Doing
echo a > /cgroups/1/devices.deny
will remove the default 'a *:* rwm' entry. Doing
echo a > /cgroups/1/devices.allow
will add the 'a *:* rwm' entry to the whitelist.
3. Security
Any task can move itself between cgroups. This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this. We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD. We may want to just refuse moving to a cgroup which
isn't a descendant of the current one. Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.
CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup. (Again we'll probably want to change that).
A cgroup may not be granted more permissions than the cgroup's
parent has.
Memory Resource Controller(Memcg) Implementation Memo.
Last Updated: 2009/1/20
Base Kernel Version: based on 2.6.29-rc2.
Because VM is getting complex (one of reasons is memcg...), memcg's behavior
is complex. This is a document for memcg's internal behavior.
Please note that implementation details can be changed.
(*) Topics on API should be in Documentation/cgroups/memory.txt)
0. How to record usage ?
2 objects are used.
page_cgroup ....an object per page.
Allocated at boot or memory hotplug. Freed at memory hot removal.
swap_cgroup ... an entry per swp_entry.
Allocated at swapon(). Freed at swapoff().
The page_cgroup has USED bit and double count against a page_cgroup never
occurs. swap_cgroup is used only when a charged page is swapped-out.
1. Charge
a page/swp_entry may be charged (usage += PAGE_SIZE) at
mem_cgroup_newpage_charge()
Called at new page fault and Copy-On-Write.
mem_cgroup_try_charge_swapin()
Called at do_swap_page() (page fault on swap entry) and swapoff.
Followed by charge-commit-cancel protocol. (With swap accounting)
At commit, a charge recorded in swap_cgroup is removed.
mem_cgroup_cache_charge()
Called at add_to_page_cache()
mem_cgroup_cache_charge_swapin()
Called at shmem's swapin.
mem_cgroup_prepare_migration()
Called before migration. "extra" charge is done and followed by
charge-commit-cancel protocol.
At commit, charge against oldpage or newpage will be committed.
2. Uncharge
a page/swp_entry may be uncharged (usage -= PAGE_SIZE) by
mem_cgroup_uncharge_page()
Called when an anonymous page is fully unmapped. I.e., mapcount goes
to 0. If the page is SwapCache, uncharge is delayed until
mem_cgroup_uncharge_swapcache().
mem_cgroup_uncharge_cache_page()
Called when a page-cache is deleted from radix-tree. If the page is
SwapCache, uncharge is delayed until mem_cgroup_uncharge_swapcache().
mem_cgroup_uncharge_swapcache()
Called when SwapCache is removed from radix-tree. The charge itself
is moved to swap_cgroup. (If mem+swap controller is disabled, no
charge to swap occurs.)
mem_cgroup_uncharge_swap()
Called when swp_entry's refcnt goes down to 0. A charge against swap
disappears.
mem_cgroup_end_migration(old, new)
At success of migration old is uncharged (if necessary), a charge
to new page is committed. At failure, charge to old page is committed.
3. charge-commit-cancel
In some case, we can't know this "charge" is valid or not at charging
(because of races).
To handle such case, there are charge-commit-cancel functions.
mem_cgroup_try_charge_XXX
mem_cgroup_commit_charge_XXX
mem_cgroup_cancel_charge_XXX
these are used in swap-in and migration.
At try_charge(), there are no flags to say "this page is charged".
at this point, usage += PAGE_SIZE.
At commit(), the function checks the page should be charged or not
and set flags or avoid charging.(usage -= PAGE_SIZE)
At cancel(), simply usage -= PAGE_SIZE.
Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
4. Anonymous
Anonymous page is newly allocated at
- page fault into MAP_ANONYMOUS mapping.
- Copy-On-Write.
It is charged right after it's allocated before doing any page table
related operations. Of course, it's uncharged when another page is used
for the fault address.
At freeing anonymous page (by exit() or munmap()), zap_pte() is called
and pages for ptes are freed one by one.(see mm/memory.c). Uncharges
are done at page_remove_rmap() when page_mapcount() goes down to 0.
Another page freeing is by page-reclaim (vmscan.c) and anonymous
pages are swapped out. In this case, the page is marked as
PageSwapCache(). uncharge() routine doesn't uncharge the page marked
as SwapCache(). It's delayed until __delete_from_swap_cache().
4.1 Swap-in.
At swap-in, the page is taken from swap-cache. There are 2 cases.
(a) If the SwapCache is newly allocated and read, it has no charges.
(b) If the SwapCache has been mapped by processes, it has been
charged already.
This swap-in is one of the most complicated work. In do_swap_page(),
following events occur when pte is unchanged.
(1) the page (SwapCache) is looked up.
(2) lock_page()
(3) try_charge_swapin()
(4) reuse_swap_page() (may call delete_swap_cache())
(5) commit_charge_swapin()
(6) swap_free().
Considering following situation for example.
(A) The page has not been charged before (2) and reuse_swap_page()
doesn't call delete_from_swap_cache().
(B) The page has not been charged before (2) and reuse_swap_page()
calls delete_from_swap_cache().
(C) The page has been charged before (2) and reuse_swap_page() doesn't
call delete_from_swap_cache().
(D) The page has been charged before (2) and reuse_swap_page() calls
delete_from_swap_cache().
memory.usage/memsw.usage changes to this page/swp_entry will be
Case (A) (B) (C) (D)
Event
Before (2) 0/ 1 0/ 1 1/ 1 1/ 1
===========================================
(3) +1/+1 +1/+1 +1/+1 +1/+1
(4) - 0/ 0 - -1/ 0
(5) 0/-1 0/ 0 -1/-1 0/ 0
(6) - 0/-1 - 0/-1
===========================================
Result 1/ 1 1/ 1 1/ 1 1/ 1
In any cases, charges to this page should be 1/ 1.
4.2 Swap-out.
At swap-out, typical state transition is below.
(a) add to swap cache. (marked as SwapCache)
swp_entry's refcnt += 1.
(b) fully unmapped.
swp_entry's refcnt += # of ptes.
(c) write back to swap.
(d) delete from swap cache. (remove from SwapCache)
swp_entry's refcnt -= 1.
At (b), the page is marked as SwapCache and not uncharged.
At (d), the page is removed from SwapCache and a charge in page_cgroup
is moved to swap_cgroup.
Finally, at task exit,
(e) zap_pte() is called and swp_entry's refcnt -=1 -> 0.
Here, a charge in swap_cgroup disappears.
5. Page Cache
Page Cache is charged at
- add_to_page_cache_locked().
uncharged at
- __remove_from_page_cache().
The logic is very clear. (About migration, see below)
Note: __remove_from_page_cache() is called by remove_from_page_cache()
and __remove_mapping().
6. Shmem(tmpfs) Page Cache
Memcg's charge/uncharge have special handlers of shmem. The best way
to understand shmem's page state transition is to read mm/shmem.c.
But brief explanation of the behavior of memcg around shmem will be
helpful to understand the logic.
Shmem's page (just leaf page, not direct/indirect block) can be on
- radix-tree of shmem's inode.
- SwapCache.
- Both on radix-tree and SwapCache. This happens at swap-in
and swap-out,
It's charged when...
- A new page is added to shmem's radix-tree.
- A swp page is read. (move a charge from swap_cgroup to page_cgroup)
It's uncharged when
- A page is removed from radix-tree and not SwapCache.
- When SwapCache is removed, a charge is moved to swap_cgroup.
- When swp_entry's refcnt goes down to 0, a charge in swap_cgroup
disappears.
7. Page Migration
One of the most complicated functions is page-migration-handler.
Memcg has 2 routines. Assume that we are migrating a page's contents
from OLDPAGE to NEWPAGE.
Usual migration logic is..
(a) remove the page from LRU.
(b) allocate NEWPAGE (migration target)
(c) lock by lock_page().
(d) unmap all mappings.
(e-1) If necessary, replace entry in radix-tree.
(e-2) move contents of a page.
(f) map all mappings again.
(g) pushback the page to LRU.
(-) OLDPAGE will be freed.
Before (g), memcg should complete all necessary charge/uncharge to
NEWPAGE/OLDPAGE.
The point is....
- If OLDPAGE is anonymous, all charges will be dropped at (d) because
try_to_unmap() drops all mapcount and the page will not be
SwapCache.
- If OLDPAGE is SwapCache, charges will be kept at (g) because
__delete_from_swap_cache() isn't called at (e-1)
- If OLDPAGE is page-cache, charges will be kept at (g) because
remove_from_swap_cache() isn't called at (e-1)
memcg provides following hooks.
- mem_cgroup_prepare_migration(OLDPAGE)
Called after (b) to account a charge (usage += PAGE_SIZE) against
memcg which OLDPAGE belongs to.
- mem_cgroup_end_migration(OLDPAGE, NEWPAGE)
Called after (f) before (g).
If OLDPAGE is used, commit OLDPAGE again. If OLDPAGE is already
charged, a charge by prepare_migration() is automatically canceled.
If NEWPAGE is used, commit NEWPAGE and uncharge OLDPAGE.
But zap_pte() (by exit or munmap) can be called while migration,
we have to check if OLDPAGE/NEWPAGE is a valid page after commit().
8. LRU
Each memcg has its own private LRU. Now, it's handling is under global
VM's control (means that it's handled under global zone->lru_lock).
Almost all routines around memcg's LRU is called by global LRU's
list management functions under zone->lru_lock().
A special function is mem_cgroup_isolate_pages(). This scans
memcg's private LRU and call __isolate_lru_page() to extract a page
from LRU.
(By __isolate_lru_page(), the page is removed from both of global and
private LRU.)
9. Typical Tests.
Tests for racy cases.
9.1 Small limit to memcg.
When you do test to do racy case, it's good test to set memcg's limit
to be very small rather than GB. Many races found in the test under
xKB or xxMB limits.
(Memory behavior under GB and Memory behavior under MB shows very
different situation.)
9.2 Shmem
Historically, memcg's shmem handling was poor and we saw some amount
of troubles here. This is because shmem is page-cache but can be
SwapCache. Test with shmem/tmpfs is always good test.
9.3 Migration
For NUMA, migration is an another special case. To do easy test, cpuset
is useful. Following is a sample script to do migration.
mount -t cgroup -o cpuset none /opt/cpuset
mkdir /opt/cpuset/01
echo 1 > /opt/cpuset/01/cpuset.cpus
echo 0 > /opt/cpuset/01/cpuset.mems
echo 1 > /opt/cpuset/01/cpuset.memory_migrate
mkdir /opt/cpuset/02
echo 1 > /opt/cpuset/02/cpuset.cpus
echo 1 > /opt/cpuset/02/cpuset.mems
echo 1 > /opt/cpuset/02/cpuset.memory_migrate
In above set, when you moves a task from 01 to 02, page migration to
node 0 to node 1 will occur. Following is a script to migrate all
under cpuset.
--
move_task()
{
for pid in $1
do
/bin/echo $pid >$2/tasks 2>/dev/null
echo -n $pid
echo -n " "
done
echo END
}
G1_TASK=`cat ${G1}/tasks`
G2_TASK=`cat ${G2}/tasks`
move_task "${G1_TASK}" ${G2} &
--
9.4 Memory hotplug.
memory hotplug test is one of good test.
to offline memory, do following.
# echo offline > /sys/devices/system/memory/memoryXXX/state
(XXX is the place of memory)
This is an easy way to test page migration, too.
9.5 mkdir/rmdir
When using hierarchy, mkdir/rmdir test should be done.
Use tests like the following.
echo 1 >/opt/cgroup/01/memory/use_hierarchy
mkdir /opt/cgroup/01/child_a
mkdir /opt/cgroup/01/child_b
set limit to 01.
add limit to 01/child_b
run jobs under child_a and child_b
create/delete following groups at random while jobs are running.
/opt/cgroup/01/child_a/child_aa
/opt/cgroup/01/child_b/child_bb
/opt/cgroup/01/child_c
running new jobs in new group is also good.
9.6 Mount with other subsystems.
Mounting with other subsystems is a good test because there is a
race and lock dependency with other cgroup subsystems.
example)
# mount -t cgroup none /cgroup -t cpuset,memory,cpu,devices
and do task move, mkdir, rmdir etc...under this.
9.7 swapoff.
Besides management of swap is one of complicated parts of memcg,
call path of swap-in at swapoff is not same as usual swap-in path..
It's worth to be tested explicitly.
For example, test like following is good.
(Shell-A)
# mount -t cgroup none /cgroup -t memory
# mkdir /cgroup/test
# echo 40M > /cgroup/test/memory.limit_in_bytes
# echo 0 > /cgroup/test/tasks
Run malloc(100M) program under this. You'll see 60M of swaps.
(Shell-B)
# move all tasks in /cgroup/test to /cgroup
# /sbin/swapoff -a
# rmdir /cgroup/test
# kill malloc task.
Of course, tmpfs v.s. swapoff test should be tested, too.
9.8 OOM-Killer
Out-of-memory caused by memcg's limit will kill tasks under
the memcg. When hierarchy is used, a task under hierarchy
will be killed by the kernel.
In this case, panic_on_oom shouldn't be invoked and tasks
in other groups shouldn't be killed.
It's not difficult to cause OOM under memcg as following.
Case A) when you can swapoff
#swapoff -a
#echo 50M > /memory.limit_in_bytes
run 51M of malloc
Case B) when you use mem+swap limitation.
#echo 50M > memory.limit_in_bytes
#echo 50M > memory.memsw.limit_in_bytes
run 51M of malloc
此差异已折叠。
此差异已折叠。
......@@ -137,7 +137,7 @@ static void cn_test_timer_func(unsigned long __data)
memcpy(m + 1, data, m->len);
cn_netlink_send(m, 0, gfp_any());
cn_netlink_send(m, 0, GFP_ATOMIC);
kfree(m);
}
......@@ -160,10 +160,8 @@ static int cn_test_init(void)
goto err_out;
}
init_timer(&cn_test_timer);
cn_test_timer.function = cn_test_timer_func;
setup_timer(&cn_test_timer, cn_test_timer_func, 0);
cn_test_timer.expires = jiffies + HZ;
cn_test_timer.data = 0;
add_timer(&cn_test_timer);
return 0;
......
CPU Accounting Controller
-------------------------
The CPU accounting controller is used to group tasks using cgroups and
account the CPU usage of these groups of tasks.
The CPU accounting controller supports multi-hierarchy groups. An accounting
group accumulates the CPU usage of all of its child groups and the tasks
directly present in its group.
Accounting groups can be created by first mounting the cgroup filesystem.
# mkdir /cgroups
# mount -t cgroup -ocpuacct none /cgroups
With the above step, the initial or the parent accounting group
becomes visible at /cgroups. At bootup, this group includes all the
tasks in the system. /cgroups/tasks lists the tasks in this cgroup.
/cgroups/cpuacct.usage gives the CPU time (in nanoseconds) obtained by
this group which is essentially the CPU time obtained by all the tasks
in the system.
New accounting groups can be created under the parent group /cgroups.
# cd /cgroups
# mkdir g1
# echo $$ > g1
The above steps create a new group g1 and move the current shell
process (bash) into it. CPU time consumed by this bash and its children
can be obtained from g1/cpuacct.usage and the same is accumulated in
/cgroups/cpuacct.usage also.
Device Whitelist Controller
1. Description:
Implement a cgroup to track and enforce open and mknod restrictions
on device files. A device cgroup associates a device access
whitelist with each cgroup. A whitelist entry has 4 fields.
'type' is a (all), c (char), or b (block). 'all' means it applies
to all types and all major and minor numbers. Major and minor are
either an integer or * for all. Access is a composition of r
(read), w (write), and m (mknod).
The root device cgroup starts with rwm to 'all'. A child device
cgroup gets a copy of the parent. Administrators can then remove
devices from the whitelist or add new entries. A child cgroup can
never receive a device access which is denied by its parent. However
when a device access is removed from a parent it will not also be
removed from the child(ren).
2. User Interface
An entry is added using devices.allow, and removed using
devices.deny. For instance
echo 'c 1:3 mr' > /cgroups/1/devices.allow
allows cgroup 1 to read and mknod the device usually known as
/dev/null. Doing
echo a > /cgroups/1/devices.deny
will remove the default 'a *:* rwm' entry. Doing
echo a > /cgroups/1/devices.allow
will add the 'a *:* rwm' entry to the whitelist.
3. Security
Any task can move itself between cgroups. This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this. We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD. We may want to just refuse moving to a cgroup which
isn't a descendent of the current one. Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.
CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup. (Again we'll probably want to change that).
A cgroup may not be granted more permissions than the cgroup's
parent has.
此差异已折叠。
此差异已折叠。
......@@ -117,10 +117,28 @@ accessible parameters:
sampling_rate: measured in uS (10^-6 seconds), this is how often you
want the kernel to look at the CPU usage and to make decisions on
what to do about the frequency. Typically this is set to values of
around '10000' or more.
show_sampling_rate_(min|max): the minimum and maximum sampling rates
available that you may set 'sampling_rate' to.
around '10000' or more. It's default value is (cmp. with users-guide.txt):
transition_latency * 1000
The lowest value you can set is:
transition_latency * 100 or it may get restricted to a value where it
makes not sense for the kernel anymore to poll that often which depends
on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000).
Be aware that transition latency is in ns and sampling_rate is in us, so you
get the same sysfs value by default.
Sampling rate should always get adjusted considering the transition latency
To set the sampling rate 750 times as high as the transition latency
in the bash (as said, 1000 is default), do:
echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
>ondemand/sampling_rate
show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
You can use wider ranges now and the general
cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
used to obtain exactly the same info:
show_sampling_rate_min = transtition_latency * 500 / 1000
show_sampling_rate_max = transtition_latency * 500000 / 1000
(divided by 1000 is to illustrate that sampling rate is in us and
transition latency is exported ns).
up_threshold: defines what the average CPU usage between the samplings
of 'sampling_rate' needs to be for the kernel to make a decision on
......
此差异已折叠。
......@@ -18,11 +18,11 @@ For an architecture to support this feature, it must define some of
these macros in include/asm-XXX/topology.h:
#define topology_physical_package_id(cpu)
#define topology_core_id(cpu)
#define topology_thread_siblings(cpu)
#define topology_core_siblings(cpu)
#define topology_thread_cpumask(cpu)
#define topology_core_cpumask(cpu)
The type of **_id is int.
The type of siblings is cpumask_t.
The type of siblings is (const) struct cpumask *.
To be consistent on all architectures, include/linux/topology.h
provides default definitions for any of the above macros that are
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册