提交 169ed55b 编写于 作者: I Ingo Molnar

Merge branch 'tip/perf/jump-label-2' of...

Merge branch 'tip/perf/jump-label-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/urgent

要显示的变更太多。

To preserve performance only 1000 of 1000+ files are displayed.
What: dv1394 (a.k.a. "OHCI-DV I/O support" for FireWire)
Contact: linux1394-devel@lists.sourceforge.net
Description:
New application development should use raw1394 + userspace libraries
instead, notably libiec61883 which is functionally equivalent.
Users:
ffmpeg/libavformat (used by a variety of media players)
dvgrab v1.x (replaced by dvgrab2 on top of raw1394 and resp. libraries)
What: dv1394 (a.k.a. "OHCI-DV I/O support" for FireWire)
Date: May 2010 (scheduled), finally removed in kernel v2.6.37
Contact: linux1394-devel@lists.sourceforge.net
Description:
/dev/dv1394/* were character device files, one for each FireWire
controller and for NTSC and PAL respectively, from which DV data
could be received by read() or transmitted by write(). A few
ioctl()s allowed limited control.
This special-purpose interface has been superseded by libraw1394 +
libiec61883 which are functionally equivalent, support HDV, and
transparently work on top of the newer firewire kernel drivers.
Users:
ffmpeg/libavformat (if configured for DV1394)
What: raw1394 (a.k.a. "Raw IEEE1394 I/O support" for FireWire)
Date: May 2010 (scheduled), finally removed in kernel v2.6.37
Contact: linux1394-devel@lists.sourceforge.net
Description:
/dev/raw1394 was a character device file that allowed low-level
access to FireWire buses. Its major drawbacks were its inability
to implement sensible device security policies, and its low level
of abstraction that required userspace clients do duplicate much
of the kernel's ieee1394 core functionality.
Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of
firewire-core.
Users:
libraw1394 (works with firewire-cdev too, transparent to library ABI
users)
What: legacy isochronous ABI of raw1394 (1st generation iso ABI)
Date: June 2007 (scheduled), removed in kernel v2.6.23
Contact: linux1394-devel@lists.sourceforge.net
Description:
The two request types RAW1394_REQ_ISO_SEND, RAW1394_REQ_ISO_LISTEN have
been deprecated for quite some time. They are very inefficient as they
come with high interrupt load and several layers of callbacks for each
packet. Because of these deficiencies, the video1394 and dv1394 drivers
and the 3rd-generation isochronous ABI in raw1394 (rawiso) were created.
Users:
libraw1394 users via the long deprecated API raw1394_iso_write,
raw1394_start_iso_write, raw1394_start_iso_rcv, raw1394_stop_iso_rcv
libdc1394, which optionally uses these old libraw1394 calls
alternatively to the more efficient video1394 ABI
What: video1394 (a.k.a. "OHCI-1394 Video support" for FireWire)
Date: May 2010 (scheduled), finally removed in kernel v2.6.37
Contact: linux1394-devel@lists.sourceforge.net
Description:
/dev/video1394/* were character device files, one for each FireWire
controller, which were used for isochronous I/O. It was added as an
alternative to raw1394's isochronous I/O functionality which had
performance issues in its first generation. Any video1394 user had
to use raw1394 + libraw1394 too because video1394 did not provide
asynchronous I/O for device discovery and configuration.
Replaced by /dev/fw*, i.e. the <linux/firewire-cdev.h> ABI of
firewire-core.
Users:
libdc1394 (works with firewire-cdev too, transparent to library ABI
users)
What: state
Date: Sep 2010
KernelVersion: 2.6.37
Contact: Vernon Mauery <vernux@us.ibm.com>
Description: The state file allows a means by which to change in and
out of Premium Real-Time Mode (PRTM), as well as the
ability to query the current state.
0 => PRTM off
1 => PRTM enabled
Users: The ibm-prtm userspace daemon uses this interface.
What: version
Date: Sep 2010
KernelVersion: 2.6.37
Contact: Vernon Mauery <vernux@us.ibm.com>
Description: The version file provides a means by which to query
the RTL table version that lives in the Extended
BIOS Data Area (EBDA).
Users: The ibm-prtm userspace daemon uses this interface.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/actual_cpi
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: It is possible to switch the cpi setting of the mouse with the
press of a button.
When read, this file returns the raw number of the actual cpi
setting reported by the mouse. This number has to be further
processed to receive the real dpi value.
VALUE DPI
1 400
2 800
4 1600
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/actual_profile
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: When read, this file returns the number of the actual profile in
range 0-4.
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/firmware_version
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: When read, this file returns the raw integer version number of the
firmware reported by the mouse. Using the integer value eases
further usage in other programs. To receive the real version
number the decimal point has to be shifted 2 positions to the
left. E.g. a returned value of 138 means 1.38
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/profile_settings
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: The mouse can store 5 profiles which can be switched by the
press of a button. A profile is split in settings and buttons.
profile_settings holds informations like resolution, sensitivity
and light effects.
When written, this file lets one write the respective profile
settings back to the mouse. The data has to be 13 bytes long.
The mouse will reject invalid data.
Which profile to write is determined by the profile number
contained in the data.
This file is writeonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/profile[1-5]_settings
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: The mouse can store 5 profiles which can be switched by the
press of a button. A profile is split in settings and buttons.
profile_settings holds informations like resolution, sensitivity
and light effects.
When read, these files return the respective profile settings.
The returned data is 13 bytes in size.
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/profile_buttons
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: The mouse can store 5 profiles which can be switched by the
press of a button. A profile is split in settings and buttons.
profile_buttons holds informations about button layout.
When written, this file lets one write the respective profile
buttons back to the mouse. The data has to be 19 bytes long.
The mouse will reject invalid data.
Which profile to write is determined by the profile number
contained in the data.
This file is writeonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/profile[1-5]_buttons
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: The mouse can store 5 profiles which can be switched by the
press of a button. A profile is split in settings and buttons.
profile_buttons holds informations about button layout.
When read, these files return the respective profile buttons.
The returned data is 19 bytes in size.
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/startup_profile
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: The integer value of this attribute ranges from 0-4.
When read, this attribute returns the number of the profile
that's active when the mouse is powered on.
This file is readonly.
What: /sys/bus/usb/devices/<busnum>-<devnum>:<config num>.<interface num>/settings
Date: August 2010
Contact: Stefan Achatz <erazor_de@users.sourceforge.net>
Description: When read, this file returns the settings stored in the mouse.
The size of the data is 3 bytes and holds information on the
startup_profile.
When written, this file lets write settings back to the mouse.
The data has to be 3 bytes long. The mouse will reject invalid
data.
What: /sys/module/pch_phub/drivers/.../pch_mac
Date: August 2010
KernelVersion: 2.6.35
Contact: masa-korg@dsn.okisemi.com
Description: Write/read GbE MAC address.
What: /sys/module/pch_phub/drivers/.../pch_firmware
Date: August 2010
KernelVersion: 2.6.35
Contact: masa-korg@dsn.okisemi.com
Description: Write/read Option ROM data.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE set PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<set>
<setinfo>
<title>The 802.11 subsystems &ndash; for kernel developers</title>
<subtitle>
Explaining wireless 802.11 networking in the Linux kernel
</subtitle>
<copyright>
<year>2007-2009</year>
<holder>Johannes Berg</holder>
</copyright>
<authorgroup>
<author>
<firstname>Johannes</firstname>
<surname>Berg</surname>
<affiliation>
<address><email>johannes@sipsolutions.net</email></address>
</affiliation>
</author>
</authorgroup>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License version 2 as published by the Free Software Foundation.
</para>
<para>
This documentation is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this documentation; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
<abstract>
<para>
These books attempt to give a description of the
various subsystems that play a role in 802.11 wireless
networking in Linux. Since these books are for kernel
developers they attempts to document the structures
and functions used in the kernel as well as giving a
higher-level overview.
</para>
<para>
The reader is expected to be familiar with the 802.11
standard as published by the IEEE in 802.11-2007 (or
possibly later versions). References to this standard
will be given as "802.11-2007 8.1.5".
</para>
</abstract>
</setinfo>
<book id="cfg80211-developers-guide">
<bookinfo>
<title>The cfg80211 subsystem</title>
<abstract>
!Pinclude/net/cfg80211.h Introduction
</abstract>
</bookinfo>
<chapter>
<title>Device registration</title>
!Pinclude/net/cfg80211.h Device registration
!Finclude/net/cfg80211.h ieee80211_band
!Finclude/net/cfg80211.h ieee80211_channel_flags
!Finclude/net/cfg80211.h ieee80211_channel
!Finclude/net/cfg80211.h ieee80211_rate_flags
!Finclude/net/cfg80211.h ieee80211_rate
!Finclude/net/cfg80211.h ieee80211_sta_ht_cap
!Finclude/net/cfg80211.h ieee80211_supported_band
!Finclude/net/cfg80211.h cfg80211_signal_type
!Finclude/net/cfg80211.h wiphy_params_flags
!Finclude/net/cfg80211.h wiphy_flags
!Finclude/net/cfg80211.h wiphy
!Finclude/net/cfg80211.h wireless_dev
!Finclude/net/cfg80211.h wiphy_new
!Finclude/net/cfg80211.h wiphy_register
!Finclude/net/cfg80211.h wiphy_unregister
!Finclude/net/cfg80211.h wiphy_free
!Finclude/net/cfg80211.h wiphy_name
!Finclude/net/cfg80211.h wiphy_dev
!Finclude/net/cfg80211.h wiphy_priv
!Finclude/net/cfg80211.h priv_to_wiphy
!Finclude/net/cfg80211.h set_wiphy_dev
!Finclude/net/cfg80211.h wdev_priv
</chapter>
<chapter>
<title>Actions and configuration</title>
!Pinclude/net/cfg80211.h Actions and configuration
!Finclude/net/cfg80211.h cfg80211_ops
!Finclude/net/cfg80211.h vif_params
!Finclude/net/cfg80211.h key_params
!Finclude/net/cfg80211.h survey_info_flags
!Finclude/net/cfg80211.h survey_info
!Finclude/net/cfg80211.h beacon_parameters
!Finclude/net/cfg80211.h plink_actions
!Finclude/net/cfg80211.h station_parameters
!Finclude/net/cfg80211.h station_info_flags
!Finclude/net/cfg80211.h rate_info_flags
!Finclude/net/cfg80211.h rate_info
!Finclude/net/cfg80211.h station_info
!Finclude/net/cfg80211.h monitor_flags
!Finclude/net/cfg80211.h mpath_info_flags
!Finclude/net/cfg80211.h mpath_info
!Finclude/net/cfg80211.h bss_parameters
!Finclude/net/cfg80211.h ieee80211_txq_params
!Finclude/net/cfg80211.h cfg80211_crypto_settings
!Finclude/net/cfg80211.h cfg80211_auth_request
!Finclude/net/cfg80211.h cfg80211_assoc_request
!Finclude/net/cfg80211.h cfg80211_deauth_request
!Finclude/net/cfg80211.h cfg80211_disassoc_request
!Finclude/net/cfg80211.h cfg80211_ibss_params
!Finclude/net/cfg80211.h cfg80211_connect_params
!Finclude/net/cfg80211.h cfg80211_pmksa
!Finclude/net/cfg80211.h cfg80211_send_rx_auth
!Finclude/net/cfg80211.h cfg80211_send_auth_timeout
!Finclude/net/cfg80211.h __cfg80211_auth_canceled
!Finclude/net/cfg80211.h cfg80211_send_rx_assoc
!Finclude/net/cfg80211.h cfg80211_send_assoc_timeout
!Finclude/net/cfg80211.h cfg80211_send_deauth
!Finclude/net/cfg80211.h __cfg80211_send_deauth
!Finclude/net/cfg80211.h cfg80211_send_disassoc
!Finclude/net/cfg80211.h __cfg80211_send_disassoc
!Finclude/net/cfg80211.h cfg80211_ibss_joined
!Finclude/net/cfg80211.h cfg80211_connect_result
!Finclude/net/cfg80211.h cfg80211_roamed
!Finclude/net/cfg80211.h cfg80211_disconnected
!Finclude/net/cfg80211.h cfg80211_ready_on_channel
!Finclude/net/cfg80211.h cfg80211_remain_on_channel_expired
!Finclude/net/cfg80211.h cfg80211_new_sta
!Finclude/net/cfg80211.h cfg80211_rx_mgmt
!Finclude/net/cfg80211.h cfg80211_mgmt_tx_status
!Finclude/net/cfg80211.h cfg80211_cqm_rssi_notify
!Finclude/net/cfg80211.h cfg80211_michael_mic_failure
</chapter>
<chapter>
<title>Scanning and BSS list handling</title>
!Pinclude/net/cfg80211.h Scanning and BSS list handling
!Finclude/net/cfg80211.h cfg80211_ssid
!Finclude/net/cfg80211.h cfg80211_scan_request
!Finclude/net/cfg80211.h cfg80211_scan_done
!Finclude/net/cfg80211.h cfg80211_bss
!Finclude/net/cfg80211.h cfg80211_inform_bss_frame
!Finclude/net/cfg80211.h cfg80211_inform_bss
!Finclude/net/cfg80211.h cfg80211_unlink_bss
!Finclude/net/cfg80211.h cfg80211_find_ie
!Finclude/net/cfg80211.h ieee80211_bss_get_ie
</chapter>
<chapter>
<title>Utility functions</title>
!Pinclude/net/cfg80211.h Utility functions
!Finclude/net/cfg80211.h ieee80211_channel_to_frequency
!Finclude/net/cfg80211.h ieee80211_frequency_to_channel
!Finclude/net/cfg80211.h ieee80211_get_channel
!Finclude/net/cfg80211.h ieee80211_get_response_rate
!Finclude/net/cfg80211.h ieee80211_hdrlen
!Finclude/net/cfg80211.h ieee80211_get_hdrlen_from_skb
!Finclude/net/cfg80211.h ieee80211_radiotap_iterator
</chapter>
<chapter>
<title>Data path helpers</title>
!Pinclude/net/cfg80211.h Data path helpers
!Finclude/net/cfg80211.h ieee80211_data_to_8023
!Finclude/net/cfg80211.h ieee80211_data_from_8023
!Finclude/net/cfg80211.h ieee80211_amsdu_to_8023s
!Finclude/net/cfg80211.h cfg80211_classify8021d
</chapter>
<chapter>
<title>Regulatory enforcement infrastructure</title>
!Pinclude/net/cfg80211.h Regulatory enforcement infrastructure
!Finclude/net/cfg80211.h regulatory_hint
!Finclude/net/cfg80211.h wiphy_apply_custom_regulatory
!Finclude/net/cfg80211.h freq_reg_info
</chapter>
<chapter>
<title>RFkill integration</title>
!Pinclude/net/cfg80211.h RFkill integration
!Finclude/net/cfg80211.h wiphy_rfkill_set_hw_state
!Finclude/net/cfg80211.h wiphy_rfkill_start_polling
!Finclude/net/cfg80211.h wiphy_rfkill_stop_polling
</chapter>
<chapter>
<title>Test mode</title>
!Pinclude/net/cfg80211.h Test mode
!Finclude/net/cfg80211.h cfg80211_testmode_alloc_reply_skb
!Finclude/net/cfg80211.h cfg80211_testmode_reply
!Finclude/net/cfg80211.h cfg80211_testmode_alloc_event_skb
!Finclude/net/cfg80211.h cfg80211_testmode_event
</chapter>
</book>
<book id="mac80211-developers-guide">
<bookinfo>
<title>The mac80211 subsystem</title>
<abstract>
!Pinclude/net/mac80211.h Introduction
!Pinclude/net/mac80211.h Warning
</abstract>
</bookinfo>
<toc></toc>
<!--
Generally, this document shall be ordered by increasing complexity.
It is important to note that readers should be able to read only
the first few sections to get a working driver and only advanced
usage should require reading the full document.
-->
<part>
<title>The basic mac80211 driver interface</title>
<partintro>
<para>
You should read and understand the information contained
within this part of the book while implementing a driver.
In some chapters, advanced usage is noted, that may be
skipped at first.
</para>
<para>
This part of the book only covers station and monitor mode
functionality, additional information required to implement
the other modes is covered in the second part of the book.
</para>
</partintro>
<chapter id="basics">
<title>Basic hardware handling</title>
<para>TBD</para>
<para>
This chapter shall contain information on getting a hw
struct allocated and registered with mac80211.
</para>
<para>
Since it is required to allocate rates/modes before registering
a hw struct, this chapter shall also contain information on setting
up the rate/mode structs.
</para>
<para>
Additionally, some discussion about the callbacks and
the general programming model should be in here, including
the definition of ieee80211_ops which will be referred to
a lot.
</para>
<para>
Finally, a discussion of hardware capabilities should be done
with references to other parts of the book.
</para>
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h ieee80211_hw
!Finclude/net/mac80211.h ieee80211_hw_flags
!Finclude/net/mac80211.h SET_IEEE80211_DEV
!Finclude/net/mac80211.h SET_IEEE80211_PERM_ADDR
!Finclude/net/mac80211.h ieee80211_ops
!Finclude/net/mac80211.h ieee80211_alloc_hw
!Finclude/net/mac80211.h ieee80211_register_hw
!Finclude/net/mac80211.h ieee80211_get_tx_led_name
!Finclude/net/mac80211.h ieee80211_get_rx_led_name
!Finclude/net/mac80211.h ieee80211_get_assoc_led_name
!Finclude/net/mac80211.h ieee80211_get_radio_led_name
!Finclude/net/mac80211.h ieee80211_unregister_hw
!Finclude/net/mac80211.h ieee80211_free_hw
</chapter>
<chapter id="phy-handling">
<title>PHY configuration</title>
<para>TBD</para>
<para>
This chapter should describe PHY handling including
start/stop callbacks and the various structures used.
</para>
!Finclude/net/mac80211.h ieee80211_conf
!Finclude/net/mac80211.h ieee80211_conf_flags
</chapter>
<chapter id="iface-handling">
<title>Virtual interfaces</title>
<para>TBD</para>
<para>
This chapter should describe virtual interface basics
that are relevant to the driver (VLANs, MGMT etc are not.)
It should explain the use of the add_iface/remove_iface
callbacks as well as the interface configuration callbacks.
</para>
<para>Things related to AP mode should be discussed there.</para>
<para>
Things related to supporting multiple interfaces should be
in the appropriate chapter, a BIG FAT note should be here about
this though and the recommendation to allow only a single
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_vif
</chapter>
<chapter id="rx-tx">
<title>Receive and transmit processing</title>
<sect1>
<title>what should be here</title>
<para>TBD</para>
<para>
This should describe the receive and transmit
paths in mac80211/the drivers as well as
transmit status handling.
</para>
</sect1>
<sect1>
<title>Frame format</title>
!Pinclude/net/mac80211.h Frame format
</sect1>
<sect1>
<title>Packet alignment</title>
!Pnet/mac80211/rx.c Packet alignment
</sect1>
<sect1>
<title>Calling into mac80211 from interrupts</title>
!Pinclude/net/mac80211.h Calling mac80211 from interrupts
</sect1>
<sect1>
<title>functions/definitions</title>
!Finclude/net/mac80211.h ieee80211_rx_status
!Finclude/net/mac80211.h mac80211_rx_flags
!Finclude/net/mac80211.h ieee80211_tx_info
!Finclude/net/mac80211.h ieee80211_rx
!Finclude/net/mac80211.h ieee80211_rx_irqsafe
!Finclude/net/mac80211.h ieee80211_tx_status
!Finclude/net/mac80211.h ieee80211_tx_status_irqsafe
!Finclude/net/mac80211.h ieee80211_rts_get
!Finclude/net/mac80211.h ieee80211_rts_duration
!Finclude/net/mac80211.h ieee80211_ctstoself_get
!Finclude/net/mac80211.h ieee80211_ctstoself_duration
!Finclude/net/mac80211.h ieee80211_generic_frame_duration
!Finclude/net/mac80211.h ieee80211_wake_queue
!Finclude/net/mac80211.h ieee80211_stop_queue
!Finclude/net/mac80211.h ieee80211_wake_queues
!Finclude/net/mac80211.h ieee80211_stop_queues
</sect1>
</chapter>
<chapter id="filters">
<title>Frame filtering</title>
!Pinclude/net/mac80211.h Frame filtering
!Finclude/net/mac80211.h ieee80211_filter_flags
</chapter>
</part>
<part id="advanced">
<title>Advanced driver interface</title>
<partintro>
<para>
Information contained within this part of the book is
of interest only for advanced interaction of mac80211
with drivers to exploit more hardware capabilities and
improve performance.
</para>
</partintro>
<chapter id="hardware-crypto-offload">
<title>Hardware crypto acceleration</title>
!Pinclude/net/mac80211.h Hardware crypto acceleration
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h set_key_cmd
!Finclude/net/mac80211.h ieee80211_key_conf
!Finclude/net/mac80211.h ieee80211_key_flags
</chapter>
<chapter id="powersave">
<title>Powersave support</title>
!Pinclude/net/mac80211.h Powersave support
</chapter>
<chapter id="beacon-filter">
<title>Beacon filter support</title>
!Pinclude/net/mac80211.h Beacon filter support
!Finclude/net/mac80211.h ieee80211_beacon_loss
</chapter>
<chapter id="qos">
<title>Multiple queues and QoS support</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_tx_queue_params
</chapter>
<chapter id="AP">
<title>Access point mode support</title>
<para>TBD</para>
<para>Some parts of the if_conf should be discussed here instead</para>
<para>
Insert notes about VLAN interfaces with hw crypto here or
in the hw crypto chapter.
</para>
!Finclude/net/mac80211.h ieee80211_get_buffered_bc
!Finclude/net/mac80211.h ieee80211_beacon_get
</chapter>
<chapter id="multi-iface">
<title>Supporting multiple virtual interfaces</title>
<para>TBD</para>
<para>
Note: WDS with identical MAC address should almost always be OK
</para>
<para>
Insert notes about having multiple virtual interfaces with
different MAC addresses here, note which configurations are
supported by mac80211, add notes about supporting hw crypto
with it.
</para>
</chapter>
<chapter id="hardware-scan-offload">
<title>Hardware scan offload</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_scan_completed
</chapter>
</part>
<part id="rate-control">
<title>Rate control interface</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes the rate control algorithm
interface and how it relates to mac80211 and drivers.
</para>
</partintro>
<chapter id="dummy">
<title>dummy chapter</title>
<para>TBD</para>
</chapter>
</part>
<part id="internal">
<title>Internals</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes mac80211 internals.
</para>
</partintro>
<chapter id="key-handling">
<title>Key handling</title>
<sect1>
<title>Key handling basics</title>
!Pnet/mac80211/key.c Key handling basics
</sect1>
<sect1>
<title>MORE TBD</title>
<para>TBD</para>
</sect1>
</chapter>
<chapter id="rx-processing">
<title>Receive processing</title>
<para>TBD</para>
</chapter>
<chapter id="tx-processing">
<title>Transmit processing</title>
<para>TBD</para>
</chapter>
<chapter id="sta-info">
<title>Station info handling</title>
<sect1>
<title>Programming information</title>
!Fnet/mac80211/sta_info.h sta_info
!Fnet/mac80211/sta_info.h ieee80211_sta_info_flags
</sect1>
<sect1>
<title>STA information lifetime rules</title>
!Pnet/mac80211/sta_info.c STA information lifetime rules
</sect1>
</chapter>
<chapter id="synchronisation">
<title>Synchronisation</title>
<para>TBD</para>
<para>Locking, lots of RCU</para>
</chapter>
</part>
</book>
</set>
......@@ -12,7 +12,7 @@ DOCBOOKS := z8530book.xml mcabook.xml device-drivers.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
mac80211.xml debugobjects.xml sh.xml regulator.xml \
80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml \
tracepoint.xml media.xml drm.xml
......
......@@ -51,7 +51,12 @@
<sect1><title>Delaying, scheduling, and timer routines</title>
!Iinclude/linux/sched.h
!Ekernel/sched.c
!Iinclude/linux/completion.h
!Ekernel/timer.c
</sect1>
<sect1><title>Wait queues and Wake events</title>
!Iinclude/linux/wait.h
!Ekernel/wait.c
</sect1>
<sect1><title>High-resolution timers</title>
!Iinclude/linux/ktime.h
......
......@@ -136,6 +136,7 @@
#ifdef CONFIG_COMPAT
.compat_ioctl = i915_compat_ioctl,
#endif
.llseek = noop_llseek,
},
.pci_driver = {
.name = DRIVER_NAME,
......
......@@ -93,6 +93,12 @@ X!Ilib/string.c
!Elib/crc32.c
!Elib/crc-ccitt.c
</sect1>
<sect1 id="idr"><title>idr/ida Functions</title>
!Pinclude/linux/idr.h idr sync
!Plib/idr.c IDA description
!Elib/idr.c
</sect1>
</chapter>
<chapter id="mm">
......@@ -257,7 +263,8 @@ X!Earch/x86/kernel/mca_32.c
!Iblock/blk-sysfs.c
!Eblock/blk-settings.c
!Eblock/blk-exec.c
!Eblock/blk-barrier.c
!Eblock/blk-flush.c
!Eblock/blk-lib.c
!Eblock/blk-tag.c
!Iblock/blk-tag.c
!Eblock/blk-integrity.c
......
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
<book id="mac80211-developers-guide">
<bookinfo>
<title>The mac80211 subsystem for kernel developers</title>
<authorgroup>
<author>
<firstname>Johannes</firstname>
<surname>Berg</surname>
<affiliation>
<address><email>johannes@sipsolutions.net</email></address>
</affiliation>
</author>
</authorgroup>
<copyright>
<year>2007-2009</year>
<holder>Johannes Berg</holder>
</copyright>
<legalnotice>
<para>
This documentation is free software; you can redistribute
it and/or modify it under the terms of the GNU General Public
License version 2 as published by the Free Software Foundation.
</para>
<para>
This documentation is distributed in the hope that it will be
useful, but WITHOUT ANY WARRANTY; without even the implied
warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
</para>
<para>
You should have received a copy of the GNU General Public
License along with this documentation; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
MA 02111-1307 USA
</para>
<para>
For more details see the file COPYING in the source
distribution of Linux.
</para>
</legalnotice>
<abstract>
!Pinclude/net/mac80211.h Introduction
!Pinclude/net/mac80211.h Warning
</abstract>
</bookinfo>
<toc></toc>
<!--
Generally, this document shall be ordered by increasing complexity.
It is important to note that readers should be able to read only
the first few sections to get a working driver and only advanced
usage should require reading the full document.
-->
<part>
<title>The basic mac80211 driver interface</title>
<partintro>
<para>
You should read and understand the information contained
within this part of the book while implementing a driver.
In some chapters, advanced usage is noted, that may be
skipped at first.
</para>
<para>
This part of the book only covers station and monitor mode
functionality, additional information required to implement
the other modes is covered in the second part of the book.
</para>
</partintro>
<chapter id="basics">
<title>Basic hardware handling</title>
<para>TBD</para>
<para>
This chapter shall contain information on getting a hw
struct allocated and registered with mac80211.
</para>
<para>
Since it is required to allocate rates/modes before registering
a hw struct, this chapter shall also contain information on setting
up the rate/mode structs.
</para>
<para>
Additionally, some discussion about the callbacks and
the general programming model should be in here, including
the definition of ieee80211_ops which will be referred to
a lot.
</para>
<para>
Finally, a discussion of hardware capabilities should be done
with references to other parts of the book.
</para>
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h ieee80211_hw
!Finclude/net/mac80211.h ieee80211_hw_flags
!Finclude/net/mac80211.h SET_IEEE80211_DEV
!Finclude/net/mac80211.h SET_IEEE80211_PERM_ADDR
!Finclude/net/mac80211.h ieee80211_ops
!Finclude/net/mac80211.h ieee80211_alloc_hw
!Finclude/net/mac80211.h ieee80211_register_hw
!Finclude/net/mac80211.h ieee80211_get_tx_led_name
!Finclude/net/mac80211.h ieee80211_get_rx_led_name
!Finclude/net/mac80211.h ieee80211_get_assoc_led_name
!Finclude/net/mac80211.h ieee80211_get_radio_led_name
!Finclude/net/mac80211.h ieee80211_unregister_hw
!Finclude/net/mac80211.h ieee80211_free_hw
</chapter>
<chapter id="phy-handling">
<title>PHY configuration</title>
<para>TBD</para>
<para>
This chapter should describe PHY handling including
start/stop callbacks and the various structures used.
</para>
!Finclude/net/mac80211.h ieee80211_conf
!Finclude/net/mac80211.h ieee80211_conf_flags
</chapter>
<chapter id="iface-handling">
<title>Virtual interfaces</title>
<para>TBD</para>
<para>
This chapter should describe virtual interface basics
that are relevant to the driver (VLANs, MGMT etc are not.)
It should explain the use of the add_iface/remove_iface
callbacks as well as the interface configuration callbacks.
</para>
<para>Things related to AP mode should be discussed there.</para>
<para>
Things related to supporting multiple interfaces should be
in the appropriate chapter, a BIG FAT note should be here about
this though and the recommendation to allow only a single
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_vif
</chapter>
<chapter id="rx-tx">
<title>Receive and transmit processing</title>
<sect1>
<title>what should be here</title>
<para>TBD</para>
<para>
This should describe the receive and transmit
paths in mac80211/the drivers as well as
transmit status handling.
</para>
</sect1>
<sect1>
<title>Frame format</title>
!Pinclude/net/mac80211.h Frame format
</sect1>
<sect1>
<title>Packet alignment</title>
!Pnet/mac80211/rx.c Packet alignment
</sect1>
<sect1>
<title>Calling into mac80211 from interrupts</title>
!Pinclude/net/mac80211.h Calling mac80211 from interrupts
</sect1>
<sect1>
<title>functions/definitions</title>
!Finclude/net/mac80211.h ieee80211_rx_status
!Finclude/net/mac80211.h mac80211_rx_flags
!Finclude/net/mac80211.h ieee80211_tx_info
!Finclude/net/mac80211.h ieee80211_rx
!Finclude/net/mac80211.h ieee80211_rx_irqsafe
!Finclude/net/mac80211.h ieee80211_tx_status
!Finclude/net/mac80211.h ieee80211_tx_status_irqsafe
!Finclude/net/mac80211.h ieee80211_rts_get
!Finclude/net/mac80211.h ieee80211_rts_duration
!Finclude/net/mac80211.h ieee80211_ctstoself_get
!Finclude/net/mac80211.h ieee80211_ctstoself_duration
!Finclude/net/mac80211.h ieee80211_generic_frame_duration
!Finclude/net/mac80211.h ieee80211_wake_queue
!Finclude/net/mac80211.h ieee80211_stop_queue
!Finclude/net/mac80211.h ieee80211_wake_queues
!Finclude/net/mac80211.h ieee80211_stop_queues
</sect1>
</chapter>
<chapter id="filters">
<title>Frame filtering</title>
!Pinclude/net/mac80211.h Frame filtering
!Finclude/net/mac80211.h ieee80211_filter_flags
</chapter>
</part>
<part id="advanced">
<title>Advanced driver interface</title>
<partintro>
<para>
Information contained within this part of the book is
of interest only for advanced interaction of mac80211
with drivers to exploit more hardware capabilities and
improve performance.
</para>
</partintro>
<chapter id="hardware-crypto-offload">
<title>Hardware crypto acceleration</title>
!Pinclude/net/mac80211.h Hardware crypto acceleration
<!-- intentionally multiple !F lines to get proper order -->
!Finclude/net/mac80211.h set_key_cmd
!Finclude/net/mac80211.h ieee80211_key_conf
!Finclude/net/mac80211.h ieee80211_key_alg
!Finclude/net/mac80211.h ieee80211_key_flags
</chapter>
<chapter id="powersave">
<title>Powersave support</title>
!Pinclude/net/mac80211.h Powersave support
</chapter>
<chapter id="beacon-filter">
<title>Beacon filter support</title>
!Pinclude/net/mac80211.h Beacon filter support
!Finclude/net/mac80211.h ieee80211_beacon_loss
</chapter>
<chapter id="qos">
<title>Multiple queues and QoS support</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_tx_queue_params
</chapter>
<chapter id="AP">
<title>Access point mode support</title>
<para>TBD</para>
<para>Some parts of the if_conf should be discussed here instead</para>
<para>
Insert notes about VLAN interfaces with hw crypto here or
in the hw crypto chapter.
</para>
!Finclude/net/mac80211.h ieee80211_get_buffered_bc
!Finclude/net/mac80211.h ieee80211_beacon_get
</chapter>
<chapter id="multi-iface">
<title>Supporting multiple virtual interfaces</title>
<para>TBD</para>
<para>
Note: WDS with identical MAC address should almost always be OK
</para>
<para>
Insert notes about having multiple virtual interfaces with
different MAC addresses here, note which configurations are
supported by mac80211, add notes about supporting hw crypto
with it.
</para>
</chapter>
<chapter id="hardware-scan-offload">
<title>Hardware scan offload</title>
<para>TBD</para>
!Finclude/net/mac80211.h ieee80211_scan_completed
</chapter>
</part>
<part id="rate-control">
<title>Rate control interface</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes the rate control algorithm
interface and how it relates to mac80211 and drivers.
</para>
</partintro>
<chapter id="dummy">
<title>dummy chapter</title>
<para>TBD</para>
</chapter>
</part>
<part id="internal">
<title>Internals</title>
<partintro>
<para>TBD</para>
<para>
This part of the book describes mac80211 internals.
</para>
</partintro>
<chapter id="key-handling">
<title>Key handling</title>
<sect1>
<title>Key handling basics</title>
!Pnet/mac80211/key.c Key handling basics
</sect1>
<sect1>
<title>MORE TBD</title>
<para>TBD</para>
</sect1>
</chapter>
<chapter id="rx-processing">
<title>Receive processing</title>
<para>TBD</para>
</chapter>
<chapter id="tx-processing">
<title>Transmit processing</title>
<para>TBD</para>
</chapter>
<chapter id="sta-info">
<title>Station info handling</title>
<sect1>
<title>Programming information</title>
!Fnet/mac80211/sta_info.h sta_info
!Fnet/mac80211/sta_info.h ieee80211_sta_info_flags
</sect1>
<sect1>
<title>STA information lifetime rules</title>
!Pnet/mac80211/sta_info.c STA information lifetime rules
</sect1>
</chapter>
<chapter id="synchronisation">
<title>Synchronisation</title>
<para>TBD</para>
<para>Locking, lots of RCU</para>
</chapter>
</part>
</book>
......@@ -21,6 +21,7 @@
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <signal.h>
#include <linux/genetlink.h>
......@@ -266,11 +267,13 @@ int main(int argc, char *argv[])
int containerset = 0;
char containerpath[1024];
int cfd = 0;
int forking = 0;
sigset_t sigset;
struct msgtemplate msg;
while (1) {
c = getopt(argc, argv, "qdiw:r:m:t:p:vlC:");
while (!forking) {
c = getopt(argc, argv, "qdiw:r:m:t:p:vlC:c:");
if (c < 0)
break;
......@@ -319,6 +322,28 @@ int main(int argc, char *argv[])
err(1, "Invalid pid\n");
cmd_type = TASKSTATS_CMD_ATTR_PID;
break;
case 'c':
/* Block SIGCHLD for sigwait() later */
if (sigemptyset(&sigset) == -1)
err(1, "Failed to empty sigset");
if (sigaddset(&sigset, SIGCHLD))
err(1, "Failed to set sigchld in sigset");
sigprocmask(SIG_BLOCK, &sigset, NULL);
/* fork/exec a child */
tid = fork();
if (tid < 0)
err(1, "Fork failed\n");
if (tid == 0)
if (execvp(argv[optind - 1],
&argv[optind - 1]) < 0)
exit(-1);
/* Set the command type and avoid further processing */
cmd_type = TASKSTATS_CMD_ATTR_PID;
forking = 1;
break;
case 'v':
printf("debug on\n");
dbg = 1;
......@@ -370,6 +395,15 @@ int main(int argc, char *argv[])
goto err;
}
/*
* If we forked a child, wait for it to exit. Cannot use waitpid()
* as all the delicious data would be reaped as part of the wait
*/
if (tid && forking) {
int sig_received;
sigwait(&sigset, &sig_received);
}
if (tid) {
rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
cmd_type, &tid, sizeof(__u32));
......
Freebird-1.1 is produced by Legned(C) ,Inc.
Freebird-1.1 is produced by Legend(C), Inc.
http://web.archive.org/web/*/http://www.legend.com.cn
and software/linux mainatined by Coventive(C),Inc.
and software/linux maintained by Coventive(C), Inc.
(http://www.coventive.com)
Based on the Nicolas's strongarm kernel tree.
......
00-INDEX
- This file
barrier.txt
- I/O Barriers
biodoc.txt
- Notes on the Generic Block Layer Rewrite in Linux 2.5
capability.txt
......@@ -16,3 +14,5 @@ stat.txt
- Block layer statistics in /sys/block/<dev>/stat
switching-sched.txt
- Switching I/O schedulers at runtime
writeback_cache_control.txt
- Control of volatile write back caches
I/O Barriers
============
Tejun Heo <htejun@gmail.com>, July 22 2005
I/O barrier requests are used to guarantee ordering around the barrier
requests. Unless you're crazy enough to use disk drives for
implementing synchronization constructs (wow, sounds interesting...),
the ordering is meaningful only for write requests for things like
journal checkpoints. All requests queued before a barrier request
must be finished (made it to the physical medium) before the barrier
request is started, and all requests queued after the barrier request
must be started only after the barrier request is finished (again,
made it to the physical medium).
In other words, I/O barrier requests have the following two properties.
1. Request ordering
Requests cannot pass the barrier request. Preceding requests are
processed before the barrier and following requests after.
Depending on what features a drive supports, this can be done in one
of the following three ways.
i. For devices which have queue depth greater than 1 (TCQ devices) and
support ordered tags, block layer can just issue the barrier as an
ordered request and the lower level driver, controller and drive
itself are responsible for making sure that the ordering constraint is
met. Most modern SCSI controllers/drives should support this.
NOTE: SCSI ordered tag isn't currently used due to limitation in the
SCSI midlayer, see the following random notes section.
ii. For devices which have queue depth greater than 1 but don't
support ordered tags, block layer ensures that the requests preceding
a barrier request finishes before issuing the barrier request. Also,
it defers requests following the barrier until the barrier request is
finished. Older SCSI controllers/drives and SATA drives fall in this
category.
iii. Devices which have queue depth of 1. This is a degenerate case
of ii. Just keeping issue order suffices. Ancient SCSI
controllers/drives and IDE drives are in this category.
2. Forced flushing to physical medium
Again, if you're not gonna do synchronization with disk drives (dang,
it sounds even more appealing now!), the reason you use I/O barriers
is mainly to protect filesystem integrity when power failure or some
other events abruptly stop the drive from operating and possibly make
the drive lose data in its cache. So, I/O barriers need to guarantee
that requests actually get written to non-volatile medium in order.
There are four cases,
i. No write-back cache. Keeping requests ordered is enough.
ii. Write-back cache but no flush operation. There's no way to
guarantee physical-medium commit order. This kind of devices can't to
I/O barriers.
iii. Write-back cache and flush operation but no FUA (forced unit
access). We need two cache flushes - before and after the barrier
request.
iv. Write-back cache, flush operation and FUA. We still need one
flush to make sure requests preceding a barrier are written to medium,
but post-barrier flush can be avoided by using FUA write on the
barrier itself.
How to support barrier requests in drivers
------------------------------------------
All barrier handling is done inside block layer proper. All low level
drivers have to are implementing its prepare_flush_fn and using one
the following two functions to indicate what barrier type it supports
and how to prepare flush requests. Note that the term 'ordered' is
used to indicate the whole sequence of performing barrier requests
including draining and flushing.
typedef void (prepare_flush_fn)(struct request_queue *q, struct request *rq);
int blk_queue_ordered(struct request_queue *q, unsigned ordered,
prepare_flush_fn *prepare_flush_fn);
@q : the queue in question
@ordered : the ordered mode the driver/device supports
@prepare_flush_fn : this function should prepare @rq such that it
flushes cache to physical medium when executed
For example, SCSI disk driver's prepare_flush_fn looks like the
following.
static void sd_prepare_flush(struct request_queue *q, struct request *rq)
{
memset(rq->cmd, 0, sizeof(rq->cmd));
rq->cmd_type = REQ_TYPE_BLOCK_PC;
rq->timeout = SD_TIMEOUT;
rq->cmd[0] = SYNCHRONIZE_CACHE;
rq->cmd_len = 10;
}
The following seven ordered modes are supported. The following table
shows which mode should be used depending on what features a
device/driver supports. In the leftmost column of table,
QUEUE_ORDERED_ prefix is omitted from the mode names to save space.
The table is followed by description of each mode. Note that in the
descriptions of QUEUE_ORDERED_DRAIN*, '=>' is used whereas '->' is
used for QUEUE_ORDERED_TAG* descriptions. '=>' indicates that the
preceding step must be complete before proceeding to the next step.
'->' indicates that the next step can start as soon as the previous
step is issued.
write-back cache ordered tag flush FUA
-----------------------------------------------------------------------
NONE yes/no N/A no N/A
DRAIN no no N/A N/A
DRAIN_FLUSH yes no yes no
DRAIN_FUA yes no yes yes
TAG no yes N/A N/A
TAG_FLUSH yes yes yes no
TAG_FUA yes yes yes yes
QUEUE_ORDERED_NONE
I/O barriers are not needed and/or supported.
Sequence: N/A
QUEUE_ORDERED_DRAIN
Requests are ordered by draining the request queue and cache
flushing isn't needed.
Sequence: drain => barrier
QUEUE_ORDERED_DRAIN_FLUSH
Requests are ordered by draining the request queue and both
pre-barrier and post-barrier cache flushings are needed.
Sequence: drain => preflush => barrier => postflush
QUEUE_ORDERED_DRAIN_FUA
Requests are ordered by draining the request queue and
pre-barrier cache flushing is needed. By using FUA on barrier
request, post-barrier flushing can be skipped.
Sequence: drain => preflush => barrier
QUEUE_ORDERED_TAG
Requests are ordered by ordered tag and cache flushing isn't
needed.
Sequence: barrier
QUEUE_ORDERED_TAG_FLUSH
Requests are ordered by ordered tag and both pre-barrier and
post-barrier cache flushings are needed.
Sequence: preflush -> barrier -> postflush
QUEUE_ORDERED_TAG_FUA
Requests are ordered by ordered tag and pre-barrier cache
flushing is needed. By using FUA on barrier request,
post-barrier flushing can be skipped.
Sequence: preflush -> barrier
Random notes/caveats
--------------------
* SCSI layer currently can't use TAG ordering even if the drive,
controller and driver support it. The problem is that SCSI midlayer
request dispatch function is not atomic. It releases queue lock and
switch to SCSI host lock during issue and it's possible and likely to
happen in time that requests change their relative positions. Once
this problem is solved, TAG ordering can be enabled.
* Currently, no matter which ordered mode is used, there can be only
one barrier request in progress. All I/O barriers are held off by
block layer until the previous I/O barrier is complete. This doesn't
make any difference for DRAIN ordered devices, but, for TAG ordered
devices with very high command latency, passing multiple I/O barriers
to low level *might* be helpful if they are very frequent. Well, this
certainly is a non-issue. I'm writing this just to make clear that no
two I/O barrier is ever passed to low-level driver.
* Completion order. Requests in ordered sequence are issued in order
but not required to finish in order. Barrier implementation can
handle out-of-order completion of ordered sequence. IOW, the requests
MUST be processed in order but the hardware/software completion paths
are allowed to reorder completion notifications - eg. current SCSI
midlayer doesn't preserve completion order during error handling.
* Requeueing order. Low-level drivers are free to requeue any request
after they removed it from the request queue with
blkdev_dequeue_request(). As barrier sequence should be kept in order
when requeued, generic elevator code takes care of putting requests in
order around barrier. See blk_ordered_req_seq() and
ELEVATOR_INSERT_REQUEUE handling in __elv_add_request() for details.
Note that block drivers must not requeue preceding requests while
completing latter requests in an ordered sequence. Currently, no
error checking is done against this.
* Error handling. Currently, block layer will report error to upper
layer if any of requests in an ordered sequence fails. Unfortunately,
this doesn't seem to be enough. Look at the following request flow.
QUEUE_ORDERED_TAG_FLUSH is in use.
[0] [1] [2] [3] [pre] [barrier] [post] < [4] [5] [6] ... >
still in elevator
Let's say request [2], [3] are write requests to update file system
metadata (journal or whatever) and [barrier] is used to mark that
those updates are valid. Consider the following sequence.
i. Requests [0] ~ [post] leaves the request queue and enters
low-level driver.
ii. After a while, unfortunately, something goes wrong and the
drive fails [2]. Note that any of [0], [1] and [3] could have
completed by this time, but [pre] couldn't have been finished
as the drive must process it in order and it failed before
processing that command.
iii. Error handling kicks in and determines that the error is
unrecoverable and fails [2], and resumes operation.
iv. [pre] [barrier] [post] gets processed.
v. *BOOM* power fails
The problem here is that the barrier request is *supposed* to indicate
that filesystem update requests [2] and [3] made it safely to the
physical medium and, if the machine crashes after the barrier is
written, filesystem recovery code can depend on that. Sadly, that
isn't true in this case anymore. IOW, the success of a I/O barrier
should also be dependent on success of some of the preceding requests,
where only upper layer (filesystem) knows what 'some' is.
This can be solved by implementing a way to tell the block layer which
requests affect the success of the following barrier request and
making lower lever drivers to resume operation on error only after
block layer tells it to do so.
As the probability of this happening is very low and the drive should
be faulty, implementing the fix is probably an overkill. But, still,
it's there.
* In previous drafts of barrier implementation, there was fallback
mechanism such that, if FUA or ordered TAG fails, less fancy ordered
mode can be selected and the failed barrier request is retried
automatically. The rationale for this feature was that as FUA is
pretty new in ATA world and ordered tag was never used widely, there
could be devices which report to support those features but choke when
actually given such requests.
This was removed for two reasons 1. it's an overkill 2. it's
impossible to implement properly when TAG ordering is used as low
level drivers resume after an error automatically. If it's ever
needed adding it back and modifying low level drivers accordingly
shouldn't be difficult.
Explicit volatile write back cache control
=====================================
Introduction
------------
Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before data actually has hit the non-volatile storage. This
behavior obviously speeds up various workloads, but it means the operating
system needs to force data out to the non-volatile storage when it performs
a data integrity operation like fsync, sync or an unmount.
The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.
Explicit cache flushes
----------------------
The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommend to use
the blkdev_issue_flush() helper for a pure cache flush.
Forced Unit Access
-----------------
The REQ_FUA flag can be OR ed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.
Implementation details for filesystems
--------------------------------------
Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
may both be set on a single bio.
Implementation details for make_request_fn based block drivers
--------------------------------------------------------------
These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_FLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_FLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
Implementation details for request_fn based block drivers
--------------------------------------------------------------
For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_FLUSH requests before
entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:
blk_queue_flush(sdkp->disk->queue, REQ_FLUSH);
and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that
REQ_FLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:
blk_queue_flush(sdkp->disk->queue, REQ_FLUSH | REQ_FUA);
and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_FLUSH request after the actual write.
......@@ -8,12 +8,17 @@ both at leaf nodes as well as at intermediate nodes in a storage hierarchy.
Plan is to use the same cgroup based management interface for blkio controller
and based on user options switch IO policies in the background.
In the first phase, this patchset implements proportional weight time based
division of disk policy. It is implemented in CFQ. Hence this policy takes
effect only on leaf nodes when CFQ is being used.
Currently two IO control policies are implemented. First one is proportional
weight time based division of disk policy. It is implemented in CFQ. Hence
this policy takes effect only on leaf nodes when CFQ is being used. The second
one is throttling policy which can be used to specify upper IO rate limits
on devices. This policy is implemented in generic block layer and can be
used on leaf nodes as well as higher level logical devices like device mapper.
HOWTO
=====
Proportional Weight division of bandwidth
-----------------------------------------
You can do a very simple testing of running two dd threads in two different
cgroups. Here is what you can do.
......@@ -55,6 +60,35 @@ cgroups. Here is what you can do.
group dispatched to the disk. We provide fairness in terms of disk time, so
ideally io.disk_time of cgroups should be in proportion to the weight.
Throttling/Upper Limit policy
-----------------------------
- Enable Block IO controller
CONFIG_BLK_CGROUP=y
- Enable throttling in block layer
CONFIG_BLK_DEV_THROTTLING=y
- Mount blkio controller
mount -t cgroup -o blkio none /cgroup/blkio
- Specify a bandwidth rate on particular device for root group. The format
for policy is "<major>:<minor> <byes_per_second>".
echo "8:16 1048576" > /cgroup/blkio/blkio.read_bps_device
Above will put a limit of 1MB/second on reads happening for root group
on device having major/minor number 8:16.
- Run dd to read a file and see if rate is throttled to 1MB/s or not.
# dd if=/mnt/common/zerofile of=/dev/null bs=4K count=1024
# iflag=direct
1024+0 records in
1024+0 records out
4194304 bytes (4.2 MB) copied, 4.0001 s, 1.0 MB/s
Limits for writes can be put using blkio.write_bps_device file.
Various user visible config options
===================================
CONFIG_BLK_CGROUP
......@@ -68,8 +102,13 @@ CONFIG_CFQ_GROUP_IOSCHED
- Enables group scheduling in CFQ. Currently only 1 level of group
creation is allowed.
CONFIG_BLK_DEV_THROTTLING
- Enable block device throttling support in block layer.
Details of cgroup files
=======================
Proportional weight policy files
--------------------------------
- blkio.weight
- Specifies per cgroup weight. This is default weight of the group
on all the devices until and unless overridden by per device rule.
......@@ -210,6 +249,67 @@ Details of cgroup files
and minor number of the device and third field specifies the number
of times a group was dequeued from a particular device.
Throttling/Upper limit policy files
-----------------------------------
- blkio.throttle.read_bps_device
- Specifies upper limit on READ rate from the device. IO rate is
specified in bytes per second. Rules are per deivce. Following is
the format.
echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.read_bps_device
- blkio.throttle.write_bps_device
- Specifies upper limit on WRITE rate to the device. IO rate is
specified in bytes per second. Rules are per deivce. Following is
the format.
echo "<major>:<minor> <rate_bytes_per_second>" > /cgrp/blkio.write_bps_device
- blkio.throttle.read_iops_device
- Specifies upper limit on READ rate from the device. IO rate is
specified in IO per second. Rules are per deivce. Following is
the format.
echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.read_iops_device
- blkio.throttle.write_iops_device
- Specifies upper limit on WRITE rate to the device. IO rate is
specified in io per second. Rules are per deivce. Following is
the format.
echo "<major>:<minor> <rate_io_per_second>" > /cgrp/blkio.write_iops_device
Note: If both BW and IOPS rules are specified for a device, then IO is
subjectd to both the constraints.
- blkio.throttle.io_serviced
- Number of IOs (bio) completed to/from the disk by the group (as
seen by throttling policy). These are further divided by the type
of operation - read or write, sync or async. First two fields specify
the major and minor number of the device, third field specifies the
operation type and the fourth field specifies the number of IOs.
blkio.io_serviced does accounting as seen by CFQ and counts are in
number of requests (struct request). On the other hand,
blkio.throttle.io_serviced counts number of IO in terms of number
of bios as seen by throttling policy. These bios can later be
merged by elevator and total number of requests completed can be
lesser.
- blkio.throttle.io_service_bytes
- Number of bytes transferred to/from the disk by the group. These
are further divided by the type of operation - read or write, sync
or async. First two fields specify the major and minor number of the
device, third field specifies the operation type and the fourth field
specifies the number of bytes.
These numbers should roughly be same as blkio.io_service_bytes as
updated by CFQ. The difference between two is that
blkio.io_service_bytes will not be updated if CFQ is not operating
on request queue.
Common files among various policies
-----------------------------------
- blkio.reset_stats
- Writing an int to this file will result in resetting all the stats
for that cgroup.
......
......@@ -18,7 +18,8 @@ CONTENTS:
1.2 Why are cgroups needed ?
1.3 How are cgroups implemented ?
1.4 What does notify_on_release do ?
1.5 How do I use cgroups ?
1.5 What does clone_children do ?
1.6 How do I use cgroups ?
2. Usage Examples and Syntax
2.1 Basic Usage
2.2 Attaching processes
......@@ -293,7 +294,16 @@ notify_on_release in the root cgroup at system boot is disabled
value of their parents notify_on_release setting. The default value of
a cgroup hierarchy's release_agent path is empty.
1.5 How do I use cgroups ?
1.5 What does clone_children do ?
---------------------------------
If the clone_children flag is enabled (1) in a cgroup, then all
cgroups created beneath will call the post_clone callbacks for each
subsystem of the newly created cgroup. Usually when this callback is
implemented for a subsystem, it copies the values of the parent
subsystem, this is the case for the cpuset.
1.6 How do I use cgroups ?
--------------------------
To start a new job that is to be contained within a cgroup, using
......
......@@ -239,6 +239,7 @@ Your cooperation is appreciated.
0 = /dev/tty Current TTY device
1 = /dev/console System console
2 = /dev/ptmx PTY master multiplex
3 = /dev/ttyprintk User messages via printk TTY device
64 = /dev/cua0 Callout device for ttyS0
...
255 = /dev/cua191 Callout device for ttyS191
......@@ -2553,7 +2554,10 @@ Your cooperation is appreciated.
175 = /dev/usb/legousbtower15 16th USB Legotower device
176 = /dev/usb/usbtmc1 First USB TMC device
...
192 = /dev/usb/usbtmc16 16th USB TMC device
191 = /dev/usb/usbtmc16 16th USB TMC device
192 = /dev/usb/yurex1 First USB Yurex device
...
209 = /dev/usb/yurex16 16th USB Yurex device
240 = /dev/usb/dabusb0 First daubusb device
...
243 = /dev/usb/dabusb3 Fourth dabusb device
......
......@@ -24,7 +24,7 @@ Dynamic debug has even more useful features:
read to display the complete list of known debug statements, to help guide you
Controlling dynamic debug Behaviour
===============================
===================================
The behaviour of pr_debug()/dev_debug()s are controlled via writing to a
control file in the 'debugfs' filesystem. Thus, you must first mount the debugfs
......@@ -212,6 +212,26 @@ Note the regexp ^[-+=][scp]+$ matches a flags specification.
Note also that there is no convenient syntax to remove all
the flags at once, you need to use "-psc".
Debug messages during boot process
==================================
To be able to activate debug messages during the boot process,
even before userspace and debugfs exists, use the boot parameter:
ddebug_query="QUERY"
QUERY follows the syntax described above, but must not exceed 1023
characters. The enablement of debug messages is done as an arch_initcall.
Thus you can enable debug messages in all code processed after this
arch_initcall via this boot parameter.
On an x86 system for example ACPI enablement is a subsys_initcall and
ddebug_query="file ec.c +p"
will show early Embedded Controller transactions during ACPI setup if
your machine (typically a laptop) has an Embedded Controller.
PCI (or other devices) initialization also is a hot candidate for using
this boot parameter for debugging purposes.
Examples
========
......
......@@ -197,6 +197,54 @@ Notes:
example,
# fbset -depth 16
[Configure viafb via /proc]
---------------------------
The following files exist in /proc/viafb
supported_output_devices
This read-only file contains a full ',' seperated list containing all
output devices that could be available on your platform. It is likely
that not all of those have a connector on your hardware but it should
provide a good starting point to figure out which of those names match
a real connector.
Example:
# cat /proc/viafb/supported_output_devices
iga1/output_devices
iga2/output_devices
These two files are readable and writable. iga1 and iga2 are the two
independent units that produce the screen image. Those images can be
forwarded to one or more output devices. Reading those files is a way
to query which output devices are currently used by an iga.
Example:
# cat /proc/viafb/iga1/output_devices
If there are no output devices printed the output of this iga is lost.
This can happen for example if only one (the other) iga is used.
Writing to these files allows adjusting the output devices during
runtime. One can add new devices, remove existing ones or switch
between igas. Essentially you can write a ',' seperated list of device
names (or a single one) in the same format as the output to those
files. You can add a '+' or '-' as a prefix allowing simple addition
and removal of devices. So a prefix '+' adds the devices from your list
to the already existing ones, '-' removes the listed devices from the
existing ones and if no prefix is given it replaces all existing ones
with the listed ones. If you remove devices they are expected to turn
off. If you add devices that are already part of the other iga they are
removed there and added to the new one.
Examples:
Add CRT as output device to iga1
# echo +CRT > /proc/viafb/iga1/output_devices
Remove (turn off) DVP1 and LVDS1 as output devices of iga2
# echo -DVP1,LVDS1 > /proc/viafb/iga2/output_devices
Replace all iga1 output devices by CRT
# echo CRT > /proc/viafb/iga1/output_devices
[Bootup with viafb]:
--------------------
Add the following line to your grub.conf:
......
......@@ -502,16 +502,6 @@ Who: Thomas Gleixner <tglx@linutronix.de>
----------------------------
What: old ieee1394 subsystem (CONFIG_IEEE1394)
When: 2.6.37
Files: drivers/ieee1394/ except init_ohci1394_dma.c
Why: superseded by drivers/firewire/ (CONFIG_FIREWIRE) which offers more
features, better performance, and better security, all with smaller
and more modern code base
Who: Stefan Richter <stefanr@s5r6.in-berlin.de>
----------------------------
What: The acpi_sleep=s4_nonvs command line option
When: 2.6.37
Files: arch/x86/kernel/acpi/sleep.c
......@@ -536,3 +526,39 @@ Who: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
----------------------------
What: namespace cgroup (ns_cgroup)
When: 2.6.38
Why: The ns_cgroup leads to some problems:
* cgroup creation is out-of-control
* cgroup name can conflict when pids are looping
* it is not possible to have a single process handling
a lot of namespaces without falling in a exponential creation time
* we may want to create a namespace without creating a cgroup
The ns_cgroup is replaced by a compatibility flag 'clone_children',
where a newly created cgroup will copy the parent cgroup values.
The userspace has to manually create a cgroup and add a task to
the 'tasks' file.
Who: Daniel Lezcano <daniel.lezcano@free.fr>
----------------------------
What: iwlwifi disable_hw_scan module parameters
When: 2.6.40
Why: Hareware scan is the prefer method for iwlwifi devices for
scanning operation. Remove software scan support for all the
iwlwifi devices.
Who: Wey-Yi Guy <wey-yi.w.guy@intel.com>
----------------------------
What: access to nfsd auth cache through sys_nfsservctl or '.' files
in the 'nfsd' filesystem.
When: 2.6.40
Why: This is a legacy interface which have been replaced by a more
dynamic cache. Continuing to maintain this interface is an
unnecessary burden.
Who: NeilBrown <neilb@suse.de>
----------------------------
......@@ -349,21 +349,36 @@ call this method upon the IO completion.
--------------------------- block_device_operations -----------------------
prototypes:
int (*open) (struct inode *, struct file *);
int (*release) (struct inode *, struct file *);
int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
int (*open) (struct block_device *, fmode_t);
int (*release) (struct gendisk *, fmode_t);
int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
int (*direct_access) (struct block_device *, sector_t, void **, unsigned long *);
int (*media_changed) (struct gendisk *);
void (*unlock_native_capacity) (struct gendisk *);
int (*revalidate_disk) (struct gendisk *);
int (*getgeo)(struct block_device *, struct hd_geometry *);
void (*swap_slot_free_notify) (struct block_device *, unsigned long);
locking rules:
BKL bd_sem
open: yes yes
release: yes yes
ioctl: yes no
BKL bd_mutex
open: no yes
release: no yes
ioctl: no no
compat_ioctl: no no
direct_access: no no
media_changed: no no
unlock_native_capacity: no no
revalidate_disk: no no
getgeo: no no
swap_slot_free_notify: no no (see below)
media_changed, unlock_native_capacity and revalidate_disk are called only from
check_disk_change().
swap_slot_free_notify is called with swap_lock and sometimes the page lock
held.
The last two are called only from check_disk_change().
--------------------------- file_operations -------------------------------
prototypes:
......
......@@ -12,5 +12,9 @@ nfs-rdma.txt
- how to install and setup the Linux NFS/RDMA client and server software
nfsroot.txt
- short guide on setting up a diskless box with NFS root filesystem.
pnfs.txt
- short explanation of some of the internals of the pnfs client code
rpc-cache.txt
- introduction to the caching mechanisms in the sunrpc layer.
idmapper.txt
- information for configuring request-keys to be used by idmapper
=========
ID Mapper
=========
Id mapper is used by NFS to translate user and group ids into names, and to
translate user and group names into ids. Part of this translation involves
performing an upcall to userspace to request the information. Id mapper will
user request-key to perform this upcall and cache the result. The program
/usr/sbin/nfs.idmap should be called by request-key, and will perform the
translation and initialize a key with the resulting information.
NFS_USE_NEW_IDMAPPER must be selected when configuring the kernel to use this
feature.
===========
Configuring
===========
The file /etc/request-key.conf will need to be modified so /sbin/request-key can
direct the upcall. The following line should be added:
#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ======= =============== =============== ===============================
create id_resolver * * /usr/sbin/nfs.idmap %k %d 600
This will direct all id_resolver requests to the program /usr/sbin/nfs.idmap.
The last parameter, 600, defines how many seconds into the future the key will
expire. This parameter is optional for /usr/sbin/nfs.idmap. When the timeout
is not specified, nfs.idmap will default to 600 seconds.
id mapper uses for key descriptions:
uid: Find the UID for the given user
gid: Find the GID for the given group
user: Find the user name for the given UID
group: Find the group name for the given GID
You can handle any of these individually, rather than using the generic upcall
program. If you would like to use your own program for a uid lookup then you
would edit your request-key.conf so it look similar to this:
#OP TYPE DESCRIPTION CALLOUT INFO PROGRAM ARG1 ARG2 ARG3 ...
#====== ======= =============== =============== ===============================
create id_resolver uid:* * /some/other/program %k %d 600
create id_resolver * * /usr/sbin/nfs.idmap %k %d 600
Notice that the new line was added above the line for the generic program.
request-key will find the first matching line and corresponding program. In
this case, /some/other/program will handle all uid lookups and
/usr/sbin/nfs.idmap will handle gid, user, and group lookups.
See <file:Documentation/keys-request-keys.txt> for more information about the
request-key function.
=========
nfs.idmap
=========
nfs.idmap is designed to be called by request-key, and should not be run "by
hand". This program takes two arguments, a serialized key and a key
description. The serialized key is first converted into a key_serial_t, and
then passed as an argument to keyctl_instantiate (both are part of keyutils.h).
The actual lookups are performed by functions found in nfsidmap.h. nfs.idmap
determines the correct function to call by looking at the first part of the
description string. For example, a uid lookup description will appear as
"uid:user@domain".
nfs.idmap will return 0 if the key was instantiated, and non-zero otherwise.
......@@ -159,6 +159,28 @@ ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
Default: any
nfsrootdebug
This parameter enables debugging messages to appear in the kernel
log at boot time so that administrators can verify that the correct
NFS mount options, server address, and root path are passed to the
NFS client.
rdinit=<executable file>
To specify which file contains the program that starts system
initialization, administrators can use this command line parameter.
The default value of this parameter is "/init". If the specified
file exists and the kernel can execute it, root filesystem related
kernel command line parameters, including `nfsroot=', are ignored.
A description of the process of mounting the root file system can be
found in:
Documentation/early-userspace/README
3.) Boot Loader
......
Reference counting in pnfs:
==========================
The are several inter-related caches. We have layouts which can
reference multiple devices, each of which can reference multiple data servers.
Each data server can be referenced by multiple devices. Each device
can be referenced by multiple layouts. To keep all of this straight,
we need to reference count.
struct pnfs_layout_hdr
----------------------
The on-the-wire command LAYOUTGET corresponds to struct
pnfs_layout_segment, usually referred to by the variable name lseg.
Each nfs_inode may hold a pointer to a cache of of these layout
segments in nfsi->layout, of type struct pnfs_layout_hdr.
We reference the header for the inode pointing to it, across each
outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN,
LAYOUTCOMMIT), and for each lseg held within.
Each header is also (when non-empty) put on a list associated with
struct nfs_client (cl_layouts). Being put on this list does not bump
the reference count, as the layout is kept around by the lseg that
keeps it in the list.
deviceid_cache
--------------
lsegs reference device ids, which are resolved per nfs_client and
layout driver type. The device ids are held in a RCU cache (struct
nfs4_deviceid_cache). The cache itself is referenced across each
mount. The entries (struct nfs4_deviceid) themselves are held across
the lifetime of each lseg referencing them.
RCU is used because the deviceid is basically a write once, read many
data structure. The hlist size of 32 buckets needs better
justification, but seems reasonable given that we can have multiple
deviceid's per filesystem, and multiple filesystems per nfs_client.
The hash code is copied from the nfsd code base. A discussion of
hashing and variations of this algorithm can be found at:
http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809
data server cache
-----------------
file driver devices refer to data servers, which are kept in a module
level cache. Its reference is held over the lifetime of the deviceid
pointing to it.
......@@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
statm Process memory status information
status Process status in human readable form
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
pagemap Page table
stack Report full stack trace, enable via CONFIG_STACKTRACE
smaps a extension based on maps, showing the memory consumption of
each mapping
......@@ -370,17 +371,24 @@ Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 892 kB
Anonymous: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
the amount of the mapping that is currently resident in RAM, the "proportional
set size” (divide each shared page by the number of processes sharing it), the
number of clean and dirty shared pages in the mapping, and the number of clean
and dirty private pages in the mapping. The "Referenced" indicates the amount
of memory currently marked as referenced or accessed.
The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping
(size), the amount of the mapping that is currently resident in RAM (RSS), the
process' proportional share of this mapping (PSS), the number of clean and
dirty private pages in the mapping. Note that even a page which is part of a
MAP_SHARED mapping, but has only a single pte mapped, i.e. is currently used
by only one process, is accounted as private and not as shared. "Referenced"
indicates the amount of memory currently marked as referenced or accessed.
"Anonymous" shows the amount of memory that does not belong to any file. Even
a mapping associated with a file may contain anonymous pages: when MAP_PRIVATE
and a page is modified, the file page is replaced by a private anonymous copy.
"Swap" shows how much would-be-anonymous memory is also used, but out on
swap.
This file is only present if the CONFIG_MMU kernel configuration option is
enabled.
......@@ -397,6 +405,9 @@ To clear the bits for the file mapped pages associated with the process
> echo 3 > /proc/PID/clear_refs
Any other value written to /proc/PID/clear_refs will have no effect.
The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
using /proc/kpageflags and number of times a page is mapped using
/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt.
1.2 Kernel data
---------------
......
......@@ -62,10 +62,10 @@ replicas continue to be exactly same.
# mount /dev/sd0 /tmp/a
#ls /tmp/a
t1 t2 t2
t1 t2 t3
#ls /mnt/a
t1 t2 t2
t1 t2 t3
Note that the mount has propagated to the mount at /mnt as well.
......
Kernel driver ltc4261
=====================
Supported chips:
* Linear Technology LTC4261
Prefix: 'ltc4261'
Addresses scanned: -
Datasheet:
http://cds.linear.com/docs/Datasheet/42612fb.pdf
Author: Guenter Roeck <guenter.roeck@ericsson.com>
Description
-----------
The LTC4261/LTC4261-2 negative voltage Hot Swap controllers allow a board
to be safely inserted and removed from a live backplane.
Usage Notes
-----------
This driver does not probe for LTC4261 devices, since there is no register
which can be safely used to identify the chip. You will have to instantiate
the devices explicitly.
Example: the following will load the driver for an LTC4261 at address 0x10
on I2C bus #1:
$ modprobe ltc4261
$ echo ltc4261 0x10 > /sys/bus/i2c/devices/i2c-1/new_device
Sysfs entries
-------------
Voltage readings provided by this driver are reported as obtained from the ADC
registers. If a set of voltage divider resistors is installed, calculate the
real voltage by multiplying the reported value with (R1+R2)/R2, where R1 is the
value of the divider resistor against the measured voltage and R2 is the value
of the divider resistor against Ground.
Current reading provided by this driver is reported as obtained from the ADC
Current Sense register. The reported value assumes that a 1 mOhm sense resistor
is installed. If a different sense resistor is installed, calculate the real
current by dividing the reported value by the sense resistor value in mOhm.
The chip has two voltage sensors, but only one set of voltage alarm status bits.
In many many designs, those alarms are associated with the ADIN2 sensor, due to
the proximity of the ADIN2 pin to the OV pin. ADIN2 is, however, not available
on all chip variants. To ensure that the alarm condition is reported to the user,
report it with both voltage sensors.
in1_input ADIN2 voltage (mV)
in1_min_alarm ADIN/ADIN2 Undervoltage alarm
in1_max_alarm ADIN/ADIN2 Overvoltage alarm
in2_input ADIN voltage (mV)
in2_min_alarm ADIN/ADIN2 Undervoltage alarm
in2_max_alarm ADIN/ADIN2 Overvoltage alarm
curr1_input SENSE current (mA)
curr1_alarm SENSE overcurrent alarm
N-Trig touchscreen Driver
-------------------------
Copyright (c) 2008-2010 Rafi Rubin <rafi@seas.upenn.edu>
Copyright (c) 2009-2010 Stephane Chatty
This driver provides support for N-Trig pen and multi-touch sensors. Single
and multi-touch events are translated to the appropriate protocols for
the hid and input systems. Pen events are sufficiently hid compliant and
are left to the hid core. The driver also provides additional filtering
and utility functions accessible with sysfs and module parameters.
This driver has been reported to work properly with multiple N-Trig devices
attached.
Parameters
----------
Note: values set at load time are global and will apply to all applicable
devices. Adjusting parameters with sysfs will override the load time values,
but only for that one device.
The following parameters are used to configure filters to reduce noise:
activate_slack number of fingers to ignore before processing events
activation_height size threshold to activate immediately
activation_width
min_height size threshold bellow which fingers are ignored
min_width both to decide activation and during activity
deactivate_slack the number of "no contact" frames to ignore before
propagating the end of activity events
When the last finger is removed from the device, it sends a number of empty
frames. By holding off on deactivation for a few frames we can tolerate false
erroneous disconnects, where the sensor may mistakenly not detect a finger that
is still present. Thus deactivate_slack addresses problems where a users might
see breaks in lines during drawing, or drop an object during a long drag.
Additional sysfs items
----------------------
These nodes just provide easy access to the ranges reported by the device.
sensor_logical_height the range for positions reported during activity
sensor_logical_width
sensor_physical_height internal ranges not used for normal events but
sensor_physical_width useful for tuning
All N-Trig devices with product id of 1 report events in the ranges of
X: 0-9600
Y: 0-7200
However not all of these devices have the same physical dimensions. Most
seem to be 12" sensors (Dell Latitude XT and XT2 and the HP TX2), and
at least one model (Dell Studio 17) has a 17" sensor. The ratio of physical
to logical sizes is used to adjust the size based filter parameters.
Filtering
---------
With the release of the early multi-touch firmwares it became increasingly
obvious that these sensors were prone to erroneous events. Users reported
seeing both inappropriately dropped contact and ghosts, contacts reported
where no finger was actually touching the screen.
Deactivation slack helps prevent dropped contact for single touch use, but does
not address the problem of dropping one of more contacts while other contacts
are still active. Drops in the multi-touch context require additional
processing and should be handled in tandem with tacking.
As observed ghost contacts are similar to actual use of the sensor, but they
seem to have different profiles. Ghost activity typically shows up as small
short lived touches. As such, I assume that the longer the continuous stream
of events the more likely those events are from a real contact, and that the
larger the size of each contact the more likely it is real. Balancing the
goals of preventing ghosts and accepting real events quickly (to minimize
user observable latency), the filter accumulates confidence for incoming
events until it hits thresholds and begins propagating. In the interest in
minimizing stored state as well as the cost of operations to make a decision,
I've kept that decision simple.
Time is measured in terms of the number of fingers reported, not frames since
the probability of multiple simultaneous ghosts is expected to drop off
dramatically with increasing numbers. Rather than accumulate weight as a
function of size, I just use it as a binary threshold. A sufficiently large
contact immediately overrides the waiting period and leads to activation.
Setting the activation size thresholds to large values will result in deciding
primarily on activation slack. If you see longer lived ghosts, turning up the
activation slack while reducing the size thresholds may suffice to eliminate
the ghosts while keeping the screen quite responsive to firm taps.
Contacts continue to be filtered with min_height and min_width even after
the initial activation filter is satisfied. The intent is to provide
a mechanism for filtering out ghosts in the form of an extra finger while
you actually are using the screen. In practice this sort of ghost has
been far less problematic or relatively rare and I've left the defaults
set to 0 for both parameters, effectively turning off that filter.
I don't know what the optimal values are for these filters. If the defaults
don't work for you, please play with the parameters. If you do find other
values more comfortable, I would appreciate feedback.
The calibration of these devices does drift over time. If ghosts or contact
dropping worsen and interfere with the normal usage of your device, try
recalibrating it.
Calibration
-----------
The N-Trig windows tools provide calibration and testing routines. Also an
unofficial unsupported set of user space tools including a calibrator is
available at:
http://code.launchpad.net/~rafi-seas/+junk/ntrig_calib
Tracking
--------
As of yet, all tested N-Trig firmwares do not track fingers. When multiple
contacts are active they seem to be sorted primarily by Y position.
......@@ -43,10 +43,11 @@ parameter is applicable:
AVR32 AVR32 architecture is enabled.
AX25 Appropriate AX.25 support is enabled.
BLACKFIN Blackfin architecture is enabled.
DRM Direct Rendering Management support is enabled.
EDD BIOS Enhanced Disk Drive Services (EDD) is enabled
EFI EFI Partitioning (GPT) is enabled
EIDE EIDE/ATAPI support is enabled.
DRM Direct Rendering Management support is enabled.
DYNAMIC_DEBUG Build in debug messages and enable them at runtime
FB The frame buffer device is enabled.
GCOV GCOV profiling is enabled.
HW Appropriate hardware is enabled.
......@@ -570,6 +571,10 @@ and is between 256 and 4096 characters. It is defined in the file
Format: <port#>,<type>
See also Documentation/input/joystick-parport.txt
ddebug_query= [KNL,DYNAMIC_DEBUG] Enable debug messages at early boot
time. See Documentation/dynamic-debug-howto.txt for
details.
debug [KNL] Enable kernel debugging (events log level).
debug_locks_verbose=
......@@ -1126,9 +1131,13 @@ and is between 256 and 4096 characters. It is defined in the file
kvm.oos_shadow= [KVM] Disable out-of-sync shadow paging.
Default is 1 (enabled)
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
kvm.mmu_audit= [KVM] This is a R/W parameter which allows audit
KVM MMU at runtime.
Default is 0 (off)
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)
kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
for all guests.
Default is 1 (enabled) if in 64bit or 32bit-PAE mode
......@@ -1532,12 +1541,15 @@ and is between 256 and 4096 characters. It is defined in the file
1 to enable accounting
Default value is 0.
nfsaddrs= [NFS]
nfsaddrs= [NFS] Deprecated. Use ip= instead.
See Documentation/filesystems/nfs/nfsroot.txt.
nfsroot= [NFS] nfs root filesystem for disk-less boxes.
See Documentation/filesystems/nfs/nfsroot.txt.
nfsrootdebug [NFS] enable nfsroot debugging messages.
See Documentation/filesystems/nfs/nfsroot.txt.
nfs.callback_tcpport=
[NFS] set the TCP port on which the NFSv4 callback
channel should listen.
......@@ -1693,6 +1705,8 @@ and is between 256 and 4096 characters. It is defined in the file
nojitter [IA64] Disables jitter checking for ITC timers.
no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
nolapic [X86-32,APIC] Do not enable or use the local APIC.
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
......@@ -1713,7 +1727,7 @@ and is between 256 and 4096 characters. It is defined in the file
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space
noreplace-paravirt [X86-32,PV_OPS] Don't patch paravirt_ops
noreplace-paravirt [X86,IA-64,PV_OPS] Don't patch paravirt_ops
noreplace-smp [X86-32,SMP] Don't replace SMP instructions
with UP alternatives
......@@ -2370,6 +2384,15 @@ and is between 256 and 4096 characters. It is defined in the file
switches= [HW,M68k]
sysfs.deprecated=0|1 [KNL]
Enable/disable old style sysfs layout for old udev
on older distributions. When this option is enabled
very new udev will not work anymore. When this option
is disabled (or CONFIG_SYSFS_DEPRECATED not compiled)
in older udev will not work anymore.
Default depends on CONFIG_SYSFS_DEPRECATED_V2 set in
the kernel configuration.
sysrq_always_enabled
[KNL]
Ignore sysrq setting - this boot parameter will
......@@ -2418,7 +2441,7 @@ and is between 256 and 4096 characters. It is defined in the file
topology informations if the hardware supports these.
The scheduler will make use of these informations and
e.g. base its process migration decisions on it.
Default is off.
Default is on.
tp720= [HW,PS2]
......
......@@ -320,13 +320,13 @@ struct kvm_translation {
4.15 KVM_INTERRUPT
Capability: basic
Architectures: x86
Architectures: x86, ppc
Type: vcpu ioctl
Parameters: struct kvm_interrupt (in)
Returns: 0 on success, -1 on error
Queues a hardware interrupt vector to be injected. This is only
useful if in-kernel local APIC is not used.
useful if in-kernel local APIC or equivalent is not used.
/* for KVM_INTERRUPT */
struct kvm_interrupt {
......@@ -334,8 +334,37 @@ struct kvm_interrupt {
__u32 irq;
};
X86:
Note 'irq' is an interrupt vector, not an interrupt pin or line.
PPC:
Queues an external interrupt to be injected. This ioctl is overleaded
with 3 different irq values:
a) KVM_INTERRUPT_SET
This injects an edge type external interrupt into the guest once it's ready
to receive interrupts. When injected, the interrupt is done.
b) KVM_INTERRUPT_UNSET
This unsets any pending interrupt.
Only available with KVM_CAP_PPC_UNSET_IRQ.
c) KVM_INTERRUPT_SET_LEVEL
This injects a level type external interrupt into the guest context. The
interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
is triggered.
Only available with KVM_CAP_PPC_IRQ_LEVEL.
Note that any value for 'irq' other than the ones stated above is invalid
and incurs unexpected behavior.
4.16 KVM_DEBUG_GUEST
Capability: basic
......@@ -1013,8 +1042,9 @@ number is just right, the 'nent' field is adjusted to the number of valid
entries in the 'entries' array, which is then filled.
The entries returned are the host cpuid as returned by the cpuid instruction,
with unknown or unsupported features masked out. The fields in each entry
are defined as follows:
with unknown or unsupported features masked out. Some features (for example,
x2apic), may not be present in the host cpu, but are exposed by kvm if it can
emulate them efficiently. The fields in each entry are defined as follows:
function: the eax value used to obtain the entry
index: the ecx value used to obtain the entry (for entries that are
......@@ -1032,6 +1062,29 @@ are defined as follows:
eax, ebx, ecx, edx: the values returned by the cpuid instruction for
this function/index combination
4.46 KVM_PPC_GET_PVINFO
Capability: KVM_CAP_PPC_GET_PVINFO
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_ppc_pvinfo (out)
Returns: 0 on success, !0 on error
struct kvm_ppc_pvinfo {
__u32 flags;
__u32 hcall[4];
__u8 pad[108];
};
This ioctl fetches PV specific information that need to be passed to the guest
using the device tree or other means from vm context.
For now the only implemented piece of information distributed here is an array
of 4 instructions that make up a hypercall.
If any additional field gets added to this structure later on, a bit for that
additional piece of information will be set in the flags bitmap.
5. The kvm_run structure
Application code obtains a pointer to the kvm_run structure by
......
The PPC KVM paravirtual interface
=================================
The basic execution principle by which KVM on PowerPC works is to run all kernel
space code in PR=1 which is user space. This way we trap all privileged
instructions and can emulate them accordingly.
Unfortunately that is also the downfall. There are quite some privileged
instructions that needlessly return us to the hypervisor even though they
could be handled differently.
This is what the PPC PV interface helps with. It takes privileged instructions
and transforms them into unprivileged ones with some help from the hypervisor.
This cuts down virtualization costs by about 50% on some of my benchmarks.
The code for that interface can be found in arch/powerpc/kernel/kvm*
Querying for existence
======================
To find out if we're running on KVM or not, we leverage the device tree. When
Linux is running on KVM, a node /hypervisor exists. That node contains a
compatible property with the value "linux,kvm".
Once you determined you're running under a PV capable KVM, you can now use
hypercalls as described below.
KVM hypercalls
==============
Inside the device tree's /hypervisor node there's a property called
'hypercall-instructions'. This property contains at most 4 opcodes that make
up the hypercall. To call a hypercall, just call these instructions.
The parameters are as follows:
Register IN OUT
r0 - volatile
r3 1st parameter Return code
r4 2nd parameter 1st output value
r5 3rd parameter 2nd output value
r6 4th parameter 3rd output value
r7 5th parameter 4th output value
r8 6th parameter 5th output value
r9 7th parameter 6th output value
r10 8th parameter 7th output value
r11 hypercall number 8th output value
r12 - volatile
Hypercall definitions are shared in generic code, so the same hypercall numbers
apply for x86 and powerpc alike with the exception that each KVM hypercall
also needs to be ORed with the KVM vendor code which is (42 << 16).
Return codes can be as follows:
Code Meaning
0 Success
12 Hypercall not implemented
<0 Error
The magic page
==============
To enable communication between the hypervisor and guest there is a new shared
page that contains parts of supervisor visible register state. The guest can
map this shared page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE.
With this hypercall issued the guest always gets the magic page mapped at the
desired location in effective and physical address space. For now, we always
map the page to -4096. This way we can access it using absolute load and store
functions. The following instruction reads the first field of the magic page:
ld rX, -4096(0)
The interface is designed to be extensible should there be need later to add
additional registers to the magic page. If you add fields to the magic page,
also define a new hypercall feature to indicate that the host can give you more
registers. Only if the host supports the additional features, make use of them.
The magic page has the following layout as described in
arch/powerpc/include/asm/kvm_para.h:
struct kvm_vcpu_arch_shared {
__u64 scratch1;
__u64 scratch2;
__u64 scratch3;
__u64 critical; /* Guest may not get interrupts if == r1 */
__u64 sprg0;
__u64 sprg1;
__u64 sprg2;
__u64 sprg3;
__u64 srr0;
__u64 srr1;
__u64 dar;
__u64 msr;
__u32 dsisr;
__u32 int_pending; /* Tells the guest if we have an interrupt */
};
Additions to the page must only occur at the end. Struct fields are always 32
or 64 bit aligned, depending on them being 32 or 64 bit wide respectively.
Magic page features
===================
When mapping the magic page using the KVM hypercall KVM_HC_PPC_MAP_MAGIC_PAGE,
a second return value is passed to the guest. This second return value contains
a bitmap of available features inside the magic page.
The following enhancements to the magic page are currently available:
KVM_MAGIC_FEAT_SR Maps SR registers r/w in the magic page
For enhanced features in the magic page, please check for the existence of the
feature before using them!
MSR bits
========
The MSR contains bits that require hypervisor intervention and bits that do
not require direct hypervisor intervention because they only get interpreted
when entering the guest or don't have any impact on the hypervisor's behavior.
The following bits are safe to be set inside the guest:
MSR_EE
MSR_RI
MSR_CR
MSR_ME
If any other bit changes in the MSR, please still use mtmsr(d).
Patched instructions
====================
The "ld" and "std" instructions are transormed to "lwz" and "stw" instructions
respectively on 32 bit systems with an added offset of 4 to accomodate for big
endianness.
The following is a list of mapping the Linux kernel performs when running as
guest. Implementing any of those mappings is optional, as the instruction traps
also act on the shared page. So calling privileged instructions still works as
before.
From To
==== ==
mfmsr rX ld rX, magic_page->msr
mfsprg rX, 0 ld rX, magic_page->sprg0
mfsprg rX, 1 ld rX, magic_page->sprg1
mfsprg rX, 2 ld rX, magic_page->sprg2
mfsprg rX, 3 ld rX, magic_page->sprg3
mfsrr0 rX ld rX, magic_page->srr0
mfsrr1 rX ld rX, magic_page->srr1
mfdar rX ld rX, magic_page->dar
mfdsisr rX lwz rX, magic_page->dsisr
mtmsr rX std rX, magic_page->msr
mtsprg 0, rX std rX, magic_page->sprg0
mtsprg 1, rX std rX, magic_page->sprg1
mtsprg 2, rX std rX, magic_page->sprg2
mtsprg 3, rX std rX, magic_page->sprg3
mtsrr0 rX std rX, magic_page->srr0
mtsrr1 rX std rX, magic_page->srr1
mtdar rX std rX, magic_page->dar
mtdsisr rX stw rX, magic_page->dsisr
tlbsync nop
mtmsrd rX, 0 b <special mtmsr section>
mtmsr rX b <special mtmsr section>
mtmsrd rX, 1 b <special mtmsrd section>
[Book3S only]
mtsrin rX, rY b <special mtsrin section>
[BookE only]
wrteei [0|1] b <special wrteei section>
Some instructions require more logic to determine what's going on than a load
or store instruction can deliver. To enable patching of those, we keep some
RAM around where we can live translate instructions to. What happens is the
following:
1) copy emulation code to memory
2) patch that code to fit the emulated instruction
3) patch that code to return to the original pc + 4
4) patch the original instruction to branch to the new code
That way we can inject an arbitrary amount of code as replacement for a single
instruction. This allows us to check for pending interrupts when setting EE=1
for example.
此差异已折叠。
......@@ -1639,15 +1639,6 @@ static void blk_request(struct virtqueue *vq)
*/
off = out->sector * 512;
/*
* The block device implements "barriers", where the Guest indicates
* that it wants all previous writes to occur before this write. We
* don't have a way of asking our kernel to do a barrier, so we just
* synchronize all the data in the file. Pretty poor, no?
*/
if (out->type & VIRTIO_BLK_T_BARRIER)
fdatasync(vblk->fd);
/*
* In general the virtio block driver is allowed to try SCSI commands.
* It'd be nice if we supported eject, for example, but we don't.
......@@ -1680,6 +1671,13 @@ static void blk_request(struct virtqueue *vq)
/* Die, bad Guest, die. */
errx(1, "Write past end %llu+%u", off, ret);
}
wlen = sizeof(*in);
*in = (ret >= 0 ? VIRTIO_BLK_S_OK : VIRTIO_BLK_S_IOERR);
} else if (out->type & VIRTIO_BLK_T_FLUSH) {
/* Flush */
ret = fdatasync(vblk->fd);
verbose("FLUSH fdatasync: %i\n", ret);
wlen = sizeof(*in);
*in = (ret >= 0 ? VIRTIO_BLK_S_OK : VIRTIO_BLK_S_IOERR);
} else {
......@@ -1703,15 +1701,6 @@ static void blk_request(struct virtqueue *vq)
}
}
/*
* OK, so we noted that it was pretty poor to use an fdatasync as a
* barrier. But Christoph Hellwig points out that we need a sync
* *afterwards* as well: "Barriers specify no reordering to the front
* or the back." And Jens Axboe confirmed it, so here we are:
*/
if (out->type & VIRTIO_BLK_T_BARRIER)
fdatasync(vblk->fd);
/* Finished that request. */
add_used(vq, head, wlen);
}
......@@ -1736,8 +1725,8 @@ static void setup_block_file(const char *filename)
vblk->fd = open_or_die(filename, O_RDWR|O_LARGEFILE);
vblk->len = lseek64(vblk->fd, 0, SEEK_END);
/* We support barriers. */
add_feature(dev, VIRTIO_BLK_F_BARRIER);
/* We support FLUSH. */
add_feature(dev, VIRTIO_BLK_F_FLUSH);
/* Tell Guest how many sectors this device has. */
conf.capacity = cpu_to_le64(vblk->len / 512);
......
Kernel driver apds990x
======================
Supported chips:
Avago APDS990X
Data sheet:
Not freely available
Author:
Samu Onkalo <samu.p.onkalo@nokia.com>
Description
-----------
APDS990x is a combined ambient light and proximity sensor. ALS and proximity
functionality are highly connected. ALS measurement path must be running
while the proximity functionality is enabled.
ALS produces raw measurement values for two channels: Clear channel
(infrared + visible light) and IR only. However, threshold comparisons happen
using clear channel only. Lux value and the threshold level on the HW
might vary quite much depending the spectrum of the light source.
Driver makes necessary conversions to both directions so that user handles
only lux values. Lux value is calculated using information from the both
channels. HW threshold level is calculated from the given lux value to match
with current type of the lightning. Sometimes inaccuracy of the estimations
lead to false interrupt, but that doesn't harm.
ALS contains 4 different gain steps. Driver automatically
selects suitable gain step. After each measurement, reliability of the results
is estimated and new measurement is trigged if necessary.
Platform data can provide tuned values to the conversion formulas if
values are known. Otherwise plain sensor default values are used.
Proximity side is little bit simpler. There is no need for complex conversions.
It produces directly usable values.
Driver controls chip operational state using pm_runtime framework.
Voltage regulators are controlled based on chip operational state.
SYSFS
-----
chip_id
RO - shows detected chip type and version
power_state
RW - enable / disable chip. Uses counting logic
1 enables the chip
0 disables the chip
lux0_input
RO - measured lux value
sysfs_notify called when threshold interrupt occurs
lux0_sensor_range
RO - lux0_input max value. Actually never reaches since sensor tends
to saturate much before that. Real max value varies depending
on the light spectrum etc.
lux0_rate
RW - measurement rate in Hz
lux0_rate_avail
RO - supported measurement rates
lux0_calibscale
RW - calibration value. Set to neutral value by default.
Output results are multiplied with calibscale / calibscale_default
value.
lux0_calibscale_default
RO - neutral calibration value
lux0_thresh_above_value
RW - HI level threshold value. All results above the value
trigs an interrupt. 65535 (i.e. sensor_range) disables the above
interrupt.
lux0_thresh_below_value
RW - LO level threshold value. All results below the value
trigs an interrupt. 0 disables the below interrupt.
prox0_raw
RO - measured proximity value
sysfs_notify called when threshold interrupt occurs
prox0_sensor_range
RO - prox0_raw max value (1023)
prox0_raw_en
RW - enable / disable proximity - uses counting logic
1 enables the proximity
0 disables the proximity
prox0_reporting_mode
RW - trigger / periodic. In "trigger" mode the driver tells two possible
values: 0 or prox0_sensor_range value. 0 means no proximity,
1023 means proximity. This causes minimal number of interrupts.
In "periodic" mode the driver reports all values above
prox0_thresh_above. This causes more interrupts, but it can give
_rough_ estimate about the distance.
prox0_reporting_mode_avail
RO - accepted values to prox0_reporting_mode (trigger, periodic)
prox0_thresh_above_value
RW - threshold level which trigs proximity events.
Kernel driver bh1770glc
=======================
Supported chips:
ROHM BH1770GLC
OSRAM SFH7770
Data sheet:
Not freely available
Author:
Samu Onkalo <samu.p.onkalo@nokia.com>
Description
-----------
BH1770GLC and SFH7770 are combined ambient light and proximity sensors.
ALS and proximity parts operates on their own, but they shares common I2C
interface and interrupt logic. In principle they can run on their own,
but ALS side results are used to estimate reliability of the proximity sensor.
ALS produces 16 bit lux values. The chip contains interrupt logic to produce
low and high threshold interrupts.
Proximity part contains IR-led driver up to 3 IR leds. The chip measures
amount of reflected IR light and produces proximity result. Resolution is
8 bit. Driver supports only one channel. Driver uses ALS results to estimate
reliability of the proximity results. Thus ALS is always running while
proximity detection is needed.
Driver uses threshold interrupts to avoid need for polling the values.
Proximity low interrupt doesn't exists in the chip. This is simulated
by using a delayed work. As long as there is proximity threshold above
interrupts the delayed work is pushed forward. So, when proximity level goes
below the threshold value, there is no interrupt and the delayed work will
finally run. This is handled as no proximity indication.
Chip state is controlled via runtime pm framework when enabled in config.
Calibscale factor is used to hide differences between the chips. By default
value set to neutral state meaning factor of 1.00. To get proper values,
calibrated source of light is needed as a reference. Calibscale factor is set
so that measurement produces about the expected lux value.
SYSFS
-----
chip_id
RO - shows detected chip type and version
power_state
RW - enable / disable chip. Uses counting logic
1 enables the chip
0 disables the chip
lux0_input
RO - measured lux value
sysfs_notify called when threshold interrupt occurs
lux0_sensor_range
RO - lux0_input max value
lux0_rate
RW - measurement rate in Hz
lux0_rate_avail
RO - supported measurement rates
lux0_thresh_above_value
RW - HI level threshold value. All results above the value
trigs an interrupt. 65535 (i.e. sensor_range) disables the above
interrupt.
lux0_thresh_below_value
RW - LO level threshold value. All results below the value
trigs an interrupt. 0 disables the below interrupt.
lux0_calibscale
RW - calibration value. Set to neutral value by default.
Output results are multiplied with calibscale / calibscale_default
value.
lux0_calibscale_default
RO - neutral calibration value
prox0_raw
RO - measured proximity value
sysfs_notify called when threshold interrupt occurs
prox0_sensor_range
RO - prox0_raw max value
prox0_raw_en
RW - enable / disable proximity - uses counting logic
1 enables the proximity
0 disables the proximity
prox0_thresh_above_count
RW - number of proximity interrupts needed before triggering the event
prox0_rate_above
RW - Measurement rate (in Hz) when the level is above threshold
i.e. when proximity on has been reported.
prox0_rate_below
RW - Measurement rate (in Hz) when the level is below threshold
i.e. when proximity off has been reported.
prox0_rate_avail
RO - Supported proximity measurement rates in Hz
prox0_thresh_above0_value
RW - threshold level which trigs proximity events.
Filtered by persistence filter (prox0_thresh_above_count)
prox0_thresh_above1_value
RW - threshold level which trigs event immediately
......@@ -765,6 +765,14 @@ xmit_hash_policy
does not exist, and the layer2 policy is the only policy. The
layer2+3 value was added for bonding version 3.2.2.
resend_igmp
Specifies the number of IGMP membership reports to be issued after
a failover event. One membership report is issued immediately after
the failover, subsequent packets are sent in each 200ms interval.
The valid range is 0 - 255; the default value is 1. This option
was added for bonding version 3.7.0.
3. Configuring Bonding Devices
==============================
......
......@@ -22,6 +22,7 @@ This file contains
4.1.2 RAW socket option CAN_RAW_ERR_FILTER
4.1.3 RAW socket option CAN_RAW_LOOPBACK
4.1.4 RAW socket option CAN_RAW_RECV_OWN_MSGS
4.1.5 RAW socket returned message flags
4.2 Broadcast Manager protocol sockets (SOCK_DGRAM)
4.3 connected transport protocols (SOCK_SEQPACKET)
4.4 unconnected transport protocols (SOCK_DGRAM)
......@@ -471,6 +472,17 @@ solution for a couple of reasons:
setsockopt(s, SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS,
&recv_own_msgs, sizeof(recv_own_msgs));
4.1.5 RAW socket returned message flags
When using recvmsg() call, the msg->msg_flags may contain following flags:
MSG_DONTROUTE: set when the received frame was created on the local host.
MSG_CONFIRM: set when the frame was sent via the socket it is received on.
This flag can be interpreted as a 'transmission confirmation' when the
CAN driver supports the echo of frames on driver level, see 3.2 and 6.2.
In order to receive such messages, CAN_RAW_RECV_OWN_MSGS must be set.
4.2 Broadcast Manager protocol sockets (SOCK_DGRAM)
4.3 connected transport protocols (SOCK_SEQPACKET)
4.4 unconnected transport protocols (SOCK_DGRAM)
......
DCCP protocol
============
=============
Contents
========
- Introduction
- Missing features
- Socket options
- Sysctl variables
- IOCTLs
- Other tunables
- Notes
Introduction
============
Datagram Congestion Control Protocol (DCCP) is an unreliable, connection
oriented protocol designed to solve issues present in UDP and TCP, particularly
for real-time and multimedia (streaming) traffic.
......@@ -29,9 +31,9 @@ It has a base protocol and pluggable congestion control IDs (CCIDs).
DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
is at http://www.ietf.org/html.charters/dccp-charter.html
Missing features
================
The Linux DCCP implementation does not currently support all the features that are
specified in RFCs 4340...42.
......@@ -45,7 +47,6 @@ http://linux-net.osdl.org/index.php/DCCP_Testing#Experimental_DCCP_source_tree
Socket options
==============
DCCP_SOCKOPT_SERVICE sets the service. The specification mandates use of
service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
the socket will fall back to 0 (which means that no meaningful service code
......@@ -112,6 +113,7 @@ DCCP_SOCKOPT_CCID_TX_INFO
On unidirectional connections it is useful to close the unused half-connection
via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.
Sysctl variables
================
Several DCCP default parameters can be managed by the following sysctls
......@@ -155,15 +157,30 @@ sync_ratelimit = 125 ms
sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
of this parameter is milliseconds; a value of 0 disables rate-limiting.
IOCTLS
======
FIONREAD
Works as in udp(7): returns in the `int' argument pointer the size of
the next pending datagram in bytes, or 0 when no datagram is pending.
Other tunables
==============
Per-route rto_min support
CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
of the RTO timer. This setting can be modified via the 'rto_min' option
of iproute2; for example:
> ip route change 10.0.0.0/24 rto_min 250j dev wlan0
> ip route add 10.0.0.254/32 rto_min 800j dev wlan0
> ip route show dev wlan0
CCID-3 also supports the rto_min setting: it is used to define the lower
bound for the expiry of the nofeedback timer. This can be useful on LANs
with very low RTTs (e.g., loopback, Gbit ethernet).
Notes
=====
DCCP does not travel through NAT successfully at present on many boxes. This is
because the checksum covers the pseudo-header as per TCP and UDP. Linux NAT
support for DCCP has been added.
......@@ -1014,6 +1014,12 @@ conf/interface/*:
accept_ra - BOOLEAN
Accept Router Advertisements; autoconfigure using them.
Possible values are:
0 Do not accept Router Advertisements.
1 Accept Router Advertisements if forwarding is disabled.
2 Overrule forwarding behaviour. Accept Router Advertisements
even if forwarding is enabled.
Functional default: enabled if local forwarding is disabled.
disabled if local forwarding is enabled.
......@@ -1075,7 +1081,12 @@ forwarding - BOOLEAN
Note: It is recommended to have the same setting on all
interfaces; mixed router/host scenarios are rather uncommon.
FALSE:
Possible values are:
0 Forwarding disabled
1 Forwarding enabled
2 Forwarding enabled (Hybrid Mode)
FALSE (0):
By default, Host behaviour is assumed. This means:
......@@ -1085,18 +1096,24 @@ forwarding - BOOLEAN
Advertisements (and do autoconfiguration).
4. If accept_redirects is TRUE (default), accept Redirects.
TRUE:
TRUE (1):
If local forwarding is enabled, Router behaviour is assumed.
This means exactly the reverse from the above:
1. IsRouter flag is set in Neighbour Advertisements.
2. Router Solicitations are not sent.
3. Router Advertisements are ignored.
3. Router Advertisements are ignored unless accept_ra is 2.
4. Redirects are ignored.
Default: FALSE if global forwarding is disabled (default),
otherwise TRUE.
TRUE (2):
Hybrid mode. Same behaviour as TRUE, except for:
2. Router Solicitations are being sent when necessary.
Default: 0 (disabled) if global forwarding is disabled (default),
otherwise 1 (enabled).
hop_limit - INTEGER
Default Hop Limit to set.
......
......@@ -112,6 +112,22 @@ However, connect() and getpeername() are not supported, as they did
not seem useful with Phonet usages (could be added easily).
Resource subscription
---------------------
A Phonet datagram socket can be subscribed to any number of 8-bits
Phonet resources, as follow:
uint32_t res = 0xXX;
ioctl(fd, SIOCPNADDRESOURCE, &res);
Subscription is similarly cancelled using the SIOCPNDELRESOURCE I/O
control request, or when the socket is closed.
Note that no more than one socket can be subcribed to any given
resource at a time. If not, ioctl() will return EBUSY.
Phonet Pipe protocol
--------------------
......@@ -166,6 +182,46 @@ The pipe protocol provides two socket options at the SOL_PNPIPE level:
or zero if encapsulation is off.
Phonet Pipe-controller Implementation
-------------------------------------
Phonet Pipe-controller is enabled by selecting the CONFIG_PHONET_PIPECTRLR Kconfig
option. It is useful when communicating with those Nokia Modems which do not
implement Pipe controller in them e.g. Nokia Slim Modem used in ST-Ericsson
U8500 platform.
The implementation is based on the Data Connection Establishment Sequence
depicted in 'Nokia Wireless Modem API - Wireless_modem_user_guide.pdf'
document.
It allows a phonet sequenced socket (host-pep) to initiate a Pipe connection
between itself and a remote pipe-end point (e.g. modem).
The implementation adds socket options at SOL_PNPIPE level:
PNPIPE_PIPE_HANDLE
It accepts an integer argument for setting value of pipe handle.
PNPIPE_ENABLE accepts one integer value (int). If set to zero, the pipe
is disabled. If the value is non-zero, the pipe is enabled. If the pipe
is not (yet) connected, ENOTCONN is error is returned.
The implementation also adds socket 'connect'. On calling the 'connect', pipe
will be created between the source socket and the destination, and the pipe
state will be set to PIPE_DISABLED.
After a pipe has been created and enabled successfully, the Pipe data can be
exchanged between the host-pep and remote-pep (modem).
User-space would typically follow below sequence with Pipe controller:-
-socket
-bind
-setsockopt for PNPIPE_PIPE_HANDLE
-connect
-setsockopt for PNPIPE_ENCAP_IP
-setsockopt for PNPIPE_ENABLE
Authors
-------
......
......@@ -177,18 +177,6 @@ Doing it all yourself
A convenience function to print out the PHY status neatly.
int phy_clear_interrupt(struct phy_device *phydev);
int phy_config_interrupt(struct phy_device *phydev, u32 interrupts);
Clear the PHY's interrupt, and configure which ones are allowed,
respectively. Currently only supports all on, or all off.
int phy_enable_interrupts(struct phy_device *phydev);
int phy_disable_interrupts(struct phy_device *phydev);
Functions which enable/disable PHY interrupts, clearing them
before and after, respectively.
int phy_start_interrupts(struct phy_device *phydev);
int phy_stop_interrupts(struct phy_device *phydev);
......@@ -213,12 +201,6 @@ Doing it all yourself
Fills the phydev structure with up-to-date information about the current
settings in the PHY.
void phy_sanitize_settings(struct phy_device *phydev)
Resolves differences between currently desired settings, and
supported settings for the given PHY device. Does not make
the changes in the hardware, though.
int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd);
int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd);
......
......@@ -172,15 +172,19 @@ struct skb_shared_hwtstamps {
};
Time stamps for outgoing packets are to be generated as follows:
- In hard_start_xmit(), check if skb_tx(skb)->hardware is set no-zero.
If yes, then the driver is expected to do hardware time stamping.
- In hard_start_xmit(), check if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
is set no-zero. If yes, then the driver is expected to do hardware time
stamping.
- If this is possible for the skb and requested, then declare
that the driver is doing the time stamping by setting the field
skb_tx(skb)->in_progress non-zero. You might want to keep a pointer
to the associated skb for the next step and not free the skb. A driver
not supporting hardware time stamping doesn't do that. A driver must
never touch sk_buff::tstamp! It is used to store software generated
time stamps by the network subsystem.
that the driver is doing the time stamping by setting the flag
SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags , e.g. with
skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
You might want to keep a pointer to the associated skb for the next step
and not free the skb. A driver not supporting hardware time stamping doesn't
do that. A driver must never touch sk_buff::tstamp! It is used to store
software generated time stamps by the network subsystem.
- As soon as the driver has sent the packet and/or obtained a
hardware time stamp for it, it passes the time stamp back by
calling skb_hwtstamp_tx() with the original skb, the raw
......@@ -191,6 +195,6 @@ Time stamps for outgoing packets are to be generated as follows:
this would occur at a later time in the processing pipeline than other
software time stamping and therefore could lead to unexpected deltas
between time stamps.
- If the driver did not call set skb_tx(skb)->in_progress, then
- If the driver did not set the SKBTX_IN_PROGRESS flag (see above), then
dev_hard_start_xmit() checks whether software time stamping
is wanted as fallback and potentially generates the time stamp.
......@@ -8,6 +8,7 @@ and additions :
Required properties :
- compatible : Should be "fsl-usb2-mph" for multi port host USB
controllers, or "fsl-usb2-dr" for dual role USB controllers
or "fsl,mpc5121-usb2-dr" for dual role USB controllers of MPC5121
- phy_type : For multi port host USB controllers, should be one of
"ulpi", or "serial". For dual role USB controllers, should be
one of "ulpi", "utmi", "utmi_wide", or "serial".
......@@ -33,6 +34,12 @@ Recommended properties :
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
Optional properties :
- fsl,invert-drvvbus : boolean; for MPC5121 USB0 only. Indicates the
port power polarity of internal PHY signal DRVVBUS is inverted.
- fsl,invert-pwr-fault : boolean; for MPC5121 USB0 only. Indicates
the PWR_FAULT signal polarity is inverted.
Example multi port host USB controller device node :
usb@22000 {
compatible = "fsl-usb2-mph";
......@@ -57,3 +64,18 @@ Example dual role USB controller device node :
dr_mode = "otg";
phy = "ulpi";
};
Example dual role USB controller device node for MPC5121ADS:
usb@4000 {
compatible = "fsl,mpc5121-usb2-dr";
reg = <0x4000 0x1000>;
#address-cells = <1>;
#size-cells = <0>;
interrupt-parent = < &ipic >;
interrupts = <44 0x8>;
dr_mode = "otg";
phy_type = "utmi_wide";
fsl,invert-drvvbus;
fsl,invert-pwr-fault;
};
......@@ -2,7 +2,7 @@ This file contains brief information about the SCSI tape driver.
The driver is currently maintained by Kai Mäkisara (email
Kai.Makisara@kolumbus.fi)
Last modified: Sun Feb 24 21:59:07 2008 by kai.makisara
Last modified: Sun Aug 29 18:25:47 2010 by kai.makisara
BASICS
......@@ -85,6 +85,17 @@ writing and the last operation has been a write. Two filemarks can be
optionally written. In both cases end of data is signified by
returning zero bytes for two consecutive reads.
Writing filemarks without the immediate bit set in the SCSI command block acts
as a synchronization point, i.e., all remaining data form the drive buffers is
written to tape before the command returns. This makes sure that write errors
are caught at that point, but this takes time. In some applications, several
consecutive files must be written fast. The MTWEOFI operation can be used to
write the filemarks without flushing the drive buffer. Writing filemark at
close() is always flushing the drive buffers. However, if the previous
operation is MTWEOFI, close() does not write a filemark. This can be used if
the program wants to close/open the tape device between files and wants to
skip waiting.
If rewind, offline, bsf, or seek is done and previous tape operation was
write, a filemark is written before moving tape.
......@@ -301,6 +312,8 @@ MTBSR Space backward over count records.
MTFSS Space forward over count setmarks.
MTBSS Space backward over count setmarks.
MTWEOF Write count filemarks.
MTWEOFI Write count filemarks with immediate bit set (i.e., does not
wait until data is on tape)
MTWSM Write count setmarks.
MTREW Rewind tape.
MTOFFL Set device off line (often rewind plus eject).
......
......@@ -300,6 +300,74 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
control correctly. If you have problems regarding this, try
another ALSA compliant mixer (alsamixer works).
Module snd-azt1605
------------------
Module for Aztech Sound Galaxy soundcards based on the Aztech AZT1605
chipset.
port - port # for BASE (0x220,0x240,0x260,0x280)
wss_port - port # for WSS (0x530,0x604,0xe80,0xf40)
irq - IRQ # for WSS (7,9,10,11)
dma1 - DMA # for WSS playback (0,1,3)
dma2 - DMA # for WSS capture (0,1), -1 = disabled (default)
mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disabled (default)
mpu_irq - IRQ # for MPU-401 UART (3,5,7,9), -1 = disabled (default)
fm_port - port # for OPL3 (0x388), -1 = disabled (default)
This module supports multiple cards. It does not support autoprobe: port,
wss_port, irq and dma1 have to be specified. The other values are
optional.
"port" needs to match the BASE ADDRESS jumper on the card (0x220 or 0x240)
or the value stored in the card's EEPROM for cards that have an EEPROM and
their "CONFIG MODE" jumper set to "EEPROM SETTING". The other values can
be choosen freely from the options enumerated above.
If dma2 is specified and different from dma1, the card will operate in
full-duplex mode. When dma1=3, only dma2=0 is valid and the only way to
enable capture since only channels 0 and 1 are available for capture.
Generic settings are "port=0x220 wss_port=0x530 irq=10 dma1=1 dma2=0
mpu_port=0x330 mpu_irq=9 fm_port=0x388".
Whatever IRQ and DMA channels you pick, be sure to reserve them for
legacy ISA in your BIOS.
Module snd-azt2316
------------------
Module for Aztech Sound Galaxy soundcards based on the Aztech AZT2316
chipset.
port - port # for BASE (0x220,0x240,0x260,0x280)
wss_port - port # for WSS (0x530,0x604,0xe80,0xf40)
irq - IRQ # for WSS (7,9,10,11)
dma1 - DMA # for WSS playback (0,1,3)
dma2 - DMA # for WSS capture (0,1), -1 = disabled (default)
mpu_port - port # for MPU-401 UART (0x300,0x330), -1 = disabled (default)
mpu_irq - IRQ # for MPU-401 UART (5,7,9,10), -1 = disabled (default)
fm_port - port # for OPL3 (0x388), -1 = disabled (default)
This module supports multiple cards. It does not support autoprobe: port,
wss_port, irq and dma1 have to be specified. The other values are
optional.
"port" needs to match the BASE ADDRESS jumper on the card (0x220 or 0x240)
or the value stored in the card's EEPROM for cards that have an EEPROM and
their "CONFIG MODE" jumper set to "EEPROM SETTING". The other values can
be choosen freely from the options enumerated above.
If dma2 is specified and different from dma1, the card will operate in
full-duplex mode. When dma1=3, only dma2=0 is valid and the only way to
enable capture since only channels 0 and 1 are available for capture.
Generic settings are "port=0x220 wss_port=0x530 irq=10 dma1=1 dma2=0
mpu_port=0x330 mpu_irq=9 fm_port=0x388".
Whatever IRQ and DMA channels you pick, be sure to reserve them for
legacy ISA in your BIOS.
Module snd-aw2
--------------
......@@ -1641,20 +1709,6 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed.
This card is also known as Audio Excel DSP 16 or Zoltrix AV302.
Module snd-sgalaxy
------------------
Module for Aztech Sound Galaxy sound card.
sbport - Port # for SB16 interface (0x220,0x240)
wssport - Port # for WSS interface (0x530,0xe80,0xf40,0x604)
irq - IRQ # (7,9,10,11)
dma1 - DMA #
This module supports multiple cards.
The power-management is supported.
Module snd-sscape
-----------------
......
......@@ -57,9 +57,11 @@ dead. However, this detection isn't perfect on some devices. In such
a case, you can change the default method via `position_fix` option.
`position_fix=1` means to use LPIB method explicitly.
`position_fix=2` means to use the position-buffer. 0 is the default
value, the automatic check and fallback to LPIB as described in the
above. If you get a problem of repeated sounds, this option might
`position_fix=2` means to use the position-buffer.
`position_fix=3` means to use a combination of both methods, needed
for some VIA and ATI controllers. 0 is the default value for all other
controllers, the automatic check and fallback to LPIB as described in
the above. If you get a problem of repeated sounds, this option might
help.
In addition to that, every controller is known to be broken regarding
......
......@@ -80,8 +80,10 @@ dirty_background_bytes
Contains the amount of dirty memory at which the pdflush background writeback
daemon will start writeback.
If dirty_background_bytes is written, dirty_background_ratio becomes a function
of its value (dirty_background_bytes / the amount of dirtyable system memory).
Note: dirty_background_bytes is the counterpart of dirty_background_ratio. Only
one of them may be specified at a time. When one sysctl is written it is
immediately taken into account to evaluate the dirty memory limits and the
other appears as 0 when read.
==============================================================
......@@ -97,8 +99,10 @@ dirty_bytes
Contains the amount of dirty memory at which a process generating disk writes
will itself start writeback.
If dirty_bytes is written, dirty_ratio becomes a function of its value
(dirty_bytes / the amount of dirtyable system memory).
Note: dirty_bytes is the counterpart of dirty_ratio. Only one of them may be
specified at a time. When one sysctl is written it is immediately taken into
account to evaluate the dirty memory limits and the other appears as 0 when
read.
Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
value lower than this limit will be ignored and the old configuration will be
......
......@@ -75,7 +75,7 @@ On all - write a character to /proc/sysrq-trigger. e.g.:
'f' - Will call oom_kill to kill a memory hog process.
'g' - Used by kgdb on ppc and sh platforms.
'g' - Used by kgdb (kernel debugger)
'h' - Will display help (actually any other key than those listed
here will display help. but 'h' is easy to remember :-)
......@@ -110,12 +110,15 @@ On all - write a character to /proc/sysrq-trigger. e.g.:
'u' - Will attempt to remount all mounted filesystems read-only.
'v' - Dumps Voyager SMP processor info to your console.
'v' - Forcefully restores framebuffer console
'v' - Causes ETM buffer dump [ARM-specific]
'w' - Dumps tasks that are in uninterruptable (blocked) state.
'x' - Used by xmon interface on ppc/powerpc platforms.
'y' - Show global CPU Registers [SPARC-64 specific]
'z' - Dump the ftrace buffer
'0'-'9' - Sets the console log level, controlling which kernel messages
......
......@@ -97,6 +97,33 @@ hpet_open_close(int argc, const char **argv)
void
hpet_info(int argc, const char **argv)
{
struct hpet_info info;
int fd;
if (argc != 1) {
fprintf(stderr, "hpet_info: device-name\n");
return;
}
fd = open(argv[0], O_RDONLY);
if (fd < 0) {
fprintf(stderr, "hpet_info: open of %s failed\n", argv[0]);
return;
}
if (ioctl(fd, HPET_INFO, &info) < 0) {
fprintf(stderr, "hpet_info: failed to get info\n");
goto out;
}
fprintf(stderr, "hpet_info: hi_irqfreq 0x%lx hi_flags 0x%lx ",
info.hi_ireqfreq, info.hi_flags);
fprintf(stderr, "hi_hpet %d hi_timer %d\n",
info.hi_hpet, info.hi_timer);
out:
close(fd);
return;
}
void
......
......@@ -46,7 +46,7 @@ use constant HIGH_KSWAPD_LATENCY => 20;
use constant HIGH_KSWAPD_REWAKEUP => 21;
use constant HIGH_NR_SCANNED => 22;
use constant HIGH_NR_TAKEN => 23;
use constant HIGH_NR_RECLAIM => 24;
use constant HIGH_NR_RECLAIMED => 24;
use constant HIGH_NR_CONTIG_DIRTY => 25;
my %perprocesspid;
......@@ -58,11 +58,13 @@ my $opt_read_procstat;
my $total_wakeup_kswapd;
my ($total_direct_reclaim, $total_direct_nr_scanned);
my ($total_direct_latency, $total_kswapd_latency);
my ($total_direct_nr_reclaimed);
my ($total_direct_writepage_file_sync, $total_direct_writepage_file_async);
my ($total_direct_writepage_anon_sync, $total_direct_writepage_anon_async);
my ($total_kswapd_nr_scanned, $total_kswapd_wake);
my ($total_kswapd_writepage_file_sync, $total_kswapd_writepage_file_async);
my ($total_kswapd_writepage_anon_sync, $total_kswapd_writepage_anon_async);
my ($total_kswapd_nr_reclaimed);
# Catch sigint and exit on request
my $sigint_report = 0;
......@@ -104,7 +106,7 @@ my $regex_kswapd_wake_default = 'nid=([0-9]*) order=([0-9]*)';
my $regex_kswapd_sleep_default = 'nid=([0-9]*)';
my $regex_wakeup_kswapd_default = 'nid=([0-9]*) zid=([0-9]*) order=([0-9]*)';
my $regex_lru_isolate_default = 'isolate_mode=([0-9]*) order=([0-9]*) nr_requested=([0-9]*) nr_scanned=([0-9]*) nr_taken=([0-9]*) contig_taken=([0-9]*) contig_dirty=([0-9]*) contig_failed=([0-9]*)';
my $regex_lru_shrink_inactive_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*)';
my $regex_lru_shrink_inactive_default = 'nid=([0-9]*) zid=([0-9]*) nr_scanned=([0-9]*) nr_reclaimed=([0-9]*) priority=([0-9]*) flags=([A-Z_|]*)';
my $regex_lru_shrink_active_default = 'lru=([A-Z_]*) nr_scanned=([0-9]*) nr_rotated=([0-9]*) priority=([0-9]*)';
my $regex_writepage_default = 'page=([0-9a-f]*) pfn=([0-9]*) flags=([A-Z_|]*)';
......@@ -203,8 +205,8 @@ $regex_lru_shrink_inactive = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_inactive",
$regex_lru_shrink_inactive_default,
"nid", "zid",
"lru",
"nr_scanned", "nr_reclaimed", "priority");
"nr_scanned", "nr_reclaimed", "priority",
"flags");
$regex_lru_shrink_active = generate_traceevent_regex(
"vmscan/mm_vmscan_lru_shrink_active",
$regex_lru_shrink_active_default,
......@@ -375,6 +377,16 @@ EVENT_PROCESS:
my $nr_contig_dirty = $7;
$perprocesspid{$process_pid}->{HIGH_NR_SCANNED} += $nr_scanned;
$perprocesspid{$process_pid}->{HIGH_NR_CONTIG_DIRTY} += $nr_contig_dirty;
} elsif ($tracepoint eq "mm_vmscan_lru_shrink_inactive") {
$details = $5;
if ($details !~ /$regex_lru_shrink_inactive/o) {
print "WARNING: Failed to parse mm_vmscan_lru_shrink_inactive as expected\n";
print " $details\n";
print " $regex_lru_shrink_inactive/o\n";
next;
}
my $nr_reclaimed = $4;
$perprocesspid{$process_pid}->{HIGH_NR_RECLAIMED} += $nr_reclaimed;
} elsif ($tracepoint eq "mm_vmscan_writepage") {
$details = $5;
if ($details !~ /$regex_writepage/o) {
......@@ -464,8 +476,8 @@ sub dump_stats {
# Print out process activity
printf("\n");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s\n", "Process", "Direct", "Wokeup", "Pages", "Pages", "Pages", "Time");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s\n", "details", "Rclms", "Kswapd", "Scanned", "Sync-IO", "ASync-IO", "Stalled");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s %8s\n", "Process", "Direct", "Wokeup", "Pages", "Pages", "Pages", "Pages", "Time");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s %8s %8s\n", "details", "Rclms", "Kswapd", "Scanned", "Rclmed", "Sync-IO", "ASync-IO", "Stalled");
foreach $process_pid (keys %stats) {
if (!$stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN}) {
......@@ -475,6 +487,7 @@ sub dump_stats {
$total_direct_reclaim += $stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN};
$total_wakeup_kswapd += $stats{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD};
$total_direct_nr_scanned += $stats{$process_pid}->{HIGH_NR_SCANNED};
$total_direct_nr_reclaimed += $stats{$process_pid}->{HIGH_NR_RECLAIMED};
$total_direct_writepage_file_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$total_direct_writepage_anon_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$total_direct_writepage_file_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
......@@ -489,11 +502,12 @@ sub dump_stats {
$index++;
}
printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8u %8.3f",
printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8u %8u %8.3f",
$process_pid,
$stats{$process_pid}->{MM_VMSCAN_DIRECT_RECLAIM_BEGIN},
$stats{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD},
$stats{$process_pid}->{HIGH_NR_SCANNED},
$stats{$process_pid}->{HIGH_NR_RECLAIMED},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC},
$this_reclaim_delay / 1000);
......@@ -529,8 +543,8 @@ sub dump_stats {
# Print out kswapd activity
printf("\n");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Kswapd", "Kswapd", "Order", "Pages", "Pages", "Pages");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Instance", "Wakeups", "Re-wakeup", "Scanned", "Sync-IO", "ASync-IO");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Kswapd", "Kswapd", "Order", "Pages", "Pages", "Pages", "Pages");
printf("%-" . $max_strlen . "s %8s %10s %8s %8s %8s %8s\n", "Instance", "Wakeups", "Re-wakeup", "Scanned", "Rclmed", "Sync-IO", "ASync-IO");
foreach $process_pid (keys %stats) {
if (!$stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE}) {
......@@ -539,16 +553,18 @@ sub dump_stats {
$total_kswapd_wake += $stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE};
$total_kswapd_nr_scanned += $stats{$process_pid}->{HIGH_NR_SCANNED};
$total_kswapd_nr_reclaimed += $stats{$process_pid}->{HIGH_NR_RECLAIMED};
$total_kswapd_writepage_file_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$total_kswapd_writepage_anon_sync += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$total_kswapd_writepage_file_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
$total_kswapd_writepage_anon_async += $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC};
printf("%-" . $max_strlen . "s %8d %10d %8u %8i %8u",
printf("%-" . $max_strlen . "s %8d %10d %8u %8u %8i %8u",
$process_pid,
$stats{$process_pid}->{MM_VMSCAN_KSWAPD_WAKE},
$stats{$process_pid}->{HIGH_KSWAPD_REWAKEUP},
$stats{$process_pid}->{HIGH_NR_SCANNED},
$stats{$process_pid}->{HIGH_NR_RECLAIMED},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC},
$stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} + $stats{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_ASYNC});
......@@ -579,6 +595,7 @@ sub dump_stats {
print "\nSummary\n";
print "Direct reclaims: $total_direct_reclaim\n";
print "Direct reclaim pages scanned: $total_direct_nr_scanned\n";
print "Direct reclaim pages reclaimed: $total_direct_nr_reclaimed\n";
print "Direct reclaim write file sync I/O: $total_direct_writepage_file_sync\n";
print "Direct reclaim write anon sync I/O: $total_direct_writepage_anon_sync\n";
print "Direct reclaim write file async I/O: $total_direct_writepage_file_async\n";
......@@ -588,6 +605,7 @@ sub dump_stats {
print "\n";
print "Kswapd wakeups: $total_kswapd_wake\n";
print "Kswapd pages scanned: $total_kswapd_nr_scanned\n";
print "Kswapd pages reclaimed: $total_kswapd_nr_reclaimed\n";
print "Kswapd reclaim write file sync I/O: $total_kswapd_writepage_file_sync\n";
print "Kswapd reclaim write anon sync I/O: $total_kswapd_writepage_anon_sync\n";
print "Kswapd reclaim write file async I/O: $total_kswapd_writepage_file_async\n";
......@@ -612,6 +630,7 @@ sub aggregate_perprocesspid() {
$perprocess{$process}->{MM_VMSCAN_WAKEUP_KSWAPD} += $perprocesspid{$process_pid}->{MM_VMSCAN_WAKEUP_KSWAPD};
$perprocess{$process}->{HIGH_KSWAPD_REWAKEUP} += $perprocesspid{$process_pid}->{HIGH_KSWAPD_REWAKEUP};
$perprocess{$process}->{HIGH_NR_SCANNED} += $perprocesspid{$process_pid}->{HIGH_NR_SCANNED};
$perprocess{$process}->{HIGH_NR_RECLAIMED} += $perprocesspid{$process_pid}->{HIGH_NR_RECLAIMED};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_SYNC};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_ANON_SYNC};
$perprocess{$process}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC} += $perprocesspid{$process_pid}->{MM_VMSCAN_WRITEPAGE_FILE_ASYNC};
......
/proc/bus/usb filesystem output
===============================
(version 2003.05.30)
(version 2010.09.13)
The usbfs filesystem for USB devices is traditionally mounted at
/proc/bus/usb. It provides the /proc/bus/usb/devices file, as well as
the /proc/bus/usb/BBB/DDD files.
In many modern systems the usbfs filsystem isn't used at all. Instead
USB device nodes are created under /dev/usb/ or someplace similar. The
"devices" file is available in debugfs, typically as
/sys/kernel/debug/usb/devices.
**NOTE**: If /proc/bus/usb appears empty, and a host controller
driver has been linked, then you need to mount the
......@@ -106,8 +111,8 @@ Legend:
Topology info:
T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=ddd MxCh=dd
| | | | | | | | |__MaxChildren
T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=dddd MxCh=dd
| | | | | | | | |__MaxChildren
| | | | | | | |__Device Speed in Mbps
| | | | | | |__DeviceNumber
| | | | | |__Count of devices at this level
......@@ -120,8 +125,13 @@ T: Bus=dd Lev=dd Prnt=dd Port=dd Cnt=dd Dev#=ddd Spd=ddd MxCh=dd
Speed may be:
1.5 Mbit/s for low speed USB
12 Mbit/s for full speed USB
480 Mbit/s for high speed USB (added for USB 2.0)
480 Mbit/s for high speed USB (added for USB 2.0);
also used for Wireless USB, which has no fixed speed
5000 Mbit/s for SuperSpeed USB (added for USB 3.0)
For reasons lost in the mists of time, the Port number is always
too low by 1. For example, a device plugged into port 4 will
show up with "Port=03".
Bandwidth info:
B: Alloc=ddd/ddd us (xx%), #Int=ddd, #Iso=ddd
......@@ -291,7 +301,7 @@ Here's an example, from a system which has a UHCI root hub,
an external hub connected to the root hub, and a mouse and
a serial converter connected to the external hub.
T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
B: Alloc= 28/900 us ( 3%), #Int= 2, #Iso= 0
D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0000 ProdID=0000 Rev= 0.00
......@@ -301,21 +311,21 @@ C:* #Ifs= 1 Cfg#= 1 Atr=40 MxPwr= 0mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 8 Ivl=255ms
T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
D: Ver= 1.00 Cls=09(hub ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0451 ProdID=1446 Rev= 1.00
C:* #Ifs= 1 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
E: Ad=81(I) Atr=03(Int.) MxPS= 1 Ivl=255ms
T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=04b4 ProdID=0001 Rev= 0.00
C:* #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse
E: Ad=81(I) Atr=03(Int.) MxPS= 3 Ivl= 10ms
T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
D: Ver= 1.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 8 #Cfgs= 1
P: Vendor=0565 ProdID=0001 Rev= 1.08
S: Manufacturer=Peracom Networks, Inc.
......@@ -330,12 +340,12 @@ E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl= 8ms
Selecting only the "T:" and "I:" lines from this (for example, by using
"procusb ti"), we have:
T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
T: Bus=00 Lev=00 Prnt=00 Port=00 Cnt=00 Dev#= 1 Spd=12 MxCh= 2
T: Bus=00 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=12 MxCh= 4
I: If#= 0 Alt= 0 #EPs= 1 Cls=09(hub ) Sub=00 Prot=00 Driver=hub
T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
T: Bus=00 Lev=02 Prnt=02 Port=00 Cnt=01 Dev#= 3 Spd=1.5 MxCh= 0
I: If#= 0 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=01 Prot=02 Driver=mouse
T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
T: Bus=00 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
I: If#= 0 Alt= 0 #EPs= 3 Cls=00(>ifc ) Sub=00 Prot=00 Driver=serial
......
====================
HIGH MEMORY HANDLING
====================
By: Peter Zijlstra <a.p.zijlstra@chello.nl>
Contents:
(*) What is high memory?
(*) Temporary virtual mappings.
(*) Using kmap_atomic.
(*) Cost of temporary mappings.
(*) i386 PAE.
====================
WHAT IS HIGH MEMORY?
====================
High memory (highmem) is used when the size of physical memory approaches or
exceeds the maximum size of virtual memory. At that point it becomes
impossible for the kernel to keep all of the available physical memory mapped
at all times. This means the kernel needs to start using temporary mappings of
the pieces of physical memory that it wants to access.
The part of (physical) memory not covered by a permanent mapping is what we
refer to as 'highmem'. There are various architecture dependent constraints on
where exactly that border lies.
In the i386 arch, for example, we choose to map the kernel into every process's
VM space so that we don't have to pay the full TLB invalidation costs for
kernel entry/exit. This means the available virtual memory space (4GiB on
i386) has to be divided between user and kernel space.
The traditional split for architectures using this approach is 3:1, 3GiB for
userspace and the top 1GiB for kernel space:
+--------+ 0xffffffff
| Kernel |
+--------+ 0xc0000000
| |
| User |
| |
+--------+ 0x00000000
This means that the kernel can at most map 1GiB of physical memory at any one
time, but because we need virtual address space for other things - including
temporary maps to access the rest of the physical memory - the actual direct
map will typically be less (usually around ~896MiB).
Other architectures that have mm context tagged TLBs can have separate kernel
and user maps. Some hardware (like some ARMs), however, have limited virtual
space when they use mm context tags.
==========================
TEMPORARY VIRTUAL MAPPINGS
==========================
The kernel contains several ways of creating temporary mappings:
(*) vmap(). This can be used to make a long duration mapping of multiple
physical pages into a contiguous virtual space. It needs global
synchronization to unmap.
(*) kmap(). This permits a short duration mapping of a single page. It needs
global synchronization, but is amortized somewhat. It is also prone to
deadlocks when using in a nested fashion, and so it is not recommended for
new code.
(*) kmap_atomic(). This permits a very short duration mapping of a single
page. Since the mapping is restricted to the CPU that issued it, it
performs well, but the issuing task is therefore required to stay on that
CPU until it has finished, lest some other task displace its mappings.
kmap_atomic() may also be used by interrupt contexts, since it is does not
sleep and the caller may not sleep until after kunmap_atomic() is called.
It may be assumed that k[un]map_atomic() won't fail.
=================
USING KMAP_ATOMIC
=================
When and where to use kmap_atomic() is straightforward. It is used when code
wants to access the contents of a page that might be allocated from high memory
(see __GFP_HIGHMEM), for example a page in the pagecache. The API has two
functions, and they can be used in a manner similar to the following:
/* Find the page of interest. */
struct page *page = find_get_page(mapping, offset);
/* Gain access to the contents of that page. */
void *vaddr = kmap_atomic(page);
/* Do something to the contents of that page. */
memset(vaddr, 0, PAGE_SIZE);
/* Unmap that page. */
kunmap_atomic(vaddr);
Note that the kunmap_atomic() call takes the result of the kmap_atomic() call
not the argument.
If you need to map two pages because you want to copy from one page to
another you need to keep the kmap_atomic calls strictly nested, like:
vaddr1 = kmap_atomic(page1);
vaddr2 = kmap_atomic(page2);
memcpy(vaddr1, vaddr2, PAGE_SIZE);
kunmap_atomic(vaddr2);
kunmap_atomic(vaddr1);
==========================
COST OF TEMPORARY MAPPINGS
==========================
The cost of creating temporary mappings can be quite high. The arch has to
manipulate the kernel's page tables, the data TLB and/or the MMU's registers.
If CONFIG_HIGHMEM is not set, then the kernel will try and create a mapping
simply with a bit of arithmetic that will convert the page struct address into
a pointer to the page contents rather than juggling mappings about. In such a
case, the unmap operation may be a null operation.
If CONFIG_MMU is not set, then there can be no temporary mappings and no
highmem. In such a case, the arithmetic approach will also be used.
========
i386 PAE
========
The i386 arch, under some circumstances, will permit you to stick up to 64GiB
of RAM into your 32-bit machine. This has a number of consequences:
(*) Linux needs a page-frame structure for each page in the system and the
pageframes need to live in the permanent mapping, which means:
(*) you can have 896M/sizeof(struct page) page-frames at most; with struct
page being 32-bytes that would end up being something in the order of 112G
worth of pages; the kernel, however, needs to store more than just
page-frames in that memory...
(*) PAE makes your page tables larger - which slows the system down as more
data has to be accessed to traverse in TLB fills and the like. One
advantage is that PAE has more PTE bits and can provide advanced features
like NX and PAT.
The general recommendation is that you don't use more than 8GiB on a 32-bit
machine - although more might work for you and your workload, you're pretty
much on your own - don't expect kernel developers to really care much if things
come apart.
......@@ -424,7 +424,7 @@ a command line tool, numactl(8), exists that allows one to:
+ set the shared policy for a shared memory segment via mbind(2)
The numactl(8) tool is packages with the run-time version of the library
The numactl(8) tool is packaged with the run-time version of the library
containing the memory policy system call wrappers. Some distributions
package the headers and compile-time libraries in a separate development
package.
......
......@@ -196,11 +196,11 @@ resources, scheduled and executed.
suspend operations. Work items on the wq are drained and no
new work item starts execution until thawed.
WQ_RESCUER
WQ_MEM_RECLAIM
All wq which might be used in the memory reclaim paths _MUST_
have this flag set. This reserves one worker exclusively for
the execution of this wq under memory pressure.
have this flag set. The wq is guaranteed to have at least one
execution context regardless of memory pressure.
WQ_HIGHPRI
......@@ -356,11 +356,11 @@ If q1 has WQ_CPU_INTENSIVE set,
6. Guidelines
* Do not forget to use WQ_RESCUER if a wq may process work items which
are used during memory reclaim. Each wq with WQ_RESCUER set has one
rescuer thread reserved for it. If there is dependency among
multiple work items used during memory reclaim, they should be
queued to separate wq each with WQ_RESCUER.
* Do not forget to use WQ_MEM_RECLAIM if a wq may process work items
which are used during memory reclaim. Each wq with WQ_MEM_RECLAIM
set has an execution context reserved for it. If there is
dependency among multiple work items used during memory reclaim,
they should be queued to separate wq each with WQ_MEM_RECLAIM.
* Unless strict ordering is required, there is no need to use ST wq.
......@@ -368,12 +368,13 @@ If q1 has WQ_CPU_INTENSIVE set,
recommended. In most use cases, concurrency level usually stays
well under the default limit.
* A wq serves as a domain for forward progress guarantee (WQ_RESCUER),
flush and work item attributes. Work items which are not involved
in memory reclaim and don't need to be flushed as a part of a group
of work items, and don't require any special attribute, can use one
of the system wq. There is no difference in execution
characteristics between using a dedicated wq and a system wq.
* A wq serves as a domain for forward progress guarantee
(WQ_MEM_RECLAIM, flush and work item attributes. Work items which
are not involved in memory reclaim and don't need to be flushed as a
part of a group of work items, and don't require any special
attribute, can use one of the system wq. There is no difference in
execution characteristics between using a dedicated wq and a system
wq.
* Unless work items are expected to consume a huge amount of CPU
cycles, using a bound wq is usually beneficial due to the increased
......
......@@ -18,9 +18,9 @@ specialized stacks contain no useful data. The main CPU stacks are:
Used for external hardware interrupts. If this is the first external
hardware interrupt (i.e. not a nested hardware interrupt) then the
kernel switches from the current task to the interrupt stack. Like
the split thread and interrupt stacks on i386 (with CONFIG_4KSTACKS),
this gives more room for kernel interrupt processing without having
to increase the size of every per thread stack.
the split thread and interrupt stacks on i386, this gives more room
for kernel interrupt processing without having to increase the size
of every per thread stack.
The interrupt stack is also used when processing a softirq.
......
......@@ -53,6 +53,7 @@ targets += arch/$(SRCARCH)/kernel/asm-offsets.s
# Default sed regexp - multiline due to syntax constraints
define sed-y
"/^->/{s:->#\(.*\):/* \1 */:; \
s:^->\([^ ]*\) [\$$#]*\([-0-9]*\) \(.*\):#define \1 \2 /* \3 */:; \
s:^->\([^ ]*\) [\$$#]*\([^ ]*\) \(.*\):#define \1 \2 /* \3 */:; \
s:->::; p;}"
endef
......
此差异已折叠。
......@@ -42,6 +42,20 @@ config KPROBES
for kernel debugging, non-intrusive instrumentation and testing.
If in doubt, say "N".
config JUMP_LABEL
bool "Optimize trace point call sites"
depends on HAVE_ARCH_JUMP_LABEL
help
If it is detected that the compiler has support for "asm goto",
the kernel will compile trace point locations with just a
nop instruction. When trace points are enabled, the nop will
be converted to a jump to the trace function. This technique
lowers overhead and stress on the branch prediction of the
processor.
On i386, options added to the compiler flags may increase
the size of the kernel slightly.
config OPTPROBES
def_bool y
depends on KPROBES && HAVE_OPTPROBES
......
......@@ -55,6 +55,9 @@ config ZONE_DMA
bool
default y
config ARCH_DMA_ADDR_T_64BIT
def_bool y
config NEED_DMA_MAP_STATE
def_bool y
......
......@@ -247,7 +247,7 @@ struct el_MCPCIA_uncorrected_frame_mcheck {
#define vip volatile int __force *
#define vuip volatile unsigned int __force *
#ifdef MCPCIA_ONE_HAE_WINDOW
#ifndef MCPCIA_ONE_HAE_WINDOW
#define MCPCIA_FROB_MMIO \
if (__mcpcia_is_mmio(hose)) { \
set_hae(hose & 0xffffffff); \
......
#ifndef __ALPHA_T2__H__
#define __ALPHA_T2__H__
/* Fit everything into one 128MB HAE window. */
#define T2_ONE_HAE_WINDOW 1
#include <linux/types.h>
#include <linux/spinlock.h>
#include <asm/compiler.h>
......@@ -19,7 +22,7 @@
*
*/
#define T2_MEM_R1_MASK 0x07ffffff /* Mem sparse region 1 mask is 26 bits */
#define T2_MEM_R1_MASK 0x07ffffff /* Mem sparse region 1 mask is 27 bits */
/* GAMMA-SABLE is a SABLE with EV5-based CPUs */
/* All LYNX machines, EV4 or EV5, use the GAMMA bias also */
......@@ -85,7 +88,9 @@
#define T2_DIR (IDENT_ADDR + GAMMA_BIAS + 0x38e0004a0UL)
#define T2_ICE (IDENT_ADDR + GAMMA_BIAS + 0x38e0004c0UL)
#ifndef T2_ONE_HAE_WINDOW
#define T2_HAE_ADDRESS T2_HAE_1
#endif
/* T2 CSRs are in the non-cachable primary IO space from 3.8000.0000 to
3.8fff.ffff
......@@ -429,13 +434,15 @@ extern inline void t2_outl(u32 b, unsigned long addr)
*
*/
#ifdef T2_ONE_HAE_WINDOW
#define t2_set_hae
#else
#define t2_set_hae { \
msb = addr >> 27; \
unsigned long msb = addr >> 27; \
addr &= T2_MEM_R1_MASK; \
set_hae(msb); \
}
extern raw_spinlock_t t2_hae_lock;
#endif
/*
* NOTE: take T2_DENSE_MEM off in each readX/writeX routine, since
......@@ -446,28 +453,22 @@ extern raw_spinlock_t t2_hae_lock;
__EXTERN_INLINE u8 t2_readb(const volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long result, msb;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long result;
t2_set_hae;
result = *(vip) ((addr << 5) + T2_SPARSE_MEM + 0x00);
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
return __kernel_extbl(result, addr & 3);
}
__EXTERN_INLINE u16 t2_readw(const volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long result, msb;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long result;
t2_set_hae;
result = *(vuip) ((addr << 5) + T2_SPARSE_MEM + 0x08);
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
return __kernel_extwl(result, addr & 3);
}
......@@ -478,59 +479,47 @@ __EXTERN_INLINE u16 t2_readw(const volatile void __iomem *xaddr)
__EXTERN_INLINE u32 t2_readl(const volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long result, msb;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long result;
t2_set_hae;
result = *(vuip) ((addr << 5) + T2_SPARSE_MEM + 0x18);
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
return result & 0xffffffffUL;
}
__EXTERN_INLINE u64 t2_readq(const volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long r0, r1, work, msb;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long r0, r1, work;
t2_set_hae;
work = (addr << 5) + T2_SPARSE_MEM + 0x18;
r0 = *(vuip)(work);
r1 = *(vuip)(work + (4 << 5));
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
return r1 << 32 | r0;
}
__EXTERN_INLINE void t2_writeb(u8 b, volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long msb, w;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long w;
t2_set_hae;
w = __kernel_insbl(b, addr & 3);
*(vuip) ((addr << 5) + T2_SPARSE_MEM + 0x00) = w;
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
}
__EXTERN_INLINE void t2_writew(u16 b, volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long msb, w;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long w;
t2_set_hae;
w = __kernel_inswl(b, addr & 3);
*(vuip) ((addr << 5) + T2_SPARSE_MEM + 0x08) = w;
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
}
/*
......@@ -540,29 +529,22 @@ __EXTERN_INLINE void t2_writew(u16 b, volatile void __iomem *xaddr)
__EXTERN_INLINE void t2_writel(u32 b, volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long msb;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
t2_set_hae;
*(vuip) ((addr << 5) + T2_SPARSE_MEM + 0x18) = b;
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
}
__EXTERN_INLINE void t2_writeq(u64 b, volatile void __iomem *xaddr)
{
unsigned long addr = (unsigned long) xaddr - T2_DENSE_MEM;
unsigned long msb, work;
unsigned long flags;
raw_spin_lock_irqsave(&t2_hae_lock, flags);
unsigned long work;
t2_set_hae;
work = (addr << 5) + T2_SPARSE_MEM + 0x18;
*(vuip)work = b;
*(vuip)(work + (4 << 5)) = b >> 32;
raw_spin_unlock_irqrestore(&t2_hae_lock, flags);
}
__EXTERN_INLINE void __iomem *t2_ioportmap(unsigned long addr)
......
......@@ -318,9 +318,7 @@ extern inline pte_t * pte_offset_kernel(pmd_t * dir, unsigned long address)
}
#define pte_offset_map(dir,addr) pte_offset_kernel((dir),(addr))
#define pte_offset_map_nested(dir,addr) pte_offset_kernel((dir),(addr))
#define pte_unmap(pte) do { } while (0)
#define pte_unmap_nested(pte) do { } while (0)
extern pgd_t swapper_pg_dir[1024];
......
......@@ -74,8 +74,6 @@
# define DBG(args)
#endif
DEFINE_RAW_SPINLOCK(t2_hae_lock);
static volatile unsigned int t2_mcheck_any_expected;
static volatile unsigned int t2_mcheck_last_taken;
......@@ -406,6 +404,7 @@ void __init
t2_init_arch(void)
{
struct pci_controller *hose;
struct resource *hae_mem;
unsigned long temp;
unsigned int i;
......@@ -433,7 +432,13 @@ t2_init_arch(void)
*/
pci_isa_hose = hose = alloc_pci_controller();
hose->io_space = &ioport_resource;
hose->mem_space = &iomem_resource;
hae_mem = alloc_resource();
hae_mem->start = 0;
hae_mem->end = T2_MEM_R1_MASK;
hae_mem->name = pci_hae0_name;
if (request_resource(&iomem_resource, hae_mem) < 0)
printk(KERN_ERR "Failed to request HAE_MEM\n");
hose->mem_space = hae_mem;
hose->index = 0;
hose->sparse_mem_base = T2_SPARSE_MEM - IDENT_ADDR;
......
......@@ -25,6 +25,9 @@
#ifdef MCPCIA_ONE_HAE_WINDOW
#define MCPCIA_HAE_ADDRESS (&alpha_mv.hae_cache)
#endif
#ifdef T2_ONE_HAE_WINDOW
#define T2_HAE_ADDRESS (&alpha_mv.hae_cache)
#endif
/* Only a few systems don't define IACK_SC, handling all interrupts through
the SRM console. But splitting out that one case from IO() below
......
......@@ -223,7 +223,7 @@ iommu_arena_free(struct pci_iommu_arena *arena, long ofs, long n)
*/
static int pci_dac_dma_supported(struct pci_dev *dev, u64 mask)
{
dma64_addr_t dac_offset = alpha_mv.pci_dac_offset;
dma_addr_t dac_offset = alpha_mv.pci_dac_offset;
int ok = 1;
/* If this is not set, the machine doesn't support DAC at all. */
......@@ -756,7 +756,7 @@ static void alpha_pci_unmap_sg(struct device *dev, struct scatterlist *sg,
spin_lock_irqsave(&arena->lock, flags);
for (end = sg + nents; sg < end; ++sg) {
dma64_addr_t addr;
dma_addr_t addr;
size_t size;
long npages, ofs;
dma_addr_t tend;
......
......@@ -269,7 +269,8 @@ void ptrace_disable(struct task_struct *child)
user_disable_single_step(child);
}
long arch_ptrace(struct task_struct *child, long request, long addr, long data)
long arch_ptrace(struct task_struct *child, long request,
unsigned long addr, unsigned long data)
{
unsigned long tmp;
size_t copied;
......@@ -292,7 +293,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
case PTRACE_PEEKUSR:
force_successful_syscall_return();
ret = get_reg(child, addr);
DBG(DBG_MEM, ("peek $%ld->%#lx\n", addr, ret));
DBG(DBG_MEM, ("peek $%lu->%#lx\n", addr, ret));
break;
/* When I and D space are separate, this will have to be fixed. */
......@@ -302,7 +303,7 @@ long arch_ptrace(struct task_struct *child, long request, long addr, long data)
break;
case PTRACE_POKEUSR: /* write the specified register */
DBG(DBG_MEM, ("poke $%ld<-%#lx\n", addr, data));
DBG(DBG_MEM, ("poke $%lu<-%#lx\n", addr, data));
ret = put_reg(child, addr, data);
break;
default:
......
......@@ -573,6 +573,7 @@ config ARCH_TEGRA
select HAVE_CLK
select COMMON_CLKDEV
select ARCH_HAS_BARRIERS if CACHE_L2X0
select ARCH_HAS_CPUFREQ
help
This enables support for NVIDIA Tegra based systems (Tegra APX,
Tegra 6xx and Tegra 2 series).
......@@ -831,7 +832,7 @@ config ARCH_OMAP
select GENERIC_CLOCKEVENTS
select ARCH_HAS_HOLES_MEMORYMODEL
help
Support for TI's OMAP platform (OMAP1 and OMAP2).
Support for TI's OMAP platform (OMAP1/2/3/4).
config PLAT_SPEAR
bool "ST SPEAr"
......@@ -1222,7 +1223,7 @@ config SMP
See also <file:Documentation/i386/IO-APIC.txt>,
<file:Documentation/nmi_watchdog.txt> and the SMP-HOWTO available at
<http://www.linuxdoc.org/docs.html#howto>.
<http://tldp.org/HOWTO/SMP-HOWTO.html>.
If you don't know what to do here, say N.
......
......@@ -8,7 +8,7 @@
* published by the Free Software Foundation.
*
* Support functions for calculating clocks/divisors for the ICST307
* clock generators. See http://www.icst.com/ for more information
* clock generators. See http://www.idt.com/ for more information
* on these devices.
*
* This is an almost identical implementation to the ICST525 clock generator.
......
......@@ -44,12 +44,12 @@ void reset_scoop(struct device *dev)
{
struct scoop_dev *sdev = dev_get_drvdata(dev);
iowrite16(0x0100, sdev->base + SCOOP_MCR); // 00
iowrite16(0x0000, sdev->base + SCOOP_CDR); // 04
iowrite16(0x0000, sdev->base + SCOOP_CCR); // 10
iowrite16(0x0000, sdev->base + SCOOP_IMR); // 18
iowrite16(0x00FF, sdev->base + SCOOP_IRM); // 14
iowrite16(0x0000, sdev->base + SCOOP_ISR); // 1C
iowrite16(0x0100, sdev->base + SCOOP_MCR); /* 00 */
iowrite16(0x0000, sdev->base + SCOOP_CDR); /* 04 */
iowrite16(0x0000, sdev->base + SCOOP_CCR); /* 10 */
iowrite16(0x0000, sdev->base + SCOOP_IMR); /* 18 */
iowrite16(0x00FF, sdev->base + SCOOP_IRM); /* 14 */
iowrite16(0x0000, sdev->base + SCOOP_ISR); /* 1C */
iowrite16(0x0000, sdev->base + SCOOP_IRM);
}
......
......@@ -312,16 +312,16 @@ static void generate_ucode(u8 *ucode, u32 *gpr_a, u32 *gpr_b)
b1 = (gpr_a[i] >> 8) & 0xff;
b0 = gpr_a[i] & 0xff;
// immed[@ai, (b1 << 8) | b0]
// 11110000 0000VVVV VVVV11VV VVVVVV00 1IIIIIII
/* immed[@ai, (b1 << 8) | b0] */
/* 11110000 0000VVVV VVVV11VV VVVVVV00 1IIIIIII */
ucode[offset++] = 0xf0;
ucode[offset++] = (b1 >> 4);
ucode[offset++] = (b1 << 4) | 0x0c | (b0 >> 6);
ucode[offset++] = (b0 << 2);
ucode[offset++] = 0x80 | i;
// immed_w1[@ai, (b3 << 8) | b2]
// 11110100 0100VVVV VVVV11VV VVVVVV00 1IIIIIII
/* immed_w1[@ai, (b3 << 8) | b2] */
/* 11110100 0100VVVV VVVV11VV VVVVVV00 1IIIIIII */
ucode[offset++] = 0xf4;
ucode[offset++] = 0x40 | (b3 >> 4);
ucode[offset++] = (b3 << 4) | 0x0c | (b2 >> 6);
......@@ -340,16 +340,16 @@ static void generate_ucode(u8 *ucode, u32 *gpr_a, u32 *gpr_b)
b1 = (gpr_b[i] >> 8) & 0xff;
b0 = gpr_b[i] & 0xff;
// immed[@bi, (b1 << 8) | b0]
// 11110000 0000VVVV VVVV001I IIIIII11 VVVVVVVV
/* immed[@bi, (b1 << 8) | b0] */
/* 11110000 0000VVVV VVVV001I IIIIII11 VVVVVVVV */
ucode[offset++] = 0xf0;
ucode[offset++] = (b1 >> 4);
ucode[offset++] = (b1 << 4) | 0x02 | (i >> 6);
ucode[offset++] = (i << 2) | 0x03;
ucode[offset++] = b0;
// immed_w1[@bi, (b3 << 8) | b2]
// 11110100 0100VVVV VVVV001I IIIIII11 VVVVVVVV
/* immed_w1[@bi, (b3 << 8) | b2] */
/* 11110100 0100VVVV VVVV001I IIIIII11 VVVVVVVV */
ucode[offset++] = 0xf4;
ucode[offset++] = 0x40 | (b3 >> 4);
ucode[offset++] = (b3 << 4) | 0x02 | (i >> 6);
......@@ -357,7 +357,7 @@ static void generate_ucode(u8 *ucode, u32 *gpr_a, u32 *gpr_b)
ucode[offset++] = b2;
}
// ctx_arb[kill]
/* ctx_arb[kill] */
ucode[offset++] = 0xe0;
ucode[offset++] = 0x00;
ucode[offset++] = 0x01;
......
......@@ -17,6 +17,8 @@ CONFIG_MODVERSIONS=y
CONFIG_ARCH_DAVINCI=y
CONFIG_ARCH_DAVINCI_DA830=y
CONFIG_ARCH_DAVINCI_DA850=y
CONFIG_MACH_MITYOMAPL138=y
CONFIG_MACH_OMAPL138_HAWKBOARD=y
CONFIG_DAVINCI_RESET_CLOCKS=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
......@@ -79,6 +81,7 @@ CONFIG_I2C_DAVINCI=y
# CONFIG_HWMON is not set
CONFIG_WATCHDOG=y
CONFIG_REGULATOR=y
CONFIG_REGULATOR_DUMMY=y
CONFIG_REGULATOR_TPS6507X=y
CONFIG_FB=y
CONFIG_FB_DA8XX=y
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
......@@ -8,7 +8,7 @@
* published by the Free Software Foundation.
*
* Support functions for calculating clocks/divisors for the ICST
* clock generators. See http://www.icst.com/ for more information
* clock generators. See http://www.idt.com/ for more information
* on these devices.
*/
#ifndef ASMARM_HARDWARE_ICST_H
......
......@@ -35,9 +35,9 @@ extern void kunmap_high_l1_vipt(struct page *page, pte_t saved_pte);
#ifdef CONFIG_HIGHMEM
extern void *kmap(struct page *page);
extern void kunmap(struct page *page);
extern void *kmap_atomic(struct page *page, enum km_type type);
extern void kunmap_atomic_notypecheck(void *kvaddr, enum km_type type);
extern void *kmap_atomic_pfn(unsigned long pfn, enum km_type type);
extern void *__kmap_atomic(struct page *page);
extern void __kunmap_atomic(void *kvaddr);
extern void *kmap_atomic_pfn(unsigned long pfn);
extern struct page *kmap_atomic_to_page(const void *ptr);
#endif
......
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
此差异已折叠。
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册