提交 · 7d43c2e42cb1e436f97c1763150e4e1122ae0d57 · openanolis / cloud-kernel

25 6月, 2012 2 次提交

由 Alex Williamson 提交于 5月 30, 2012

The iommu=group_mf is really no longer needed with the addition of ACS
support in IOMMU drivers creating groups.  Most multifunction devices
will now be grouped already.  If a device has gone to the trouble of
exposing ACS, trust that it works.  We can use the device specific ACS
function for fixing devices we trust individually.  This largely
reverts bcb71abe.
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>

7d43c2e4

iommu: IOMMU Groups · d72e31c9

由 Alex Williamson 提交于 5月 30, 2012

IOMMU device groups are currently a rather vague associative notion
with assembly required by the user or user level driver provider to
do anything useful.  This patch intends to grow the IOMMU group concept
into something a bit more consumable.

To do this, we first create an object representing the group, struct
iommu_group.  This structure is allocated (iommu_group_alloc) and
filled (iommu_group_add_device) by the iommu driver.  The iommu driver
is free to add devices to the group using it's own set of policies.
This allows inclusion of devices based on physical hardware or topology
limitations of the platform, as well as soft requirements, such as
multi-function trust levels or peer-to-peer protection of the
interconnects.  Each device may only belong to a single iommu group,
which is linked from struct device.iommu_group.  IOMMU groups are
maintained using kobject reference counting, allowing for automatic
removal of empty, unreferenced groups.  It is the responsibility of
the iommu driver to remove devices from the group
(iommu_group_remove_device).

IOMMU groups also include a userspace representation in sysfs under
/sys/kernel/iommu_groups.  When allocated, each group is given a
dynamically assign ID (int).  The ID is managed by the core IOMMU group
code to support multiple heterogeneous iommu drivers, which could
potentially collide in group naming/numbering.  This also keeps group
IDs to small, easily managed values.  A directory is created under
/sys/kernel/iommu_groups for each group.  A further subdirectory named
"devices" contains links to each device within the group.  The iommu_group
file in the device's sysfs directory, which formerly contained a group
number when read, is now a link to the iommu group.  Example:

$ ls -l /sys/kernel/iommu_groups/26/devices/
total 0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:00:1e.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.0 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.0
lrwxrwxrwx. 1 root root 0 Apr 17 12:57 0000:06:0d.1 ->
		../../../../devices/pci0000:00/0000:00:1e.0/0000:06:0d.1

$ ls -l  /sys/kernel/iommu_groups/26/devices/*/iommu_group
[truncating perms/owner/timestamp]
/sys/kernel/iommu_groups/26/devices/0000:00:1e.0/iommu_group ->
					../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.0/iommu_group ->
					../../../../kernel/iommu_groups/26
/sys/kernel/iommu_groups/26/devices/0000:06:0d.1/iommu_group ->
					../../../../kernel/iommu_groups/26

Groups also include several exported functions for use by user level
driver providers, for example VFIO.  These include:

iommu_group_get(): Acquires a reference to a group from a device
iommu_group_put(): Releases reference
iommu_group_for_each_dev(): Iterates over group devices using callback
iommu_group_[un]register_notifier(): Allows notification of device add
        and remove operations relevant to the group
iommu_group_id(): Return the group number

This patch also extends the IOMMU API to allow attaching groups to
domains.  This is currently a simple wrapper for iterating through
devices within a group, but it's expected that the IOMMU API may
eventually make groups a more integral part of domains.

Groups intentionally do not try to manage group ownership.  A user
level driver provider must independently acquire ownership for each
device within a group before making use of the group as a whole.
This may change in the future if group usage becomes more pervasive
across both DMA and IOMMU ops.

Groups intentionally do not provide a mechanism for driver locking
or otherwise manipulating driver matching/probing of devices within
the group.  Such interfaces are generic to devices and beyond the
scope of IOMMU groups.  If implemented, user level providers have
ready access via iommu_group_for_each_dev and group notifiers.

iommu_device_group() is removed here as it has no users.  The
replacement is:

	group = iommu_group_get(dev);
	id = iommu_group_id(group);
	iommu_group_put(group);

AMD-Vi & Intel VT-d support re-added in following patches.
Signed-off-by: NAlex Williamson <alex.williamson@redhat.com>
Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>

d72e31c9

07 6月, 2012 1 次提交

stmmac: update driver's doc · 3d237714

由 Giuseppe CAVALLARO 提交于 6月 04, 2012

Fixed the driver's documentation that was obsolete and didn't
report new platform fields (recently added).
Signed-off-by: NGiuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

3d237714

04 6月, 2012 1 次提交

i2c: Add generic I2C multiplexer using pinctrl API · ae58d1e4

由 Stephen Warren 提交于 5月 18, 2012

This is useful for SoCs whose I2C module's signals can be routed to
different sets of pins at run-time, using the pinctrl API.

                                 +-----+  +-----+
                                 | dev |  | dev |
    +------------------------+   +-----+  +-----+
    | SoC                    |      |        |
    |                   /----|------+--------+
    |   +---+   +------+     | child bus A, on first set of pins
    |   |I2C|---|Pinmux|     |
    |   +---+   +------+     | child bus B, on second set of pins
    |                   \----|------+--------+--------+
    |                        |      |        |        |
    +------------------------+  +-----+  +-----+  +-----+
                                | dev |  | dev |  | dev |
                                +-----+  +-----+  +-----+
Signed-off-by: NStephen Warren <swarren@nvidia.com>
Acked-by: NLinus Walleij <linus.walleij@linaro.org>
Acked-by: NRob Herring <rob.herring@calxeda.com>
Signed-off-by: NWolfram Sang <w.sang@pengutronix.de>

ae58d1e4

03 6月, 2012 1 次提交

dm thin: provide userspace access to pool metadata · cc8394d8

由 Joe Thornber 提交于 6月 03, 2012

This patch implements two new messages that can be sent to the thin
pool target allowing it to take a snapshot of the _metadata_.  This,
read-only snapshot can be accessed by userland, concurrently with the
live target.

Only one metadata snapshot can be held at a time.  The pool's status
line will give the block location for the current msnap.

Since version 0.1.5 of the userland thin provisioning tools, the
thin_dump program displays the msnap as follows:

    thin_dump -m <msnap root> <metadata dev>

Available here: https://github.com/jthornber/thin-provisioning-tools

Now that userland can access the metadata we can do various things
that have traditionally been kernel side tasks:

     i) Incremental backups.

     By using metadata snapshots we can work out what blocks have
     changed over time.  Combined with data snapshots we can ensure
     the data doesn't change while we back it up.

     A short proof of concept script can be found here:

     https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb

     ii) Migration of thin devices from one pool to another.

     iii) Merging snapshots back into an external origin.

     iv) Asyncronous replication.
Signed-off-by: NJoe Thornber <ejt@redhat.com>
Signed-off-by: NAlasdair G Kergon <agk@redhat.com>

cc8394d8

02 6月, 2012 2 次提交

x86, efi: Add EFI boot stub documentation · 0c759662

由 Matt Fleming 提交于 3月 16, 2012

Since we can't expect every user to read the EFI boot stub code it
seems prudent to have a couple of paragraphs explaining what it is and
how it works.

The "initrd=" option in particular is tricky because it only
understands absolute EFI-style paths (backslashes as directory
separators), and until now this hasn't been documented anywhere. This
has tripped up a couple of users.

Cc: Matthew Garrett <mjg@redhat.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NMatt Fleming <matt.fleming@intel.com>
Link: http://lkml.kernel.org/r/1331907517-3985-4-git-send-email-matt@console-pimps.orgSigned-off-by: NH. Peter Anvin <hpa@zytor.com>

0c759662

fs: introduce inode operation ->update_time · c3b2da31

由 Josef Bacik 提交于 3月 26, 2012

Btrfs has to make sure we have space to allocate new blocks in order to modify
the inode, so updating time can fail.  We've gotten around this by having our
own file_update_time but this is kind of a pain, and Christoph has indicated he
would like to make xfs do something different with atime updates.  So introduce
->update_time, where we will deal with i_version an a/m/c time updates and
indicate which changes need to be made.  The normal version just does what it
has always done, updates the time and marks the inode dirty, and then
filesystems can choose to do something different.

I've gone through all of the users of file_update_time and made them check for
errors with the exception of the fault code since it's complicated and I wasn't
quite sure what to do there, also Jan is going to be pushing the file time
updates into page_mkwrite for those who have it so that should satisfy btrfs and
make it not a big deal to check the file_update_time() return code in the
generic fault path. Thanks,
Signed-off-by: NJosef Bacik <josef@redhat.com>

c3b2da31

01 6月, 2012 5 次提交

c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat · 5b172087

由 Cyrill Gorcunov 提交于 5月 31, 2012

We would like to have an ability to restore command line arguments and
program environment pointers but first we need to obtain them somehow.
Thus we put these values into /proc/$pid/stat.  The exit_code is needed to
restore zombie tasks.
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Acked-by: NKees Cook <keescook@chromium.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5b172087

fs, proc: introduce /proc/<pid>/task/<tid>/children entry · 81841161

由 Cyrill Gorcunov 提交于 5月 31, 2012

When we do checkpoint of a task we need to know the list of children the
task, has but there is no easy and fast way to generate reverse
parent->children chain from arbitrary <pid> (while a parent pid is
provided in "PPid" field of /proc/<pid>/status).

So instead of walking over all pids in the system (creating one big
process tree in memory, just to figure out which children a task has) --
we add explicit /proc/<pid>/task/<tid>/children entry, because the kernel
already has this kind of information but it is not yet exported.

This is a first level children, not the whole process tree.
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: NOleg Nesterov <oleg@redhat.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

81841161

mqueue: separate mqueue default value from maximum value · cef0184c

由 KOSAKI Motohiro 提交于 5月 31, 2012

Commit b231cca4 ("message queues: increase range limits") changed
mqueue default value when attr parameter is specified NULL from hard
coded value to fs.mqueue.{msg,msgsize}_max sysctl value.

This made large side effect.  When user need to use two mqueue
applications 1) using !NULL attr parameter and it require big message
size and 2) using NULL attr parameter and only need small size message,
app (1) require to raise fs.mqueue.msgsize_max and app (2) consume large
memory size even though it doesn't need.

Doug Ledford propsed to switch back it to static hard coded value.
However it also has a compatibility problem.  Some applications might
started depend on the default value is tunable.

The solution is to separate default value from maximum value.
Signed-off-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NDoug Ledford <dledford@redhat.com>
Acked-by: NDoug Ledford <dledford@redhat.com>
Acked-by: NJoe Korty <joe.korty@ccur.com>
Cc: Amerigo Wang <amwang@redhat.com>
Acked-by: NSerge E. Hallyn <serue@us.ibm.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cef0184c

proc: report file/anon bit in /proc/pid/pagemap · 052fb0d6

由 Konstantin Khlebnikov 提交于 5月 31, 2012

This is an implementation of Andrew's proposal to extend the pagemap file
bits to report what is missing about tasks' working set.

The problem with the working set detection is multilateral.  In the criu
(checkpoint/restore) project we dump the tasks' memory into image files
and to do it properly we need to detect which pages inside mappings are
really in use.  The mincore syscall I though could help with this did not.
 First, it doesn't report swapped pages, thus we cannot find out which
parts of anonymous mappings to dump.  Next, it does report pages from page
cache as present even if they are not mapped, and it doesn't make that has
not been cow-ed.

Note, that issue with swap pages is critical -- we must dump swap pages to
image file.  But the issues with file pages are optimization -- we can
take all file pages to image, this would be correct, but if we know that a
page is not mapped or not cow-ed, we can remove them from dump file.  The
dump would still be self-consistent, though significantly smaller in size
(up to 10 times smaller on real apps).

Andrew noticed, that the proc pagemap file solved 2 of 3 above issues --
it reports whether a page is present or swapped and it doesn't report not
mapped page cache pages.  But, it doesn't distinguish cow-ed file pages
from not cow-ed.

I would like to make the last unused bit in this file to report whether the
page mapped into respective pte is PageAnon or not.

[comment stolen from Pavel Emelyanov's v1 patch]
Signed-off-by: NKonstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

052fb0d6

CodingStyle: add kmalloc_array() to memory allocators · 15837294

由 Xi Wang 提交于 5月 31, 2012

Add the new kmalloc_array() to the list of general-purpose memory
allocators in chapter 14.
Signed-off-by: NXi Wang <xi.wang@gmail.com>
Acked-by: NJesper Juhl <jj@chaosbits.net>
Acked-by: NPekka Enberg <penberg@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

15837294

31 5月, 2012 1 次提交

mtip32xx: Changes to sysfs entries · b77874c9

由 Asai Thambi S P 提交于 5月 29, 2012

* Formatted the output of 'registers' entry
* Added "Commands in Q' to output of 'registers' entry
* Added a new entry 'flags'
Signed-off-by: NAsai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: NJens Axboe <axboe@kernel.dk>

b77874c9

30 5月, 2012 18 次提交

Documentation: kernel-parameters.txt Add amd_iommu_dump · c099cf17

由 Shuah Khan 提交于 5月 24, 2012

Add amd_iommu_dump to kernel-parameters.txt
Signed-off-by: NShuah Khan <shuahkhan@gmail.com>
Signed-off-by: NJoerg Roedel <joerg.roedel@amd.com>

c099cf17

i2c: Split I2C_M_NOSTART support out of I2C_FUNC_PROTOCOL_MANGLING · 14674e70

由 Mark Brown 提交于 5月 30, 2012

Since there are uses for I2C_M_NOSTART which are much more sensible and
standard than most of the protocol mangling functionality (the main one
being gather writes to devices where something like a register address
needs to be inserted before a block of data) create a new I2C_FUNC_NOSTART
for this feature and update all the users to use it.

Also strengthen the disrecommendation of the protocol mangling while we're
at it.

In the case of regmap-i2c we remove the requirement for mangling as
I2C_M_NOSTART is the only mangling feature which is being used.
Signed-off-by: NMark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: NWolfram Sang <w.sang@pengutronix.de>
Signed-off-by: NJean Delvare <khali@linux-fr.org>

14674e70

Watchdog: DA9052/53 PMIC watchdog support · 664a0d78

由 Ashish Jangam 提交于 5月 24, 2012

This driver adds support for the watchdog functionality provided by
the Dialog Semiconductor DA9052 PMIC chip.

Tested on samsung smdkv6410 and i.mx53 QS boards.
Signed-off-by: NAnthony Olech <Anthony.Olech@diasemi.com>
Signed-off-by: NAshish Jangam <ashish.jangam@kpitcummins.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

664a0d78

watchdog: Add support for dynamically allocated watchdog_device structs · e907df32

由 Hans de Goede 提交于 5月 22, 2012

If a driver's watchdog_device struct is part of a dynamically allocated
struct (which it often will be), merely locking the module is not enough,
even with a drivers module locked, the driver can be unbound from the device,
examples:
1) The root user can unbind it through sysfd
2) The i2c bus master driver being unloaded for an i2c watchdog

I will gladly admit that these are corner cases, but we still need to handle
them correctly.

The fix for this consists of 2 parts:
1) Add ref / unref operations, so that the driver can refcount the struct
   holding the watchdog_device struct and delay freeing it until any
   open filehandles referring to it are closed
2) Most driver operations will do IO on the device and the driver should not
   do any IO on the device after it has been unbound. Rather then letting each
   driver deal with this internally, it is better to ensure at the watchdog
   core level that no operations (other then unref) will get called after
   the driver has called watchdog_unregister_device(). This actually is the
   bulk of this patch.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

e907df32

watchdog: Add Locking support · f4e9c82f

由 Hans de Goede 提交于 5月 22, 2012

This patch fixes some potential multithreading issues, despite only
allowing one process to open the /dev/watchdog device, we can still get
called multiple times at the same time, since a program could be using thread,
or could share the fd after a fork.

This causes 2 potential problems:
1) watchdog_start / open do an unlocked test_n_set / test_n_clear,
   if these 2 race, the watchdog could be stopped while the active
   bit indicates it is running or visa versa.

2) Most watchdog_dev drivers probably assume that only one
   watchdog-op will get called at a time, this is not necessary
   true atm.
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

f4e9c82f

watchdog: create all the proper device files · d6b469d9

由 Alan Cox 提交于 5月 11, 2012

Create the watchdog class and it's associated devices.
Signed-off-by: NAlan Cox <alan@linux.intel.com>
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

d6b469d9

watchdog: Add multiple device support · 45f5fed3

由 Alan Cox 提交于 5月 10, 2012

We keep the old /dev/watchdog interface file for the first watchdog via
miscdev. This is basically a cut and paste of the relevant interface code
from the rtc driver layer tweaked for watchdog.

Revised to fix problems noted by Hans de Goede
Signed-off-by: NAlan Cox <alan@linux.intel.com>
Signed-off-by: NHans de Goede <hdegoede@redhat.com>
Signed-off-by: NTomas Winkler <tomas.winkler@intel.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

45f5fed3

drivers/rtc/rtc-lpc32xx.c: add device tree support · e862e7c4

由 Roland Stigge 提交于 5月 29, 2012

Adds device tree support for rtc-lpc32xx.c
Signed-off-by: NRoland Stigge <stigge@antcom.de>
Acked-by: NRob Herring <rob.herring@calxeda.com>
Acked-by: NArnd Bergmann <arnd@arndb.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e862e7c4

rtc/spear: add Device Tree probing capability · 0108c4ff

由 Viresh Kumar 提交于 5月 29, 2012

SPEAr platforms now support DT and so must convert all drivers support DT.
This patch adds DT probing support for rtc and updates its documentation
too.
Signed-off-by: NViresh Kumar <viresh.kumar@st.com>
Cc: Stefan Roese <sr@denx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Rajeev Kumar <rajeev-dlh.kumar@st.com>
Cc: Rob Herring <robherring2@gmail.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

0108c4ff

leds: add LM3533 LED driver · 401dea7f

由 Johan Hovold 提交于 5月 29, 2012

Add sub-driver for the LEDs on National Semiconductor / TI LM3533 lighting
power chips.

The chip provides 256 brightness levels, hardware accelerated blinking as
well as ambient-light-sensor and pwm input control.
Signed-off-by: NJohan Hovold <jhovold@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Rob Landley <rob@landley.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

401dea7f

leds: add new transient trigger for one shot timer activation · 44e1e9f8

由 Shuah Khan 提交于 5月 29, 2012

The leds timer trigger does not currently have an interface to activate a
one shot timer.  The current support allows for setting two timers, one
for specifying how long a state to be on, and the second for how long the
state to be off.  The delay_on value specifies the time period an LED
should stay in on state, followed by a delay_off value that specifies how
long the LED should stay in off state.  The on and off cycle repeats until
the trigger gets deactivated.  There is no provision for one time
activation to implement features that require an on or off state to be
held just once and then stay in the original state forever.

Without one shot timer interface, user space can still use timer trigger
to set a timer to hold a state, however when user space application
crashes or goes away without deactivating the timer, the hardware will be
left in that state permanently.

As a specific example of this use-case, let's look at vibrate feature on
phones.  Vibrate function on phones is implemented using PWM pins on SoC
or PMIC.  There is a need to activate one shot timer to control the
vibrate feature, to prevent user space crashes leaving the phone in
vibrate mode permanently causing the battery to drain.

This trigger exports three properties, activate, state, and duration When
transient trigger is activated these properties are set to default values.

- duration allows setting timer value in msecs. The initial value is 0.
- activate allows activating and deactivating the timer specified by
  duration as needed. The initial and default value is 0.  This will allow
  duration to be set after trigger activation.
- state allows user to specify a transient state to be held for the specified
  duration.
Signed-off-by: NShuah Khan <shuahkhan@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: NeilBrown <neilb@suse.de>
Cc: Bryan Wu <bryan.wu@canonical.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

44e1e9f8

backlight: add LM3533 backlight driver · 7f26c970

由 Johan Hovold 提交于 5月 29, 2012

Add sub-driver for the backlights on National Semiconductor / TI LM3533
lighting power chips.

The chip provides 256 brightness levels and ambient-light-sensor and pwm
input control.

[akpm@linux-foundation.org: fix warning]
[akpm@linux-foundation.org: fix the type of `mode']
Signed-off-by: NJohan Hovold <jhovold@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Rob Landley <rob@landley.net>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7f26c970

memcg: move charges to root cgroup if use_hierarchy=0 · cc926f78

由 KAMEZAWA Hiroyuki 提交于 5月 29, 2012

Presently, at removal of cgroup, ->pre_destroy() is called and moves
charges to the parent cgroup.  A major reason for returning -EBUSY from
->pre_destroy() is that the 'moving' hits the parent's resource
limitation.  It happens only when use_hierarchy=0.

Considering use_hierarchy=0, all cgroups should be flat.  So, no one
cannot justify moving charges to parent...parent and children are in flat
configuration, not hierarchical.

This patch modifes the code to move charges to the root cgroup at
rmdir/force_empty if use_hierarchy==0.  This will much simplify rmdir()
and reduce error in ->pre_destroy.
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ying Han <yinghan@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Reviewed-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

cc926f78

rescounters: add res_counter_uncharge_until() · 2bb2ba9d

由 Frederic Weisbecker 提交于 5月 29, 2012

When killing a res_counter which is a child of other counter, we need to
do

	res_counter_uncharge(child, xxx)
	res_counter_charge(parent, xxx)

This is not atomic and wastes CPU.  This patch adds
res_counter_uncharge_until().  This function's uncharge propagates to
ancestors until specified res_counter.

	res_counter_uncharge_until(child, parent, xxx)

Now the operation is atomic and efficient.
Signed-off-by: NFrederic Weisbecker <fweisbec@redhat.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ying Han <yinghan@google.com>
Cc: Glauber Costa <glommer@parallels.com>
Reviewed-by: NTejun Heo <tj@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2bb2ba9d

memcg: fix/change behavior of shared anon at moving task · 4b91355e

由 KAMEZAWA Hiroyuki 提交于 5月 29, 2012

This patch changes memcg's behavior at task_move().

At task_move(), the kernel scans a task's page table and move the changes
for mapped pages from source cgroup to target cgroup.  There has been a
bug at handling shared anonymous pages for a long time.

Before patch:
  - The spec says 'shared anonymous pages are not moved.'
  - The implementation was 'shared anonymoys pages may be moved'.
    If page_mapcount <=2, shared anonymous pages's charge were moved.

After patch:
  - The spec says 'all anonymous pages are moved'.
  - The implementation is 'all anonymous pages are moved'.

Considering usage of memcg, this will not affect user's experience.
'shared anonymous' pages only exists between a tree of processes which
don't do exec().  Moving one of process without exec() seems not sane.
For example, libcgroup will not be affected by this change.  (Anyway, no
one noticed the implementation for a long time...)

Below is a discussion log:

 - current spec/implementation are complex
 - Now, shared file caches are moved
 - It adds unclear check as page_mapcount(). To do correct check,
   we should check swap users, etc.
 - No one notice this implementation behavior. So, no one get benefit
   from the design.
 - In general, once task is moved to a cgroup for running, it will not
   be moved....
 - Finally, we have control knob as memory.move_charge_at_immigrate.

Here is a patch to allow moving shared pages, completely. This makes
memcg simpler and fix current broken code.
Suggested-by: NHugh Dickins <hughd@google.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4b91355e

mm: document the meminfo and vmstat fields of relevance to transparent hugepages · 69256994

由 Mel Gorman 提交于 5月 29, 2012

Update Documentation/vm/transhuge.txt and
Documentation/filesystems/proc.txt with some information on monitoring
transparent huge page usage and the associated overhead.
Signed-off-by: NMel Gorman <mgorman@suse.de>
Signed-off-by: NAndrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

69256994

mm/fs: remove truncate_range · 17cf28af

由 Hugh Dickins 提交于 5月 29, 2012

Remove vmtruncate_range(), and remove the truncate_range method from
struct inode_operations: only tmpfs ever supported it, and tmpfs has now
converted over to using the fallocate method of file_operations.

Update Documentation accordingly, adding (setlease and) fallocate lines.
And while we're in mm.h, remove duplicate declarations of shmem_lock() and
shmem_file_setup(): everyone is now using the ones in shmem_fs.h.
Based-on-patch-by: NCong Wang <amwang@redhat.com>
Signed-off-by: NHugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Cong Wang <amwang@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

17cf28af

Documentation: memcg: future proof hierarchical statistics documentation · eb6332a5

由 Johannes Weiner 提交于 5月 29, 2012

The hierarchical versions of per-memcg counters in memory.stat are all
calculated the same way and are all named total_<counter>.

Documenting the pattern is easier for maintenance than listing each
counter twice.
Signed-off-by: NJohannes Weiner <hannes@cmpxchg.org>
Acked-by: NMichal Hocko <mhocko@suse.cz>
Acked-by: NKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: NYing Han <yinghan@google.com>
Randy Dunlap <rdunlap@xenotime.net>
Acked-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

eb6332a5

25 5月, 2012 8 次提交

doc: ext3: update documentation with barrier=1 default · 21848d08

由 Stefan Hajnoczi 提交于 3月 28, 2012

Commit 00eacd66 ("ext3: make ext3 mount default to barrier=1") changed
the default barrier mount option for ext3.  The documentation needs to
be updated, so this patch does that.
Signed-off-by: NStefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Acked-by: NRob Landley <rob@landley.net>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

21848d08

Documentation/initrd.txt: Change the location of util-linux · 76dab97c

由 Marcos Paulo de Souza 提交于 4月 12, 2012

The address of util-linux seems deprecated. The new util-linux location is
in the kernel.org. So, change this for the correct address.
Signed-off-by: NMarcos Paulo de Souza <marcos.souza.org@gmail.com>
Acked-by: NRob Landley <rob@landley.net>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

76dab97c

Documentation/SubmittingPatches: suggested the use of scripts/get_maintainer.pl · e52d2e1f

由 Michel Machado 提交于 4月 02, 2012

Had I found a reference to scripts/get_maintainer.pl when I first read
Documentation/SubmittingPatches, it would've saved me some time.
Signed-off-by: NMichel Machado <michel@digirati.com.br>
Acked-by: NRob Landley <rob@landley.net>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

e52d2e1f

Documentation/kernel-parameters: remove autotest and mcatest · 9b170dbd

由 Sebastian Andrzej Siewior 提交于 3月 26, 2012

It has no more users, the last one is gone in "[PATCH] ia64: Kconfig
cleanup" aka ("6fd79ab50b").
mcatest is gone in commit "[PATCH] ia64: SGI SN update"
("c6bacd5010ec").

Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: NSebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: NRob Landley <rob@landley.net>
Signed-off-by: NJiri Kosina <jkosina@suse.cz>

9b170dbd

dma-buf: add initial vmap documentation · b25b086d

由 Dave Airlie 提交于 5月 22, 2012

Signed-off-by: NDave Airlie <airlied@redhat.com>
Signed-off-by: NSumit Semwal <sumit.semwal@linaro.org>

b25b086d

dma-buf: mmap support · 4c78513e

由 Daniel Vetter 提交于 4月 24, 2012

Compared to Rob Clark's RFC I've ditched the prepare/finish hooks
and corresponding ioctls on the dma_buf file. The major reason for
that is that many people seem to be under the impression that this is
also for synchronization with outstanding asynchronous processsing.
I'm pretty massively opposed to this because:

- It boils down reinventing a new rather general-purpose userspace
  synchronization interface. If we look at things like futexes, this
  is hard to get right.
- Furthermore a lot of kernel code has to interact with this
  synchronization primitive. This smells a look like the dri1 hw_lock,
  a horror show I prefer not to reinvent.
- Even more fun is that multiple different subsystems would interact
  here, so we have plenty of opportunities to create funny deadlock
  scenarios.

I think synchronization is a wholesale different problem from data
sharing and should be tackled as an orthogonal problem.

Now we could demand that prepare/finish may only ensure cache
coherency (as Rob intended), but that runs up into the next problem:
We not only need mmap support to facilitate sw-only processing nodes
in a pipeline (without jumping through hoops by importing the dma_buf
into some sw-access only importer), which allows for a nicer
ION->dma-buf upgrade path for existing Android userspace. We also need
mmap support for existing importing subsystems to support existing
userspace libraries. And a loot of these subsystems are expected to
export coherent userspace mappings.

So prepare/finish can only ever be optional and the exporter /needs/
to support coherent mappings. Given that mmap access is always
somewhat fallback-y in nature I've decided to drop this optimization,
instead of just making it optional. If we demonstrate a clear need for
this, supported by benchmark results, we can always add it in again
later as an optional extension.

Other differences compared to Rob's RFC is the above mentioned support
for mapping a dma-buf through facilities provided by the importer.
Which results in mmap support no longer being optional.

Note that this dma-buf mmap patch does _not_ support every possible
insanity an existing subsystem could pull of with mmap: Because it
does not allow to intercept pagefaults and shoot down ptes importing
subsystems can't add some magic of their own at these points (e.g. to
automatically synchronize with outstanding rendering or set up some
special resources). I've done a cursory read through a few mmap
implementions of various subsytems and I'm hopeful that we can avoid
this (and the complexity it'd bring with it).

Additonally I've extended the documentation a bit to explain the hows
and whys of this mmap extension.

In case we ever want to add support for explicitly cache maneged
userspace mmap with a prepare/finish ioctl pair, we could specify that
userspace needs to mmap a different part of the dma_buf, e.g. the
range starting at dma_buf->size up to dma_buf->size*2. This works
because the size of a dma_buf is invariant over it's lifetime. The
exporter would obviously need to fall back to coherent mappings for
both ranges if a legacy clients maps the coherent range and the
architecture cannot suppor conflicting caching policies. Also, this
would obviously be optional and userspace needs to be able to fall
back to coherent mappings.

v2:
- Spelling fixes from Rob Clark.
- Compile fix for !DMA_BUF from Rob Clark.
- Extend commit message to explain how explicitly cache managed mmap
  support could be added later.
- Extend the documentation with implementations notes for exporters
  that need to manually fake coherency.

v3:
- dma_buf pointer initialization goof-up noticed by Rebecca Schultz
  Zavin.

Cc: Rob Clark <rob.clark@linaro.org>
Cc: Rebecca Schultz Zavin <rebecca@android.com>
Acked-by: NRob Clark <rob.clark@linaro.org>
Signed-Off-by: NDaniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: NSumit Semwal <sumit.semwal@linaro.org>

4c78513e

tick: Add tick skew boot option · 5307c955

由 Mike Galbraith 提交于 5月 08, 2012

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277:

"Historically, Linux has tried to make the regular timer tick on the
 various CPUs not happen at the same time, to avoid contention on
 xtime_lock.
    
 Nowadays, with the tickless kernel, this contention no longer happens
 since time keeping and updating are done differently. In addition,
 this skew is actually hurting power consumption in a measurable way on
 many-core systems."

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.
Signed-off-by: NMike Galbraith <mgalbraith@suse.de>
Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.netSigned-off-by: NThomas Gleixner <tglx@linutronix.de>

5307c955

net/wanrouter: Deprecate and schedule for removal · f0d1b3c2

由 Joe Perches 提交于 5月 24, 2012

No one uses this on current kernels anymore.

Let it be known it's going to be removed eventually.
Signed-off-by: NJoe Perches <joe@perches.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

f0d1b3c2

23 5月, 2012 1 次提交

Documentation/watchdog: Fix the file descriptor leak when no cmdline arg given · cad19fa6

由 Devendra Naga 提交于 5月 17, 2012

we start a infinite loop when user gives ./watchdog-test, and when user
ctrl + c's the program, we just exit immeadiately with out closing the
filedescriptor of the watchdog device. a signal handler is used to
do the job of closing the filedescriptor and exiting the program.
Signed-off-by: NDevendra Naga <devendra.aaru@gmail.com>
Signed-off-by: NWim Van Sebroeck <wim@iguana.be>

cad19fa6

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功