1. 18 Jan 2014, 1 commit
  2. 16 Jan 2014, 1 commit
  3. 15 Jan 2014, 5 commits
  4. 14 Jan 2014, 11 commits
  5. 13 Jan 2014, 11 commits
    • P
      sched/clock, x86: Use a static_key for sched_clock_stable · 35af99e6
      Committed by Peter Zijlstra
      In order to avoid the runtime condition and variable load, turn
      sched_clock_stable into a static_key.
      
      Also provide a shorter implementation of local_clock() and
      cpu_clock(int) when sched_clock_stable==1.
      
                              MAINLINE   PRE       POST
      
          sched_clock_stable: 1          1         1
          (cold) sched_clock: 329841     221876    215295
          (cold) local_clock: 301773     234692    220773
          (warm) sched_clock: 38375      25602     25659
          (warm) local_clock: 100371     33265     27242
          (warm) rdtsc:       27340      24214     24208
          sched_clock_stable: 0          0         0
          (cold) sched_clock: 382634     235941    237019
          (cold) local_clock: 396890     297017    294819
          (warm) sched_clock: 38194      25233     25609
          (warm) local_clock: 143452     71234     71232
          (warm) rdtsc:       27345      24245     24243
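
      As an aside for readers: the static_key pattern this patch applies
      looks roughly like the sketch below (the key and helper names are
      ours, not the patch's actual symbols):

        #include <linux/jump_label.h>

        /* A default-false key: the branch compiles to straight-line code
         * with no flag load; flipping it patches the branch sites. */
        static struct static_key clock_unstable_key = STATIC_KEY_INIT_FALSE;

        static inline bool sched_clock_stable_sketch(void)
        {
                return !static_key_false(&clock_unstable_key);
        }

        static void mark_clock_unstable_sketch(void)
        {
                /* Slow (patches text), but runs only on transitions. */
                static_key_slow_inc(&clock_unstable_key);
        }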
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      35af99e6
    • P
      sched/preempt: Take away preempt_enable_no_resched() from modules · 62b94a08
      Committed by Peter Zijlstra
      Discourage drivers/modules from being creative with preemption.
      
      Sadly it is all implemented in macros and inlines, so if they want to
      do evil they still can, but at least try to discourage some.
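
      A plausible shape for the discouragement, assuming it lives in the
      preempt header (the exact macro list is illustrative):

        /* Core code keeps the macros; modules lose them. */
        #ifdef MODULE
        /*
         * Modules have no business playing preemption tricks.
         */
        #undef sched_preempt_enable_no_resched
        #undef preempt_enable_no_resched
        #undef preempt_enable_no_resched_notrace
        #undef preempt_check_resched
        #endif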
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com>
      Cc: rui.zhang@intel.com
      Cc: jacob.jun.pan@linux.intel.com
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: hpa@zytor.com
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: lenb@kernel.org
      Cc: rjw@rjwysocki.net
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/n/tip-fn7h6vu8wtgxk0ih402qcijx@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      62b94a08
    • P
      locking: Optimize lock_bh functions · 9ea4c380
      Committed by Peter Zijlstra
      Currently all _bh_ lock functions do two preempt_count operations:
      
        local_bh_disable();
        preempt_disable();
      
      and for the unlock:
      
        preempt_enable_no_resched();
        local_bh_enable();
      
      Since it's a waste of perfectly good cycles to modify the same variable
      twice when you can do it in one go, use the new
      __local_bh_{dis,en}able_ip() functions that allow us to provide a
      preempt_count value to add/sub.
      
      So define SOFTIRQ_LOCK_OFFSET as the offset a _bh_ lock needs to
      add/sub to be done in one go.
      
      As a bonus it gets rid of the preempt_enable_no_resched() usage.
      
      This reduces a 1000 loops of:
      
        spin_lock_bh(&bh_lock);
        spin_unlock_bh(&bh_lock);
      
      from 53596 cycles to 51995 cycles. I didn't do enough measurements to
      say for sure that the result is significant, but the few runs I did
      for each suggest it is.
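
      A sketch of the combined form described above (simplified, not the
      exact kernel code; lockdep and contention hooks omitted):

        #include <linux/kernel.h>
        #include <linux/preempt.h>
        #include <linux/spinlock.h>

        static inline void spin_lock_bh_sketch(raw_spinlock_t *lock)
        {
                /* One preempt_count add covers softirq-disable plus
                 * the lock's preempt-disable part. */
                __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
                do_raw_spin_lock(lock);
        }

        static inline void spin_unlock_bh_sketch(raw_spinlock_t *lock)
        {
                do_raw_spin_unlock(lock);
                /* Single sub; no preempt_enable_no_resched() needed. */
                __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET);
        }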
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: jacob.jun.pan@linux.intel.com
      Cc: Mike Galbraith <bitbucket@online.de>
      Cc: hpa@zytor.com
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: lenb@kernel.org
      Cc: rjw@rjwysocki.net
      Cc: rui.zhang@intel.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9ea4c380
    • P
      sched/deadline: Remove the sysctl_sched_dl knobs · 1724813d
      Committed by Peter Zijlstra
      Remove the deadline-specific sysctls for now. The problem with them is
      that the interaction with the existing rt knobs is nearly impossible
      to get right.
      
      The current (as per before this patch) situation is that the rt and dl
      bandwidths are completely separate and we enforce rt+dl < 100%. This is
      undesirable because the rt default of 95% leaves us hardly any room,
      even though dl tasks are safer than rt tasks.
      
      Another proposed solution (a discarded patch) was to have the dl
      bandwidth be a fraction of the rt bandwidth. This is highly
      confusing, IMO.
      
      Furthermore, neither proposal is consistent with the situation we
      actually want: rt tasks run from a dl server, in which case
      the rt bandwidth is a direct subset of dl.
      
      So whichever way we go, the introduction of dl controls at this point
      is painful. Therefore remove them and instead share the rt budget.
      
      This means that for now the rt knobs are used for dl admission control
      and the dl runtime is accounted against the rt runtime. I realise that
      this isn't entirely desirable either; but whatever we do we appear to
      need to change the interface later, so better have a small interface
      for now.
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-zpyqbqds1r0vyxtxza1e7rdc@git.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1724813d
    • D
      sched/deadline: Add bandwidth management for SCHED_DEADLINE tasks · 332ac17e
      Committed by Dario Faggioli
      In order for deadline scheduling to be effective and useful, it is
      important to have some method of keeping the allocation of the
      available CPU bandwidth to tasks and task groups under control.
      This is usually called "admission control", and if it is not
      performed at all, no guarantee can be given on the actual scheduling
      of the -deadline tasks.
      
      Since RT-throttling was introduced, each task group has had a
      bandwidth associated with it, calculated as a certain amount of
      runtime over a period. Moreover, to make it possible to manipulate
      such bandwidth, readable/writable controls have been added to both
      procfs (for system-wide settings) and cgroupfs (for per-group
      settings).
      
      Therefore, the same interface is used for controlling the bandwidth
      distribution to -deadline tasks and task groups, i.e., new controls
      with similar names, equivalent meaning and the same usage paradigm
      are added.
      
      However, more discussion is needed in order to figure out how
      we want to manage SCHED_DEADLINE bandwidth at the task group level.
      Therefore, this patch adds a less sophisticated, but actually
      very sensible, mechanism to ensure that a certain utilization
      cap is not overcome per each root_domain (the single rq for !SMP
      configurations).
      
      Another main difference between deadline bandwidth management and
      RT-throttling is that -deadline tasks have bandwidth on their own
      (while -rt ones don't!), and thus we don't need a higher-level
      throttling mechanism to enforce the desired bandwidth.
      
      This patch, therefore:
      
       - adds system wide deadline bandwidth management by means of:
          * /proc/sys/kernel/sched_dl_runtime_us,
          * /proc/sys/kernel/sched_dl_period_us,
         that determine (i.e., runtime / period) the total bandwidth
         available on each CPU of each root_domain for -deadline tasks;
      
       - couples the RT and deadline bandwidth management, i.e., enforces
         that the sum of the bandwidth devoted to -rt and -deadline tasks
         stays below 100%.
      
      This means that, for a root_domain comprising M CPUs, -deadline tasks
      can be created until the sum of their bandwidths stays below:
      
          M * (sched_dl_runtime_us / sched_dl_period_us)
      
      It is also possible to disable this bandwidth management logic, and
      thus be free to oversubscribe the system up to any arbitrary level.
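
      A hedged sketch of the per-root_domain admission test implied by the
      formula above (helper and parameter names are hypothetical; only the
      arithmetic follows the text):

        #include <linux/math64.h>
        #include <linux/types.h>

        /* Fixed-point bandwidth: runtime/period scaled by 2^20. */
        static u64 dl_bw_sketch(u64 runtime_us, u64 period_us)
        {
                return div64_u64(runtime_us << 20, period_us);
        }

        /* Admit a new -deadline task only while total utilization stays
         * below M * (sched_dl_runtime_us / sched_dl_period_us). */
        static bool dl_admit_sketch(u64 used_bw, u64 task_bw, int cpus,
                                    u64 dl_runtime_us, u64 dl_period_us)
        {
                u64 cap = (u64)cpus * dl_bw_sketch(dl_runtime_us,
                                                   dl_period_us);

                return used_bw + task_bw <= cap;
        }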
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-12-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      332ac17e
    • D
      sched/deadline: Add SCHED_DEADLINE inheritance logic · 2d3d891d
      Committed by Dario Faggioli
      Some method is needed to deal with rt-mutexes and make sched_dl
      interact with the current PI code, raising far-from-trivial issues
      that need (according to us) to be solved with some restructuring of
      the pi-code (i.e., going toward a proxy-execution-ish implementation).
      
      This is under development; in the meanwhile, as a temporary solution,
      what this commit does is:
      
       - ensure a pi-lock owner with waiters is never throttled down. Instead,
         when it runs out of runtime, it immediately gets replenished and its
         deadline is postponed;
      
       - the scheduling parameters (relative deadline and default runtime)
         used for those replenishments --during the whole period it holds the
         pi-lock-- are the ones of the waiting task with the earliest deadline.
      
      Acting this way, we provide some kind of boosting to the lock-owner,
      still by using the existing (actually, slightly modified by the previous
      commit) pi-architecture.
      
      We would stress the fact that this is a surely needed, but far from
      clean, solution to the problem. In the end it's only a way to restart
      discussion within the community. So, as always, comments, ideas, rants,
      etc. are welcome! :-)
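
      A sketch of the boosted-replenishment rule described above, where
      dl_se is the lock owner's deadline entity and pi_se the
      earliest-deadline waiter's (field names assumed from this series):

        #include <linux/sched.h>

        /* Never throttle a boosted owner: refill at once using the
         * earliest-deadline waiter's parameters. */
        static void dl_boosted_replenish_sketch(struct sched_dl_entity *dl_se,
                                                struct sched_dl_entity *pi_se,
                                                u64 now)
        {
                dl_se->deadline = now + pi_se->dl_deadline; /* postpone */
                dl_se->runtime  = pi_se->dl_runtime;        /* replenish */
        }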
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      [ Added !RT_MUTEXES build fix. ]
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2d3d891d
    • P
      rtmutex: Turn the plist into an rb-tree · fb00aca4
      Committed by Peter Zijlstra
      Turn the pi-chains from plist to rb-tree, in the rt_mutex code,
      and provide a proper comparison function for -deadline and
      -priority tasks.
      
      This is done mainly because:
       - the classical prio field of the plist is just an int, which might
         not be enough for representing a deadline;
       - manipulating such a list would become O(nr_deadline_tasks),
         which might be too much, as the number of -deadline tasks increases.
      
      Therefore, an rb-tree is used, and tasks are queued in it according
      to the following logic:
       - among two -priority (i.e., SCHED_BATCH/OTHER/RR/FIFO) tasks, the
         one with the higher (lower, actually!) prio wins;
       - among a -priority and a -deadline task, the latter always wins;
       - among two -deadline tasks, the one with the earliest deadline
         wins.
      
      Queueing and dequeueing functions are changed accordingly, for both
      the list of a task's pi-waiters and the list of tasks blocked on
      a pi-lock.
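
      A sketch of the comparison implied by the rules above, assuming
      -deadline tasks carry a prio value below the normal range (so the
      first test also lets them beat -priority tasks) and the
      dl_prio()/dl_time_before() helpers this series introduces; close to,
      but not necessarily identical to, the series' code:

        #include <linux/sched.h>

        static int rt_mutex_waiter_less_sketch(struct rt_mutex_waiter *left,
                                               struct rt_mutex_waiter *right)
        {
                /* Lower prio value wins; -deadline beats -priority. */
                if (left->prio < right->prio)
                        return 1;

                /* Two -deadline waiters: earlier deadline wins. */
                if (dl_prio(left->prio) && dl_prio(right->prio))
                        return dl_time_before(left->task->dl.deadline,
                                              right->task->dl.deadline);

                return 0;
        }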
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-again-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-10-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      fb00aca4
    • H
      sched/deadline: Add period support for SCHED_DEADLINE tasks · 755378a4
      Committed by Harald Gustafsson
      Make it possible to specify a period (different from or equal to the
      deadline) for -deadline tasks. Relative deadlines (D_i) are used on
      task arrivals to generate new scheduling (absolute) deadlines as "d =
      t + D_i", and periods (P_i) to postpone the scheduling deadlines as "d
      = d + P_i" when the budget is zero.
      
      This is in general useful to model (and schedule) tasks that have slow
      activation rates (long periods), but have to be scheduled soon once
      activated (short deadlines).
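
      The two rules as a sketch, using the sched_dl_entity field names this
      series appears to introduce (treat them as assumptions):

        #include <linux/sched.h>

        /* On task arrival at time t: d = t + D_i, budget refilled. */
        static void dl_new_instance_sketch(struct sched_dl_entity *dl_se,
                                           u64 t)
        {
                dl_se->deadline = t + dl_se->dl_deadline;
                dl_se->runtime  = dl_se->dl_runtime;
        }

        /* When the budget hits zero: d = d + P_i per refill. */
        static void dl_replenish_sketch(struct sched_dl_entity *dl_se)
        {
                while (dl_se->runtime <= 0) {
                        dl_se->deadline += dl_se->dl_period;
                        dl_se->runtime  += dl_se->dl_runtime;
                }
        }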
      Signed-off-by: Harald Gustafsson <harald.gustafsson@ericsson.com>
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-7-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      755378a4
    • J
      sched/deadline: Add SCHED_DEADLINE SMP-related data structures & logic · 1baca4ce
      Committed by Juri Lelli
      Introduces the data structures relevant for implementing dynamic
      migration of -deadline tasks, the logic for checking if runqueues
      are overloaded with -deadline tasks, and the logic for choosing
      where a task should migrate when that is the case.
      
      Also adds dynamic migrations to SCHED_DEADLINE, so that tasks can
      be moved among CPUs when necessary. It is also possible to bind a
      task to a (set of) CPU(s), thus restricting its capability of
      migrating, or forbidding migrations at all.
      
      The very same approach used in sched_rt is utilised:
       - -deadline tasks are kept in CPU-specific runqueues,
       - -deadline tasks are migrated among runqueues to achieve the
         following:
          * on an M-CPU system the M earliest-deadline ready tasks
            are always running;
          * affinity/cpusets settings of all the -deadline tasks are
            always respected.
      
      Therefore, this very special form of "load balancing" is done with
      an active method, i.e., the scheduler pushes or pulls tasks between
      runqueues when they are woken up and/or (de)scheduled.
      IOW, every time a preemption occurs, the descheduled task might be sent
      to some other CPU (depending on its deadline) to continue executing
      (push). On the other hand, every time a CPU becomes idle, it might pull
      the second earliest deadline ready task from some other CPU.
      
      To enforce this, a pull operation is always attempted before taking any
      scheduling decision (pre_schedule()), as well as a push one after each
      scheduling decision (post_schedule()). In addition, when a task arrives
      or wakes up, the best CPU on which to resume it is selected taking into
      account its affinity mask, the system topology, and also its deadline.
      E.g., from the scheduling point of view, the best CPU on which to wake
      up (and also to push) a task is the one running the task with the
      latest deadline among the M executing ones.
      
      In order to facilitate these decisions, per-runqueue "caching" of the
      deadlines of the currently running and of the first ready task is used.
      Queued but not running tasks are also parked in another rb-tree to
      speed-up pushes.
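
      A sketch of the pull-before/push-after hooks described above; the
      helper names mirror sched_rt's and are assumptions, not verified
      against the series:

        #include <linux/sched.h>

        /* Before the decision: try to pull an eligible -deadline task. */
        static void pre_schedule_dl_sketch(struct rq *rq,
                                           struct task_struct *prev)
        {
                if (dl_task(prev))
                        pull_dl_task(rq);
        }

        /* After the decision: push waiting earliest-deadline tasks
         * toward CPUs running later-deadline tasks. */
        static void post_schedule_dl_sketch(struct rq *rq)
        {
                push_dl_tasks(rq);
        }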
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-5-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      1baca4ce
    • D
      sched/deadline: Add SCHED_DEADLINE structures & implementation · aab03e05
      Committed by Dario Faggioli
      Introduces the data structures, constants and symbols needed for
      SCHED_DEADLINE implementation.
      
      The core data structures of SCHED_DEADLINE are defined, along with
      their initializers. Hooks for checking if a task belongs to the new
      policy are also added where they are needed.
      
      Adds a scheduling class, in sched/dl.c, and a new policy called
      SCHED_DEADLINE. It is an implementation of the Earliest Deadline
      First (EDF) scheduling algorithm, augmented with a mechanism (called
      Constant Bandwidth Server, CBS) that makes it possible to isolate
      the behaviour of tasks from each other.
      
      The typical -deadline task will be made up of a computation phase
      (instance) which is activated in a periodic or sporadic fashion. The
      expected (maximum) duration of such computation is called the task's
      runtime; the time interval by which each instance needs to be
      completed is called the task's relative deadline. The task's absolute
      deadline is dynamically calculated as the time instant a task
      (better, an instance) activates plus the relative deadline.
      
      The EDF algorithm selects the task with the smallest absolute
      deadline as the one to be executed first, while the CBS ensures that
      each task runs for at most its runtime in every (relative)
      deadline-length time interval, avoiding any interference between
      different tasks (bandwidth isolation).
      Thanks to this feature, even tasks that do not strictly comply with
      the computational model sketched above can effectively use the new
      policy.
      
      To summarize, this patch:
       - introduces the data structures, constants and symbols needed;
       - implements the core logic of the scheduling algorithm in the new
         scheduling class file;
       - provides all the glue code between the new scheduling class and
         the core scheduler and refines the interactions between sched/dl
         and the other existing scheduling classes.
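
      Since EDF reduces to "run the leftmost node of a deadline-ordered
      rb-tree", a minimal sketch of the pick path might look like this
      (field names assumed):

        #include <linux/rbtree.h>
        #include <linux/sched.h>

        /* Earliest-deadline entity == cached leftmost rb-tree node. */
        static struct sched_dl_entity *pick_next_dl_sketch(struct dl_rq *dl_rq)
        {
                struct rb_node *left = dl_rq->rb_leftmost;

                if (!left)
                        return NULL;

                return rb_entry(left, struct sched_dl_entity, rb_node);
        }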
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
      Signed-off-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-4-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      aab03e05
    • D
      sched: Add new scheduler syscalls to support an extended scheduling parameters ABI · d50dde5a
      Committed by Dario Faggioli
      Add the syscalls needed for supporting scheduling algorithms
      with extended scheduling parameters (e.g., SCHED_DEADLINE).
      
      In general, this makes it possible to specify a periodic/sporadic task
      that executes for a given amount of runtime at each instance, and is
      scheduled according to the urgency of its own timing constraints,
      i.e.:
      
       - a (maximum/typical) instance execution time,
       - a minimum interval between consecutive instances,
       - a time constraint by which each instance must be completed.
      
      Thus, both the data structure that holds the scheduling parameters of
      the tasks and the system calls dealing with it must be extended.
      Unfortunately, modifying the existing struct sched_param would break
      the ABI and result in potentially serious compatibility issues with
      legacy binaries.
      
      For these reasons, this patch:
      
       - defines the new struct sched_attr, containing all the fields
         that are necessary for specifying a task in the computational
         model described above;
      
       - defines and implements the new scheduling related syscalls that
         manipulate it, i.e., sched_setattr() and sched_getattr().
      
      Syscalls are introduced for x86 (32 and 64 bits) and ARM only, as a
      proof of concept and for developing and testing purposes. Making them
      available on other architectures is straightforward.
      
      Since no "user" for these new parameters is introduced in this patch,
      the implementation of the new system calls is just identical to their
      already existing counterpart. Future patches that implement scheduling
      policies able to exploit the new data structure must also take care of
      modifying the sched_*attr() calls accordingly with their own purposes.
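
      For illustration, a userspace sketch of driving the new ABI; it
      assumes a libc that exposes SYS_sched_setattr and the sched_attr
      layout described by this series (both assumptions - glibc provides
      no wrapper):

        #define _GNU_SOURCE
        #include <stdint.h>
        #include <stdio.h>
        #include <unistd.h>
        #include <sys/syscall.h>

        /* Assumed layout; verify against your kernel headers. */
        struct sched_attr {
                uint32_t size;           /* sizeof(struct sched_attr) */
                uint32_t sched_policy;
                uint64_t sched_flags;
                int32_t  sched_nice;     /* SCHED_NORMAL, SCHED_BATCH */
                uint32_t sched_priority; /* SCHED_FIFO, SCHED_RR */
                uint64_t sched_runtime;  /* SCHED_DEADLINE, in ns */
                uint64_t sched_deadline;
                uint64_t sched_period;
        };

        #ifndef SCHED_DEADLINE
        #define SCHED_DEADLINE 6
        #endif

        int main(void)
        {
                struct sched_attr attr = {
                        .size           = sizeof(attr),
                        .sched_policy   = SCHED_DEADLINE,
                        .sched_runtime  = 10 * 1000 * 1000,  /* 10 ms */
                        .sched_deadline = 30 * 1000 * 1000,  /* 30 ms */
                        .sched_period   = 100 * 1000 * 1000, /* 100 ms */
                };

                /* pid 0 == calling thread; flags are currently 0 */
                if (syscall(SYS_sched_setattr, 0, &attr, 0) < 0) {
                        perror("sched_setattr");
                        return 1;
                }
                return 0;
        }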
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      [ Rewrote to use sched_attr. ]
      Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
      [ Removed sched_setscheduler2() for now. ]
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1383831828-15501-3-git-send-email-juri.lelli@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d50dde5a
  6. 12 Jan 2014, 3 commits
    • P
      arch: Introduce smp_load_acquire(), smp_store_release() · 47933ad4
      Committed by Peter Zijlstra
      A number of situations currently require the heavyweight smp_mb(),
      even though there is no need to order prior stores against later
      loads.  Many architectures have much cheaper ways to handle these
      situations, but the Linux kernel currently has no portable way
      to make use of them.
      
      This commit therefore supplies smp_load_acquire() and
      smp_store_release() to remedy this situation.  The new
      smp_load_acquire() primitive orders the specified load against
      any subsequent reads or writes, while the new smp_store_release()
      primitive orders the specified store against any prior reads or
      writes.  These primitives allow array-based circular FIFOs to be
      implemented without an smp_mb(), and also allow a theoretical
      hole in rcu_assign_pointer() to be closed at no additional
      expense on most architectures.
      
      In addition, the RCU experience of transitioning from explicit
      smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
      and rcu_assign_pointer(), respectively, resulted in substantial
      improvements in readability.  It therefore seems likely that
      replacing other explicit barriers with smp_load_acquire() and
      smp_store_release() will provide similar benefits.  It appears
      that roughly half of the explicit barriers in core kernel code
      might be so replaced.
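
      As an illustration of the FIFO case, a hedged sketch of a
      single-producer/single-consumer ring built on the new primitives
      (structure and names are ours, not from the patch):

        #include <asm/barrier.h>
        #include <linux/types.h>

        #define RING_SIZE 128 /* power of two, illustrative */

        struct spsc_ring {
                unsigned long head;      /* written by producer */
                unsigned long tail;      /* written by consumer */
                void *slot[RING_SIZE];
        };

        /* Producer: the release store publishes the filled slot. */
        static bool ring_push(struct spsc_ring *r, void *item)
        {
                unsigned long head = r->head;
                unsigned long tail = smp_load_acquire(&r->tail);

                if (head - tail >= RING_SIZE)
                        return false;    /* full */

                r->slot[head % RING_SIZE] = item;
                smp_store_release(&r->head, head + 1);
                return true;
        }

        /* Consumer: the acquire load pairs with the producer's release,
         * making the slot contents visible before they are read. */
        static void *ring_pop(struct spsc_ring *r)
        {
                unsigned long tail = r->tail;
                unsigned long head = smp_load_acquire(&r->head);
                void *item;

                if (tail == head)
                        return NULL;     /* empty */

                item = r->slot[tail % RING_SIZE];
                smp_store_release(&r->tail, tail + 1);
                return item;
        }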
      
      [Changelog by PaulMck]
      Reviewed-by: N"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NPeter Zijlstra <peterz@infradead.org>
      Acked-by: NWill Deacon <will.deacon@arm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Michael Ellerman <michael@ellerman.id.au>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Victor Kaplansky <VICTORK@il.ibm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      47933ad4
    • S
      perf/x86: Fix active_entry initialization · f3ae75de
      Committed by Stephane Eranian
      This patch fixes a problem with the initialization of the
      struct perf_event active_entry field. It is defined inside
      an anonymous union and was initialized in perf_event_alloc()
      using INIT_LIST_HEAD(). However, at that time we do not know
      whether the event is going to use active_entry or hlist_entry (SW).
      Or, at least, we don't want to make that determination there.
      The problem is that hlist and list_head are not initialized
      the same way. One is okay with NULL (from kzalloc), the other
      needs its pointers to point to itself.
      
      This patch resolves this problem by dropping the union.
      This will avoid problems later on, if someone starts using
      active_entry or hlist_entry without verifying that they
      actually overlap. This also solves the initialization
      problem.
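
      To make the initialization difference concrete, a small sketch
      (ours, not from the patch):

        #include <linux/list.h>

        static struct list_head lh;
        static struct hlist_node hn;

        static void init_contrast_sketch(void)
        {
                /* An empty list_head points at itself ... */
                INIT_LIST_HEAD(&lh);  /* lh.next == lh.prev == &lh */

                /* ... while an empty hlist_node is all-NULL, which is
                 * exactly what kzalloc() already produces. */
                INIT_HLIST_NODE(&hn); /* hn.next == hn.pprev == NULL */
        }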
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: ak@linux.intel.com
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
      Cc: vincent.weaver@maine.edu
      Cc: maria.n.dimakopoulou@gmail.com
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1389176153-3128-2-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      f3ae75de
    • J
      seqlock: Use raw_ prefix instead of _no_lockdep · 0c3351d4
      Committed by John Stultz
      Linus disliked the _no_lockdep() naming, so instead
      use the more consistent raw_* prefix for the non-lockdep-enabled
      seqcount methods.
      
      This also adds raw_ methods for the write operations
      as well, which will be utilized in a following patch.
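
      A reader-side sketch using the renamed method; we assume
      raw_read_seqcount_begin() is the post-rename name of the
      lockdep-free read begin:

        #include <linux/seqlock.h>

        static seqcount_t time_seq;
        static u64 shared_time_ns;

        /* Lockless read loop; the raw_ variant skips the lockdep check
         * for callers that can't tolerate it. */
        static u64 read_shared_time_sketch(void)
        {
                unsigned int seq;
                u64 val;

                do {
                        seq = raw_read_seqcount_begin(&time_seq);
                        val = shared_time_ns;
                } while (read_seqcount_retry(&time_seq, seq));

                return val;
        }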
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Krzysztof Hałasa <khalasa@piap.pl>
      Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: Willy Tarreau <w@1wt.eu>
      Link: http://lkml.kernel.org/r/1388704274-5278-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0c3351d4
  7. 11 Jan 2014, 8 commits
    • W
      usb: core: allow a reference device for new_id · 2fc82c2d
      Committed by Wolfram Sang
      Often, usb drivers need some driver_info to get a device to work. To
      have access to driver_info when using new_id, allow passing a reference
      vendor:product tuple from which new_id will inherit driver_info.
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2fc82c2d
    • R
      drivers/base: provide an infrastructure for componentised subsystems · 2a41e607
      Committed by Russell King
      Subsystems such as ALSA, DRM and others require a single card-level
      device structure to represent a subsystem.  However, firmware tends to
      describe the individual devices and the connections between them.
      
      Therefore, we need a way to gather up the individual component devices
      together, and indicate when we have all the component devices.
      
      We do this in DT by providing a "superdevice" node which specifies
      the components, eg:
      
      	imx-drm {
      		compatible = "fsl,drm";
      		crtcs = <&ipu1>;
      		connectors = <&hdmi>;
      	};
      
      The superdevice is declared into the component support, along with the
      subcomponents.  The superdevice receives callbacks to locate the
      subcomponents, and identify when all components are present.  At this
      point, we bind the superdevice, which causes the appropriate subsystem
      to be initialised in the conventional way.
      
      When any of the components or superdevice are removed from the system,
      we unbind the superdevice, thereby taking the subsystem down.
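
      On the code side, the component half might be used roughly as follows
      (a sketch assuming this framework's bind/unbind callback signatures;
      error handling elided):

        #include <linux/component.h>
        #include <linux/platform_device.h>

        /* Called once the superdevice has all its components. */
        static int my_comp_bind(struct device *dev, struct device *master,
                                void *master_data)
        {
                /* hook this component into the subsystem */
                return 0;
        }

        static void my_comp_unbind(struct device *dev,
                                   struct device *master,
                                   void *master_data)
        {
        }

        static const struct component_ops my_comp_ops = {
                .bind   = my_comp_bind,
                .unbind = my_comp_unbind,
        };

        static int my_comp_probe(struct platform_device *pdev)
        {
                /* Register with the aggregate; binding happens when the
                 * superdevice sees all components present. */
                return component_add(&pdev->dev, &my_comp_ops);
        }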
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2a41e607
    • T
      sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner() · d1ba277e
      Committed by Tejun Heo
      All device_schedule_callback_owner() users are converted to use
      device_remove_file_self().  Remove the now-unused
      {sysfs|device}_schedule_callback_owner().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1ba277e
    • T
      kernfs, sysfs, driver-core: implement kernfs_remove_self() and its wrappers · 1ae06819
      Committed by Tejun Heo
      Sometimes it's necessary to implement a node which wants to delete
      nodes, including itself.  This isn't straightforward because of the
      kernfs active reference.  While a file operation is in progress, an
      active reference is held and kernfs_remove() waits for all such
      references to drain before completing.  For a self-deleting node, this
      is a deadlock, as kernfs_remove() ends up waiting for an active
      reference that it itself is sitting on top of.
      
      This currently is worked around in the sysfs layer using
      sysfs_schedule_callback() which makes such removals asynchronous.
      While it works, it's rather cumbersome and inherently breaks
      synchronicity of the operation - the file operation which triggered
      the operation may complete before the removal is finished (or even
      started) and the removal may fail asynchronously.  If a removal
      operation is immediately followed by another operation which expects
      the specific name to be available (e.g. removal followed by rename
      onto the same name), there's no way to make the latter operation
      reliable.
      
      The thing is, there's no inherent reason for this to be asynchronous.
      All that's necessary to do this synchronously is a dedicated operation
      which drops its own active ref and deactivates self.  This patch
      implements kernfs_remove_self() and its wrappers in sysfs and driver
      core.  kernfs_remove_self() is to be called from one of the file
      operations, drops the active ref and deactivates using
      __kernfs_deactivate_self(), removes the self node, and restores active
      ref to the dead node using __kernfs_reactivate_self() so that the ref
      is balanced afterwards.  __kernfs_remove() is updated so that it takes
      an early exit if the target node is already fully removed so that the
      active ref restored by kernfs_remove_self() after removal doesn't
      confuse the deactivation path.
      
      This makes implementing self-deleting nodes very easy.  The normal
      removal path doesn't even need to be changed to use
      kernfs_remove_self() for the self-deleting node.  The method can
      invoke kernfs_remove_self() on itself before proceeding with the
      normal removal path.  kernfs_remove() invoked on the node by the normal
      deletion path will simply be ignored.
      
      This will replace sysfs_schedule_callback().  A subtle feature of
      sysfs_schedule_callback() is that it collapses multiple invocations -
      even if multiple removals are triggered, the removal callback is run
      only once.  An equivalent effect can be achieved by testing the return
      value of kernfs_remove_self() - only the one which gets %true return
      value should proceed with actual deletion.  All other instances of
      kernfs_remove_self() will wait till the enclosing kernfs operation
      which invoked the winning instance of kernfs_remove_self() finishes
      and then return %false.  This trivially makes all users of
      kernfs_remove_self() automatically show correct synchronous behavior
      even when there are multiple concurrent operations - all "echo 1 >
      delete" instances will finish only after the whole operation is
      completed by one of the instances.
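
      A sketch of how a driver might exploit that return-value convention
      via the device_remove_file_self() wrapper (the attribute and the
      teardown shown are hypothetical):

        #include <linux/device.h>

        /* "echo 1 > delete" handler: only the invocation that wins the
         * kernfs_remove_self() race tears the device down. */
        static ssize_t delete_store(struct device *dev,
                                    struct device_attribute *attr,
                                    const char *buf, size_t count)
        {
                if (device_remove_file_self(dev, attr))
                        device_unregister(dev); /* hypothetical teardown */

                return count;
        }
        static DEVICE_ATTR(delete, 0200, NULL, delete_store);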
      
      v2: For !CONFIG_SYSFS, the dummy version of kernfs_remove_self() was
          missing and sysfs_remove_file_self() had an incorrect return type.
          Fixed.  Reported by the kbuild test robot.
      
      v3: Updated to use __kernfs_{de|re}activate_self().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ae06819
    • T
      kernfs: implement kernfs_{de|re}activate[_self]() · 9f010c2a
      Committed by Tejun Heo
      This patch implements four functions to manipulate deactivation state
      - deactivate, reactivate and the _self-suffixed pair.  A new field,
      kernfs_node->deact_depth, is added so that concurrent and nested
      deactivations are handled properly.  kernfs_node->hash is moved so
      that it's paired with the new field and doesn't increase the
      size of kernfs_node.
      
      A kernfs user's lock would normally nest inside the active ref, but
      during removal the user may want to perform kernfs_remove() while
      holding said lock, which would introduce a reverse locking dependency.
      These functions can be used to break such reverse dependencies by
      allowing the deactivation step to be performed separately, outside the
      user's critical section.
      
      This will also be used to implement kernfs_remove_self().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f010c2a
    • T
      kernfs: remove kernfs_addrm_cxt · 99177a34
      Committed by Tejun Heo
      kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were
      added because there were operations which should be performed outside
      kernfs_mutex after adding and removing kernfs_nodes.  The necessary
      operations were recorded in kernfs_addrm_cxt and performed by
      kernfs_addrm_finish(); however, after the recent changes which
      relocated deactivation and unmapping so that they're performed
      directly during removal, the only operation kernfs_addrm_finish()
      performs is kernfs_put(), which can be moved inside the removal path
      too.
      
      This patch moves the kernfs_put() of the base ref to __kernfs_remove()
      and removes kernfs_addrm_cxt and kernfs_addrm_start/finish().
      
      * kernfs_add_one() is updated to grab and release the parent's active
        ref and kernfs_mutex itself.  kernfs_get/put_active() and
        kernfs_addrm_start/finish() invocations around it are removed from
        all users.
      
      * __kernfs_remove() puts an unlinked node directly instead of chaining
        it to kernfs_addrm_cxt.  Its callers are updated to grab and release
        kernfs_mutex instead of calling kernfs_addrm_start/finish() around
        it.
      
      v2: Updated to fit the v2 restructuring of removal path.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      99177a34
    • T
      kernfs: restructure removal path to fix possible premature return · 45a140e5
      Committed by Tejun Heo
      The recursive nature of kernfs_remove() means that, even if
      kernfs_remove() is not allowed to be called multiple times on the same
      node, there may be race conditions between removal of parent and its
      descendants.  While we can claim that kernfs_remove() shouldn't be
      called on one of the descendants while the removal of an ancestor is
      in progress, such rule is unnecessarily restrictive and very difficult
      to enforce.  It's better to simply allow invoking kernfs_remove() as
      the caller sees fit as long as the caller ensures that the node is
      accessible.
      
      The current behavior in such situations is broken.  Whoever enters
      the removal path first takes the node off the hierarchy and then
      deactivates.  Following removers either return as soon as they notice
      that they're not the first one, or can't even find the target node as
      it has already been removed from the hierarchy.  In both cases, the
      following removers may finish prematurely while the nodes which should
      be removed and drained are still being processed by the first one.
      
      This patch restructures things so that multiple removers, whether
      through recursion or direct invocation, always follow these rules.
      
      * When there are multiple concurrent removers, only one puts the base
        ref.
      
      * Regardless of which one puts the base ref, all removers are blocked
        until the target node is fully deactivated and removed.
      
      To achieve the above, the removal path now first deactivates the
      subtree, drains it and then unlinks one-by-one.  __kernfs_deactivate()
      is called directly from __kernfs_remove() and drops and regrabs
      kernfs_mutex for each descendant to drain active refs.  As this means
      that multiple removers can enter __kernfs_deactivate() for the same
      node, the function is updated so that it can handle multiple
      deactivators of the same node - only one actually deactivates but all
      wait till drain completion.
      
      The restructured removal path guarantees that a removed node gets
      unlinked only after the node is deactivated and drained.  Combined
      with proper multiple deactivator handling, this guarantees that any
      invocation of kernfs_remove() returns only after the node itself and
      all its descendants are deactivated, drained and removed.
      
      v2: Draining separated into a separate loop (used to be in the same
          loop as unlink) and done from __kernfs_deactivate().  This is to
          allow exposing deactivation as a separate interface later.
      
          Root node removal was broken in v1 patch.  Fixed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      45a140e5
    • T
      kernfs: remove KERNFS_REMOVED · ae34372e
      Committed by Tejun Heo
      KERNFS_REMOVED is used to mark half-initialized and dying nodes so
      that they don't show up in lookups, and to deny adding new nodes under
      them or renaming them; however, its role overlaps those of deactivation
      and removal from the rbtree.
      
      It's necessary to deny addition of new children while removal is in
      progress; however, this role considerably intersects with deactivation
      - KERNFS_REMOVED prevents new children while deactivation prevents new
      file operations.  There's no reason to have them separate, making
      things more complex than necessary.
      
      KERNFS_REMOVED is also used to decide whether a node is still visible
      to the vfs layer, which is rather redundant, as an equivalent
      determination can be made by testing whether the node is on its
      parent's children rbtree or not.
      
      This patch removes KERNFS_REMOVED.
      
      * Instead of KERNFS_REMOVED, each node now starts its life
        deactivated.  This means that we now use both atomic_add() and
        atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN.  The compiler
        generates an overflow warning when negating INT_MIN, as the negation
        can't be represented as a positive number.  Nothing is actually
        broken, but let's bump BIAS by one to avoid the warnings for archs
        which negate the subtrahend (see the sketch after this list).
      
      * KERNFS_REMOVED tests in add and rename paths are replaced with
        kernfs_get/put_active() of the target nodes.  Due to the way the add
        path is structured now, active ref handling is done in the callers
        of kernfs_add_one().  This will be consolidated up later.
      
      * kernfs_remove_one() is updated to deactivate instead of setting
        KERNFS_REMOVED.  This removes deactivation from kernfs_deactivate(),
        which is now renamed to kernfs_drain().
      
      * kernfs_dop_revalidate() now tests RB_EMPTY_NODE(&kn->rb) instead of
        KERNFS_REMOVED, and the KERNFS_REMOVED test in kernfs_dir_pos() is
        dropped.  A node which is removed from the children rbtree is not
        included in the iteration in the first place.  This means that a
        node may be visible through vfs a bit longer - it's now also visible
        after deactivation until the actual removal.  This slightly enlarged
        window makes no difference to userland.
      
      * Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with
        checks on the active ref.
      
      * Some comment style updates in the affected area.
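
      A sketch of the INT_MIN point from the first bullet (the bias value
      shown assumes the "bump by one" described there):

        #include <linux/atomic.h>
        #include <linux/kernel.h>

        /* -INT_MIN overflows int, so some compilers warn when the
         * subtrahend below is negated internally; biasing by one
         * sidesteps that. */
        #define KN_DEACTIVATED_BIAS_SKETCH (INT_MIN + 1)

        static void deactivate_sketch(atomic_t *active)
        {
                atomic_add(KN_DEACTIVATED_BIAS_SKETCH, active);
        }

        static void reactivate_sketch(atomic_t *active)
        {
                atomic_sub(KN_DEACTIVATED_BIAS_SKETCH, active);
        }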
      
      v2: Reordered before removal path restructuring.  kernfs_active()
          dropped and kernfs_get/put_active() used instead.  RB_EMPTY_NODE()
          used in the lookup paths.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae34372e