1. 22 Oct 2011, 1 commit
    • PM / Sleep: Mark devices involved in wakeup signaling during suspend · 4ca46ff3
      Committed by Rafael J. Wysocki
      The generic PM domains code in drivers/base/power/domain.c has
      to avoid powering off domains that provide power to wakeup devices
      during system suspend.  Currently, however, this only works for
      wakeup devices directly belonging to the given domain and not for
      their children (or the children of their children and so on).
      Thus, if there's a wakeup device whose parent belongs to a power
      domain handled by the generic PM domains code, the domain will be
      powered off during system suspend, preventing the device from
      signaling wakeup.
      
      To address this problem, introduce a device flag, power.wakeup_path,
      that will be set during system suspend for all wakeup devices,
      their parents, the parents of their parents and so on.  This way,
      all wakeup paths in the device hierarchy will be marked and the
      generic PM domains code will only need to avoid powering off
      domains containing devices whose power.wakeup_path is set.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      4ca46ff3
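
      A minimal sketch of the parent-chain walk this describes, assuming a
      hypothetical helper called from the suspend path (field names follow the
      commit text, not the exact kernel code):

        #include <linux/device.h>
        #include <linux/pm_wakeup.h>

        /* Mark a wakeup-capable device and all of its ancestors as lying on a
         * wakeup path, so the generic PM domains code keeps their domains
         * powered during system suspend. */
        static void mark_wakeup_path(struct device *dev)
        {
        	if (!device_may_wakeup(dev))
        		return;

        	for (; dev; dev = dev->parent)
        		dev->power.wakeup_path = true;
        }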
  2. 17 Oct 2011, 4 commits
    • PM / Hibernate: Freeze kernel threads after preallocating memory · 2aede851
      Committed by Rafael J. Wysocki
      There is a problem with the current ordering of hibernate code which
      leads to deadlocks in some filesystems' memory shrinkers.  Namely,
      some filesystems use freezable kernel threads that are inactive when
      the hibernate memory preallocation is carried out.  Those same
      filesystems use memory shrinkers that may be triggered by the
      hibernate memory preallocation.  If those memory shrinkers wait for
      the frozen kernel threads, the hibernate process deadlocks (this
      happens with XFS, for one example).
      
      Apparently, it is not technically viable to redesign the filesystems
      in question to avoid the situation described above, so the only
      possible solution of this issue is to defer the freezing of kernel
      threads until the hibernate memory preallocation is done, which is
      implemented by this change.
      
      Unfortunately, this requires the memory preallocation to be done
      before the "prepare" stage of device freeze, so after this change the
      only way drivers can allocate additional memory for their freeze
      routines in a clean way is to use PM notifiers.
      Reported-by: Christoph <cr2005@u-club.de>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      2aede851
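
      Since freeze callbacks can no longer allocate the extra memory themselves,
      the PM-notifier alternative mentioned above looks roughly like this on the
      driver side (buffer name and size are illustrative assumptions):

        #include <linux/notifier.h>
        #include <linux/slab.h>
        #include <linux/suspend.h>

        static void *freeze_buf;	/* illustrative per-driver scratch buffer */

        static int my_pm_notifier(struct notifier_block *nb,
        			  unsigned long action, void *data)
        {
        	switch (action) {
        	case PM_HIBERNATION_PREPARE:	/* runs before tasks are frozen */
        		freeze_buf = kzalloc(4096, GFP_KERNEL);
        		return freeze_buf ? NOTIFY_OK : notifier_from_errno(-ENOMEM);
        	case PM_POST_HIBERNATION:	/* back to normal operation */
        		kfree(freeze_buf);
        		freeze_buf = NULL;
        		return NOTIFY_OK;
        	}
        	return NOTIFY_DONE;
        }

        static struct notifier_block my_pm_nb = {
        	.notifier_call = my_pm_notifier,
        };
        /* register_pm_notifier(&my_pm_nb) at probe time,
         * unregister_pm_notifier(&my_pm_nb) at remove time. */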
    • PM / VT: Cleanup #if defined ugliness and fix compile error · 37cce26b
      Committed by H Hartley Sweeten
      Introduce the config option CONFIG_VT_CONSOLE_SLEEP in order to clean up
      the #if defined ugliness for the vt suspend support functions. Note that
      CONFIG_VT_CONSOLE is already dependent on CONFIG_VT.
      
      The function pm_set_vt_switch is actually dependent on CONFIG_VT and not
      CONFIG_PM_SLEEP. This fixes a compile error when CONFIG_PM_SLEEP is
      not set:
      
      drivers/tty/vt/vt_ioctl.c:1794: error: redefinition of 'pm_set_vt_switch'
      include/linux/suspend.h:17: error: previous definition of 'pm_set_vt_switch' was here
      
      Also, remove the incorrect path from the comment in console.c.
      
      [rjw: Replaced #if defined() with #ifdef in suspend.h.]
      Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      37cce26b
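
      A hedged sketch of the shape include/linux/suspend.h takes after the
      cleanup, illustrating why pm_set_vt_switch() must be guarded by CONFIG_VT
      rather than CONFIG_PM_SLEEP (prototypes may differ from the actual diff):

        #ifdef CONFIG_VT_CONSOLE_SLEEP
        extern void pm_prepare_console(void);
        extern void pm_restore_console(void);
        #else
        static inline void pm_prepare_console(void) {}
        static inline void pm_restore_console(void) {}
        #endif

        #ifdef CONFIG_VT
        extern void pm_set_vt_switch(int);
        #else
        static inline void pm_set_vt_switch(int do_switch) {}
        #endif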
    • PM / Hibernate: Include storage keys in hibernation image on s390 · 85055dd8
      Committed by Martin Schwidefsky
      For s390 there is one additional byte associated with each page,
      the storage key. This byte contains the referenced and changed
      bits and needs to be included in the hibernation image.
      If the storage keys are not restored to their previous state, all
      original pages would appear to be dirty. This can cause
      inconsistencies e.g. with read-only filesystems.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      85055dd8
    • PM / Suspend: Add statistics debugfs file for suspend to RAM · 2a77c46d
      Committed by ShuoX Liu
      Record the number of S3 failures for each failure reason, along with the
      names of the last two devices that failed during the S3 process.  The
      statistics can be checked through the 'suspend_stats' entry in debugfs.
      
      The motivation of the patch:
      
      We are enabling power features on Medfield.  Compared with a PC or
      notebook, a mobile device enters and exits suspend-to-RAM (we call it S3
      on Medfield) far more frequently.  If it can't enter suspend-to-RAM in
      time, the battery may be drained quickly.
      
      We often find that a device fails to suspend, and the system then
      retries S3 over and over again.  As the display is off, testers and
      developers don't know what is happening.
      
      Some testers and developers complain that they don't know whether the
      system has tried to enter suspend-to-RAM, or which device failed to
      suspend.  They need such information for a quick check.  The patch adds
      suspend_stats under debugfs so users can check suspend-to-RAM statistics
      quickly.
      
      Without this patch, there are other ways to find out which device fails.
      One is to turn on CONFIG_PM_DEBUG, but users would get too much
      information and testers would need to recompile the kernel.
      
      In addition, dynamic debug is another good tool for dumping debug
      information, but it still doesn't match our usage scenario closely:
      1) users need to write a user-space parser to process the syslog output;
      2) our testing scenario is to leave the device alone for at least a few
         hours and then check its status.  No serial console is available
         during the test: the console would be suspended, and a serial console
         connected through SPI or HSU devices would consume power, since those
         devices are powered off in suspend-to-RAM.
      Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      2a77c46d
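
      A self-contained user-space snippet for the quick check described above,
      assuming debugfs is mounted at the usual /sys/kernel/debug:

        #include <stdio.h>

        int main(void)
        {
        	char line[256];
        	FILE *f = fopen("/sys/kernel/debug/suspend_stats", "r");

        	if (!f) {
        		perror("suspend_stats");
        		return 1;
        	}
        	while (fgets(line, sizeof(line), f))
        		fputs(line, stdout);	/* success/fail counts, last failed devices */
        	fclose(f);
        	return 0;
        }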
  3. 05 Oct 2011, 2 commits
    • PM / QoS: Add function dev_pm_qos_read_value() (v3) · 1a9a9152
      Committed by Rafael J. Wysocki
      To read the current PM QoS value for a given device we need to
      make sure that the device's power.constraints object won't be
      removed while we're doing that.  For this reason, put the
      operation under dev->power.lock and acquire the lock
      around the initialization and removal of power.constraints.
      
      Moreover, since we're using the value of power.constraints to
      determine whether or not the object is present, the
      power.constraints_state field isn't necessary any more and may be
      removed.  However, dev_pm_qos_add_request() needs to check if the
      device is being removed from the system before allocating a new
      PM QoS constraints object for it, so make it use the
      power.power_state field of struct device for this purpose.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      1a9a9152
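
      The locking scheme described above boils down to something like the
      following sketch (close to, but not copied from, the patch; the helper
      is renamed to make clear it is not the exact implementation):

        #include <linux/device.h>
        #include <linux/pm_qos.h>
        #include <linux/spinlock.h>

        /* Read the aggregated PM QoS value for a device while holding
         * dev->power.lock, so power.constraints cannot be freed underneath us. */
        static s32 my_dev_pm_qos_read_value(struct device *dev)
        {
        	unsigned long flags;
        	s32 ret = 0;

        	spin_lock_irqsave(&dev->power.lock, flags);
        	if (dev->power.constraints)
        		ret = pm_qos_read_value(dev->power.constraints);
        	spin_unlock_irqrestore(&dev->power.lock, flags);

        	return ret;
        }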
    • PCI: Disable MPS configuration by default · 5f39e670
      Committed by Jon Mason
      Add the ability to disable PCI-E MPS tuning and use the BIOS-configured
      MPS defaults.  Due to the number of issues recently discovered on some
      x86 chipsets, make this the default behavior.
      
      Also, add the option for peer-to-peer DMA MPS configuration.  Peer-to-peer
      DMA is outside the scope of this patch, but MPS configuration could
      prevent it from working by having the MPS on one root port different
      from the MPS on another.  To work around this, simply make the
      system-wide MPS the smallest possible value (128B).
      Signed-off-by: Jon Mason <mason@myri.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5f39e670
  4. 02 Oct 2011, 2 commits
    • PM / devfreq: Add basic governors · ce26c5bb
      Committed by MyungJoo Ham
      Four cpufreq-like governors are provided as examples.
      
      powersave: use the lowest frequency possible. The user (device) should
      set polling_ms to 0 because polling is useless for this governor.
      
      performance: use the highest frequency possible. The user (device)
      should set polling_ms to 0 because polling is useless for this
      governor.
      
      userspace: use the user-specified frequency stored in
      devfreq.user_set_freq. With the sysfs support in the following patch, a
      user may set the value through the sysfs interface.
      
      simple_ondemand: simplified version of cpufreq's ondemand governor.
      
      When a user updates OPP entries (enable/disable/add), the OPP framework
      automatically notifies devfreq to update the operating frequency
      accordingly. Thus, devfreq users (device drivers) do not need to update
      devfreq manually on OPP entry updates or to set polling_ms for
      powersave, performance, userspace, or any other "static" governor.
      
      Note that these are given only as basic examples of governors; any
      device using devfreq may implement its own governor in its driver and
      use that instead.
      Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Reviewed-by: Mike Turquette <mturquette@ti.com>
      Acked-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      ce26c5bb
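
      For illustration, a "static" governor of the kind listed above reduces to
      a single callback; the struct layout is sketched from the commit
      description rather than copied from the patch:

        #include <linux/devfreq.h>

        /* Hypothetical powersave-style governor: always ask for the lowest
         * frequency and let devfreq/OPP round it up to the nearest valid
         * operating point.  No polling is needed, so polling_ms should be 0. */
        static int my_powersave_get_target(struct devfreq *df, unsigned long *freq)
        {
        	*freq = 0;	/* "as low as possible" */
        	return 0;
        }

        static struct devfreq_governor my_powersave_governor = {
        	.name			= "my_powersave",
        	.get_target_freq	= my_powersave_get_target,
        };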
    • PM: Introduce devfreq: generic DVFS framework with device-specific OPPs · a3c98b8b
      Committed by MyungJoo Ham
      With OPPs, a device may have multiple operable frequency and voltage
      sets, but the system needs to choose one of them. In order to reduce
      power consumption (by reducing frequency and voltage) without affecting
      performance too much, a Dynamic Voltage and Frequency Scaling (DVFS)
      scheme may be used.
      
      This patch introduces the DVFS capability for non-CPU devices with OPPs.
      DVFS is a technique whereby the frequency and supplied voltage of a
      device are adjusted on-the-fly. DVFS usually sets the frequency as low
      as possible under the given conditions (such as QoS assurance) and
      adjusts the voltage according to the chosen frequency in order to reduce
      power consumption and heat dissipation.
      
      The generic DVFS framework for devices, devfreq, may appear quite
      similar to drivers/cpufreq.  However, cpufreq does not allow multiple
      devices to be registered and is not suitable for multiple heterogeneous
      devices with different (but simple) governors.
      
      Normally, a DVFS mechanism controls the frequency based on the demand
      on the device and then chooses the voltage based on the chosen
      frequency.  devfreq likewise controls the frequency based on the
      governor's frequency recommendation and lets OPP pick the
      frequency/voltage pair based on the recommended frequency.  The chosen
      OPP is then passed to the device driver's "target" callback.
      
      When PM QoS is going to be used with a devfreq device, the device
      driver should enable the OPPs that are appropriate for the current PM
      QoS requests.  In order to do so, the device driver may call opp_enable
      and opp_disable from its PM QoS notifier callback so that PM QoS's
      update_target() call enables the appropriate OPPs.  Note that at least
      one OPP should be enabled at any time; be careful when there is a
      transition.
      Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Reviewed-by: Mike Turquette <mturquette@ti.com>
      Acked-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      a3c98b8b
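
      A hypothetical driver "target" callback following the flow above: the OPP
      library rounds the governor's recommendation up to a valid
      frequency/voltage pair and the driver programs both.  The clock and
      regulator handles are assumptions, and the OPP calls use the API names of
      that era (opp_*, later renamed dev_pm_opp_*):

        #include <linux/clk.h>
        #include <linux/err.h>
        #include <linux/opp.h>
        #include <linux/rcupdate.h>
        #include <linux/regulator/consumer.h>

        static struct clk *mydev_clk;		/* assumed set up at probe time */
        static struct regulator *mydev_vdd;	/* assumed set up at probe time */

        static int mydev_target(struct device *dev, unsigned long *freq)
        {
        	struct opp *opp;
        	unsigned long volt;
        	int ret;

        	rcu_read_lock();		/* OPP lookups of that era are RCU protected */
        	opp = opp_find_freq_ceil(dev, freq);
        	if (IS_ERR(opp)) {
        		rcu_read_unlock();
        		return PTR_ERR(opp);
        	}
        	volt = opp_get_voltage(opp);
        	rcu_read_unlock();

        	/* Ordering of clock vs. regulator writes depends on whether the
        	 * frequency goes up or down; omitted here for brevity. */
        	ret = regulator_set_voltage(mydev_vdd, volt, volt);
        	if (ret)
        		return ret;
        	return clk_set_rate(mydev_clk, *freq);
        }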
  5. 01 Oct 2011, 1 commit
  6. 30 Sep 2011, 1 commit
    • posix-cpu-timers: Cure SMP wobbles · d670ec13
      Committed by Peter Zijlstra
      David reported:
      
        Attached below is a watered-down version of rt/tst-cpuclock2.c from
        GLIBC.  Just build it with "gcc -o test test.c -lpthread -lrt" or
        similar.
      
        Run it several times, and you will see cases where the main thread
        will measure a process clock difference before and after the nanosleep
        which is smaller than the cpu-burner thread's individual thread clock
        difference.  This doesn't make any sense since the cpu-burner thread
        is part of the top-level process's thread group.
      
        I've reproduced this on both x86-64 and sparc64 (using both 32-bit and
        64-bit binaries).
      
        For example:
      
        [davem@boricha build-x86_64-linux]$ ./test
        process: before(0.001221967) after(0.498624371) diff(497402404)
        thread:  before(0.000081692) after(0.498316431) diff(498234739)
        self:    before(0.001223521) after(0.001240219) diff(16698)
        [davem@boricha build-x86_64-linux]$ 
      
        The diff of 'process' should always be >= the diff of 'thread'.
      
        I make sure to wrap the 'thread' clock measurements the most tightly
        around the nanosleep() call, and that the 'process' clock measurements
        are the outer-most ones.
      
        ---
        #include <unistd.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include <fcntl.h>
        #include <string.h>
        #include <errno.h>
        #include <pthread.h>
      
        static pthread_barrier_t barrier;
      
        static void *chew_cpu(void *arg)
        {
      	  pthread_barrier_wait(&barrier);
      	  while (1)
      		  __asm__ __volatile__("" : : : "memory");
      	  return NULL;
        }
      
        int main(void)
        {
      	  clockid_t process_clock, my_thread_clock, th_clock;
      	  struct timespec process_before, process_after;
      	  struct timespec me_before, me_after;
      	  struct timespec th_before, th_after;
      	  struct timespec sleeptime;
      	  unsigned long diff;
      	  pthread_t th;
      	  int err;
      
      	  err = clock_getcpuclockid(0, &process_clock);
      	  if (err)
      		  return 1;
      
      	  err = pthread_getcpuclockid(pthread_self(), &my_thread_clock);
      	  if (err)
      		  return 1;
      
      	  pthread_barrier_init(&barrier, NULL, 2);
      	  err = pthread_create(&th, NULL, chew_cpu, NULL);
      	  if (err)
      		  return 1;
      
      	  err = pthread_getcpuclockid(th, &th_clock);
      	  if (err)
      		  return 1;
      
      	  pthread_barrier_wait(&barrier);
      
      	  err = clock_gettime(process_clock, &process_before);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(my_thread_clock, &me_before);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(th_clock, &th_before);
      	  if (err)
      		  return 1;
      
      	  sleeptime.tv_sec = 0;
      	  sleeptime.tv_nsec = 500000000;
      	  nanosleep(&sleeptime, NULL);
      
      	  err = clock_gettime(th_clock, &th_after);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(my_thread_clock, &me_after);
      	  if (err)
      		  return 1;
      
      	  err = clock_gettime(process_clock, &process_after);
      	  if (err)
      		  return 1;
      
      	  diff = process_after.tv_nsec - process_before.tv_nsec;
      	  printf("process: before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 process_before.tv_sec, process_before.tv_nsec,
      		 process_after.tv_sec, process_after.tv_nsec, diff);
      	  diff = th_after.tv_nsec - th_before.tv_nsec;
      	  printf("thread:  before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 th_before.tv_sec, th_before.tv_nsec,
      		 th_after.tv_sec, th_after.tv_nsec, diff);
      	  diff = me_after.tv_nsec - me_before.tv_nsec;
      	  printf("self:    before(%lu.%.9lu) after(%lu.%.9lu) diff(%lu)\n",
      		 me_before.tv_sec, me_before.tv_nsec,
      		 me_after.tv_sec, me_after.tv_nsec, diff);
      
      	  return 0;
        }
      
      This is due to us using p->se.sum_exec_runtime in
      thread_group_cputime() where we iterate the thread group and sum all
      data. This does not take time since the last schedule operation (tick
      or otherwise) into account. We can cure this by using
      task_sched_runtime() at the cost of having to take locks.
      
      This also means we can (and must) do away with
      thread_group_sched_runtime() since the modified thread_group_cputime()
      is now more accurate and would deadlock when called from
      thread_group_sched_runtime().
      
      Aside from that, it makes the function safe on 32-bit systems.  The old
      code added t->se.sum_exec_runtime unprotected; sum_exec_runtime is a
      64-bit value and could be changed on another CPU at the same time.
      Reported-by: David Miller <davem@davemloft.net>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: stable@kernel.org
      Link: http://lkml.kernel.org/r/1314874459.7945.22.camel@twins
      Tested-by: David Miller <davem@davemloft.net>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      d670ec13
  7. 29 Sep 2011, 1 commit
    • ptp: fix L2 event message recognition · f75159e9
      Committed by Richard Cochran
      The IEEE 1588 standard defines two kinds of messages, event and general
      messages. Event messages require time stamping, and general do not. When
      using UDP transport, two separate ports are used for the two message
      types.
      
      The BPF designed to recognize event messages incorrectly classifies L2
      general messages as event messages. This commit fixes the issue by
      extending the filter to check the message type field for L2 PTP packets.
      Event messages are distinguished from general messages by testing
      the "general" bit.
      Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
      Cc: <stable@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f75159e9
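
      The distinction being tested is a property of the IEEE 1588 header itself:
      the low nibble of the first header byte is the messageType, and types with
      bit 3 set (0x8-0xD) are general messages.  A small user-space illustration
      of the check the extended filter performs:

        #include <stdbool.h>
        #include <stdint.h>

        /* Event messages (Sync, Delay_Req, Pdelay_Req, Pdelay_Resp) use
         * messageType values 0x0-0x3; general messages use 0x8-0xD. */
        static bool ptp_is_event_msg(const uint8_t *ptp_hdr)
        {
        	return (ptp_hdr[0] & 0x08) == 0;
        }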
  8. 27 Sep 2011, 3 commits
    • vfs: remove LOOKUP_NO_AUTOMOUNT flag · b6c8069d
      Committed by Linus Torvalds
      That flag no longer makes sense, since we don't look up automount points
      as eagerly any more.  Additionally, it turns out that the NO_AUTOMOUNT
      handling was buggy to begin with: it would avoid automounting even for
      cases where we really *needed* to do the automount handling, and could
      return ENOENT for autofs entries that hadn't been instantiated yet.
      
      With our new non-eager automount semantics, one discussion has been
      about adding an AT_AUTOMOUNT flag to vfs_fstatat (and thus the
      newfstatat() and fstatat64() system calls), but it's probably not worth
      it: you can always force at least directory automounting by simply
      adding the final '/' to the filename, which works for *all* of the stat
      family system calls, old and new.
      
      So AT_NO_AUTOMOUNT (and thus LOOKUP_NO_AUTOMOUNT) really were just a
      result of our bad default behavior.
      Acked-by: Ian Kent <raven@themaw.net>
      Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b6c8069d
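
      A user-space illustration of the trailing-slash workaround mentioned
      above; the autofs path is a made-up example:

        #include <stdio.h>
        #include <sys/stat.h>

        int main(void)
        {
        	struct stat st;

        	/* stat("/net/server") would no longer trigger the automount;
        	 * the trailing '/' forces directory automounting. */
        	if (stat("/net/server/", &st) == 0)
        		printf("automounted, inode %lu\n", (unsigned long)st.st_ino);
        	else
        		perror("stat");
        	return 0;
        }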
    • vfs pathname lookup: Add LOOKUP_AUTOMOUNT flag · d94c177b
      Committed by Linus Torvalds
      Since we've now turned around and made LOOKUP_FOLLOW *not* force an
      automount, we want to add the ability to force an automount event on
      lookup even if we don't happen to have one of the other flags that force
      it implicitly (LOOKUP_OPEN, LOOKUP_DIRECTORY, LOOKUP_PARENT..)
      
      Most cases will never want to use this, since you'd normally want to
      delay automounting as long as possible, which usually implies
      LOOKUP_OPEN (when we open a file or directory, we really cannot avoid
      the automount any more).
      
      But Trond argued sufficiently forcefully that at a minimum bind mounting
      a file and quotactl will want to force the automount lookup.  Some other
      cases (like nfs_follow_remote_path()) could use it too, although
      LOOKUP_DIRECTORY would work there as well.
      
      This commit just adds the flag and logic, no users yet, though.  It also
      doesn't actually touch the LOOKUP_NO_AUTOMOUNT flag that is related, and
      was made irrelevant by the same change that made us not follow on
      LOOKUP_FOLLOW.
      
      Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Greg KH <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d94c177b
    • PM / Domains: Split device PM domain data into base and need_restore · cd0ea672
      Committed by Rafael J. Wysocki
      The struct pm_domain_data data type is defined in such a way that
      adding new fields specific to the generic PM domains code will
      require include/linux/pm.h to be modified.  As a result, data types
      used only by the generic PM domains code will be defined in two
      headers, although they should all be defined in pm_domain.h, and
      pm.h will need to include more headers, which won't be very nice.
      
      For this reason change the definition of struct pm_subsys_data
      so that its domain_data member is a pointer, which will allow
      struct pm_domain_data to be subclassed by various PM domains
      implementations.  Remove the need_restore member from
      struct pm_domain_data and make the generic PM domains code
      subclass it by adding the need_restore member to the new data type.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      cd0ea672
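
      The subclassing this describes is the usual container_of() arrangement;
      roughly (type and field names follow the commit text, not the exact
      header contents):

        #include <linux/kernel.h>
        #include <linux/pm.h>

        /* genpd-specific wrapper around the generic base type; the generic PM
         * domains code recovers its data from a struct pm_domain_data pointer. */
        struct generic_pm_domain_data {
        	struct pm_domain_data base;
        	bool need_restore;
        };

        static inline struct generic_pm_domain_data *
        to_gpd_data(struct pm_domain_data *pdd)
        {
        	return container_of(pdd, struct generic_pm_domain_data, base);
        }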
  9. 26 Sep 2011, 1 commit
    • dm crypt: always disable discard_zeroes_data · 983c7db3
      Committed by Milan Broz
      If optional discard support in dm-crypt is enabled, discard requests
      bypass the crypt queue and blocks of the underlying device are discarded.
      On the read path, discarded blocks are handled the same as normal
      ciphertext blocks, i.e. they are decrypted.
      
      So if the underlying device announces that discarded regions return
      zeroes, dm-crypt must disable this flag, because after decryption there
      is just random noise instead of zeroes.
      Signed-off-by: Milan Broz <mbroz@redhat.com>
      Signed-off-by: Alasdair G Kergon <agk@redhat.com>
      983c7db3
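
      The effect can be expressed in a device-mapper target's I/O hints hook as
      in the sketch below; this only illustrates the intent and is not
      necessarily how the commit wires it up:

        #include <linux/blkdev.h>
        #include <linux/device-mapper.h>

        /* Whatever the underlying device claims, reads of discarded regions
         * come back as decrypted noise, so never advertise zeroed discards. */
        static void crypt_io_hints_sketch(struct dm_target *ti,
        				  struct queue_limits *limits)
        {
        	limits->discard_zeroes_data = 0;
        }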
  10. 20 Sep 2011, 2 commits
  11. 16 Sep 2011, 2 commits
    • net: copy userspace buffers on device forwarding · 48c83012
      Committed by Michael S. Tsirkin
      dev_forward_skb loops an skb back into the host networking stack,
      which might hang on to the memory indefinitely.
      In particular, this can happen in macvtap in bridged mode.
      Copy the userspace fragments to avoid blocking the sender in that case.
      
      Since this patch makes skb_copy_ubufs extern, I also added some
      documentation and made skb_copy_ubufs clear the SKBTX_DEV_ZEROCOPY flag
      automatically instead of leaving that to each caller. This can be made
      into a separate patch if people feel it's worth it.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      48c83012
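
      The check added to the forwarding path looks roughly like the following
      (a sketch; the real test sits inside dev_forward_skb()):

        #include <linux/netdevice.h>
        #include <linux/skbuff.h>

        /* Detach a looped-back zero-copy skb from the sender's userspace pages
         * before handing it to the local stack; drop it if the copy fails. */
        static int detach_zerocopy_frags(struct sk_buff *skb)
        {
        	if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
        		if (skb_copy_ubufs(skb, GFP_ATOMIC)) {
        			kfree_skb(skb);
        			return NET_RX_DROP;
        		}
        	}
        	return NET_RX_SUCCESS;
        }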
    • tcp: Change possible SYN flooding messages · 946cedcc
      Committed by Eric Dumazet
      "Possible SYN flooding on port xxxx " messages can fill logs on servers.
      
      Change the logic to log the message only once per listener, and add two
      new SNMP counters to track:
      
      TCPReqQFullDoCookies: number of times a SYN cookie was sent in reply to
      a client
      
      TCPReqQFullDrop: number of times a SYN request was dropped because
      syncookies were not enabled.
      
      Based on a prior patch from Tom Herbert, and suggestions from David.
      Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      946cedcc
  12. 15 Sep 2011, 2 commits
  13. 09 Sep 2011, 1 commit
  14. 06 Sep 2011, 1 commit
  15. 29 Aug 2011, 1 commit
    • perf events: Fix slow and broken cgroup context switch code · a8d757ef
      Committed by Stephane Eranian
      The current cgroup context switch code was incorrect leading
      to bogus counts. Furthermore, as soon as there was an active
      cgroup event on a CPU, the context switch cost on that CPU
      would increase by a significant amount as demonstrated by a
      simple ping/pong example:
      
       $ ./pong
       Both processes pinned to CPU1, running for 10s
       10684.51 ctxsw/s
      
      Now start a cgroup perf stat:
       $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 100
      
      $ ./pong
       Both processes pinned to CPU1, running for 10s
       6674.61 ctxsw/s
      
      That's a 37% penalty.
      
      Note that pong is not even in the monitored cgroup.
      
      The results shown by perf stat are bogus:
       $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 100
      
       Performance counter stats for 'sleep 100':
      
       CPU1 <not counted> cycles   test
       CPU1 16,984,189,138 cycles  #    0.000 GHz
      
      The second 'cycles' event should report a count @ CPU clock
      (here 2.4GHz) as it is counting across all cgroups.
      
      The patch below fixes the bogus accounting and bypasses any
      cgroup switches in case the outgoing and incoming tasks are
      in the same cgroup.
      
      With this patch the same test now yields:
       $ ./pong
       Both processes pinned to CPU1, running for 10s
       10775.30 ctxsw/s
      
      Start perf stat with cgroup:
      
       $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10
      
      Run pong outside the cgroup:
       $ /pong
       Both processes pinned to CPU1, running for 10s
       10687.80 ctxsw/s
      
      The penalty is now less than 2%.
      
      And the results for perf stat are correct:
      
      $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10
      
       Performance counter stats for 'sleep 10':
      
       CPU1 <not counted> cycles test #    0.000 GHz
       CPU1 23,933,981,448 cycles      #    0.000 GHz
      
      Now perf stat reports the correct counts for the non-cgroup event.
      
      If we run pong inside the cgroup, then we also get the
      correct counts:
      
      $ perf stat -e cycles,cycles -A -a -G test  -C 1 -- sleep 10
      
       Performance counter stats for 'sleep 10':
      
       CPU1 22,297,726,205 cycles test #    0.000 GHz
       CPU1 23,933,981,448 cycles      #    0.000 GHz
      
            10.001457237 seconds time elapsed
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20110825135803.GA4697@quad
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a8d757ef
  16. 27 Aug 2011, 1 commit
  17. 26 Aug 2011, 5 commits
    • backlight: add a callback 'notify_after' for backlight control · cc7993f6
      Committed by Dilan Lee
      We need a callback to do some things after pwm_enable, pwm_disable
      and pwm_config.
      Signed-off-by: Dilan Lee <dilee@nvidia.com>
      Reviewed-by: Robert Morell <rmorell@nvidia.com>
      Reviewed-by: Arun Murthy <arun.murthy@stericsson.com>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cc7993f6
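
      An illustrative board-level use of the new hook, with a made-up board
      callback and platform data values (struct fields as of this era of
      pwm_backlight):

        #include <linux/pwm_backlight.h>

        /* Runs after pwm_config()/pwm_enable()/pwm_disable() have actually
         * taken effect, e.g. to toggle a downstream enable line. */
        static void board_bl_notify_after(struct device *dev, int brightness)
        {
        	/* board-specific work goes here */
        }

        static struct platform_pwm_backlight_data board_bl_data = {
        	.max_brightness	= 255,
        	.dft_brightness	= 128,
        	.pwm_period_ns	= 1000000,
        	.notify_after	= board_bl_notify_after,
        };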
    • rapidio: fix use of non-compatible registers · 284fb68d
      Committed by Alexandre Bounine
      Replace/remove use of RIO v.1.2 registers/bits that are not
      forward-compatible with newer versions of RapidIO specification.
      
      RapidIO specification v.1.3 removed Write Port CSR, Doorbell CSR,
      Mailbox CSR and Mailbox and Doorbell bits of the PEF CAR.
      
      Use of the removed (since RIO v.1.3) register bits affects users of
      currently available 1.3- and 2.x-compliant devices who may be running
      not-so-recent kernel versions.
      
      Removing checks for unsupported bits makes corresponding routines
      compatible with all versions of RapidIO specification.  Therefore,
      backporting makes stable kernel versions compliant with RIO v.1.3 and
      later as well.
      Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Li Yang <leoli@freescale.com>
      Cc: Thomas Moll <thomas.moll@sysgo.com>
      Cc: Chul Kim <chul.kim@idt.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      284fb68d
    • a8018766
    • lockdep: Add helper function for dir vs file i_mutex annotation · e096d0c7
      Committed by Josh Boyer
      Purely in-memory filesystems do not use the inode hash as the dcache
      tells us if an entry already exists.  As a result, they do not call
      unlock_new_inode, and thus directory inodes do not get put into a
      different lockdep class for i_sem.
      
      We need the different lockdep classes, because the locking order for
      i_mutex is different for directory inodes and regular inodes.  Directory
      inodes can do "readdir()", which takes i_mutex *before* possibly taking
      mm->mmap_sem (due to a page fault while copying the directory entry to
      user space).
      
      In contrast, regular inodes can be mmap'ed, which takes mm->mmap_sem
      before accessing i_mutex.
      
      The two cases can never happen for the same inode, so no real deadlock
      can occur, but without the different lockdep classes, lockdep cannot
      understand that.  As a result, if CONFIG_DEBUG_LOCK_ALLOC is set, this
      can lead to false positives from lockdep like below:
      
          find/645 is trying to acquire lock:
           (&mm->mmap_sem){++++++}, at: [<ffffffff81109514>] might_fault+0x5c/0xac
      
          but task is already holding lock:
           (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffff81149f34>]
          vfs_readdir+0x5b/0xb4
      
          which lock already depends on the new lock.
      
          the existing dependency chain (in reverse order) is:
      
          -> #1 (&sb->s_type->i_mutex_key#15){+.+.+.}:
                [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
                [<ffffffff814db822>] __mutex_lock_common+0x4c/0x361
                [<ffffffff814dbc46>] mutex_lock_nested+0x40/0x45
                [<ffffffff811daa87>] hugetlbfs_file_mmap+0x82/0x110
                [<ffffffff81111557>] mmap_region+0x258/0x432
                [<ffffffff811119dd>] do_mmap_pgoff+0x2ac/0x306
                [<ffffffff81111b4f>] sys_mmap_pgoff+0x118/0x16a
                [<ffffffff8100c858>] sys_mmap+0x22/0x24
                [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b
      
          -> #0 (&mm->mmap_sem){++++++}:
                [<ffffffff8108a4bc>] __lock_acquire+0xa1a/0xcf7
                [<ffffffff8108ac26>] lock_acquire+0xbf/0x103
                [<ffffffff81109541>] might_fault+0x89/0xac
                [<ffffffff81149cff>] filldir+0x6f/0xc7
                [<ffffffff811586ea>] dcache_readdir+0x67/0x205
                [<ffffffff81149f54>] vfs_readdir+0x7b/0xb4
                [<ffffffff8114a073>] sys_getdents+0x7e/0xd1
                [<ffffffff814e3ec2>] system_call_fastpath+0x16/0x1b
      
      This patch moves the directory vs file lockdep annotation into a helper
      function that can be called by in-memory filesystems and has hugetlbfs
      call it.
      Signed-off-by: Josh Boyer <jwboyer@redhat.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e096d0c7
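
      The usage pattern for the new helper in an in-memory filesystem that never
      calls unlock_new_inode() is roughly (sketch, with a made-up function name):

        #include <linux/fs.h>

        static struct inode *inmem_fs_get_inode(struct super_block *sb,
        					umode_t mode)
        {
        	struct inode *inode = new_inode(sb);

        	if (inode) {
        		inode->i_ino = get_next_ino();
        		inode->i_mode = mode;
        		/* put directory i_mutex in its own lockdep class */
        		lockdep_annotate_inode_mutex_key(inode);
        	}
        	return inode;
        }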
    • Add a personality to report 2.6.x version numbers · be27425d
      Committed by Andi Kleen
      I ran into a couple of programs which broke with the new Linux 3.0
      version.  Some of those were binary only.  I tried to use LD_PRELOAD to
      work around it, but it was quite difficult and in one case impossible
      because of a mix of 32bit and 64bit executables.
      
      For example, all kinds of management software from HP don't work unless
      we pretend to run a 2.6 kernel.
      
        $ uname -a
        Linux svivoipvnx001 3.0.0-08107-g97cd98f #1062 SMP Fri Aug 12 18:11:45 CEST 2011 i686 i686 i386 GNU/Linux
      
        $ hpacucli ctrl all show
      
        Error: No controllers detected.
      
        $ rpm -qf /usr/sbin/hpacucli
        hpacucli-8.75-12.0
      
      Another notable case is that Python now reports "linux3" from
      sys.platform(), which in turn can break things that were checking
      sys.platform() == "linux2":
      
        https://bugzilla.mozilla.org/show_bug.cgi?id=664564
      
      Though it seems pretty clear to me that it's a bug in the apps that are
      using '==' instead of .startswith(), this allows us to unbreak broken
      programs.
      
      This patch adds a UNAME26 personality that makes the kernel report a
      2.6.40+x version number instead.  The x is the x in 3.x.
      
      I know this is somewhat ugly, but I didn't find a better workaround, and
      compatibility to existing programs is important.
      
      Some programs also read /proc/sys/kernel/osrelease.  This can be worked
      around in user space with mount --bind (and a mount namespace).
      
      To use:
      
        wget ftp://ftp.kernel.org/pub/linux/kernel/people/ak/uname26/uname26.c
        gcc -o uname26 uname26.c
        ./uname26 program
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      be27425d
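
      A minimal stand-in for the uname26 wrapper referenced above: switch the
      calling process to the UNAME26 personality and print the reported release
      (a real wrapper would exec the legacy program afterwards):

        #include <stdio.h>
        #include <sys/personality.h>
        #include <sys/utsname.h>

        #ifndef UNAME26
        #define UNAME26 0x0020000	/* value from <linux/personality.h> */
        #endif

        int main(void)
        {
        	struct utsname u;

        	personality(PER_LINUX | UNAME26);
        	uname(&u);
        	printf("%s\n", u.release);	/* e.g. 2.6.40+x instead of 3.x */
        	return 0;
        }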
  18. 25 Aug 2011, 9 commits
    • PM / Domains: Preliminary support for devices with power.irq_safe set · 0aa2a221
      Committed by Rafael J. Wysocki
      The generic PM domains framework currently doesn't work with devices
      whose power.irq_safe flag is set, because runtime PM callbacks for
      such devices are run with interrupts disabled and the callbacks
      provided by the generic PM domains framework use domain mutexes
      and may sleep.  However, such devices very well may belong to
      power domains on some systems, so the generic PM domains framework
      should take them into account.
      
      For this reason, modify the generic PM domains framework so that the
      domain .power_off() and .power_on() callbacks are never executed for
      a domain containing devices with power.irq_safe set, although the
      .stop_device() and .start_device() callbacks are still run for them.
      
      Additionally, introduce a flag allowing the creator of a
      struct generic_pm_domain object to indicate that its .stop_device()
      and .start_device() callbacks may be run in interrupt context
      (might_sleep_if() triggers if that flag is not set and one of those
      callbacks is run in interrupt context).
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      0aa2a221
    • PM QoS: Add global notification mechanism for device constraints · b66213cd
      Committed by Jean Pihet
      Add a global notification chain that gets called upon changes to the
      aggregated constraint value for any device.
      The notification callbacks are passed the full constraint request data
      so that the callees have access to it. The current use case is for
      platform low-level code to access the target device of the constraint.
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      b66213cd
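
      An illustrative platform-code consumer of the global notifier; the
      callback name and body are made up, but it shows how the full request is
      passed through so the target device can be identified:

        #include <linux/device.h>
        #include <linux/notifier.h>
        #include <linux/pm_qos.h>

        static int platform_dev_qos_notify(struct notifier_block *nb,
        				   unsigned long new_value, void *data)
        {
        	struct dev_pm_qos_request *req = data;

        	/* react in platform low-level code to the device's new
        	 * aggregated constraint value */
        	dev_dbg(req->dev, "constraint changed to %lu\n", new_value);
        	return NOTIFY_OK;
        }

        static struct notifier_block platform_dev_qos_nb = {
        	.notifier_call = platform_dev_qos_notify,
        };
        /* dev_pm_qos_add_global_notifier(&platform_dev_qos_nb); */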
    • PM QoS: Implement per-device PM QoS constraints · 91ff4cb8
      Committed by Jean Pihet
      Implement the per-device PM QoS constraints by creating a device
      PM QoS API, which calls the PM QoS constraints management core code.
      
      The per-device latency constraints data structures are stored
      in the device's dev_pm_info struct.
      
      The device PM code initializes and destroys the per-device constraints
      data struct in order to support the dynamic insertion and removal of
      devices in the system.
      
      To minimize the data usage of the per-device constraints, the data struct
      is only allocated at the first call to dev_pm_qos_add_request.
      The data is later freed when the device is removed from the system.
      A global mutex protects the constraints users from the data being
      allocated and freed.
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      91ff4cb8
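
      An illustrative driver-side use of the new per-device API (original
      signature; a request-type argument was added in later kernels, and the
      value's unit here is just an assumption):

        #include <linux/pm_qos.h>

        static struct dev_pm_qos_request my_latency_req;

        static void my_driver_add_constraint(struct device *dev)
        {
        	/* keep this device's latency constraint at 100 (e.g. microseconds) */
        	dev_pm_qos_add_request(dev, &my_latency_req, 100);
        }

        /* later: dev_pm_qos_update_request(&my_latency_req, 20);
         * on removal: dev_pm_qos_remove_request(&my_latency_req); */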
    • PM QoS: Generalize and export constraints management code · abe98ec2
      Committed by Jean Pihet
      In preparation for the per-device constraints support:
       - rename update_target to pm_qos_update_target
       - generalize and export pm_qos_update_target for usage by the upcoming
         per-device latency constraints framework:
         * operate on struct pm_qos_constraints for constraints management,
         * introduce an 'action' parameter for constraints add/update/remove,
         * the return value indicates if the aggregated constraint value has
           changed,
       - update the internal code to operate on struct pm_qos_constraints
       - add a NULL pointer check in the API functions
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      abe98ec2
    • PM QoS: Reorganize data structs · 4e1779ba
      Committed by Jean Pihet
      In preparation for the per-device constraints support, re-organize
      the data structures:
       - add a struct pm_qos_constraints, which contains the
       constraints-related data
       - update struct pm_qos_object contents to the PM QoS internal object
       data. Add a pointer to struct pm_qos_constraints
       - update the internal code to use the new data structs.
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      4e1779ba
    • PM QoS: Minor clean-ups · cc749986
      Committed by Jean Pihet
       - Misc fixes to improve code readability:
        * rename struct pm_qos_request_list to struct pm_qos_request,
        * rename the pm_qos_req parameter to req in internal code and
          consistently use req in the API parameters,
        * update the in-kernel API callers to the new parameter names,
        * rename fields (requests, list, node, constraints)
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Acked-by: markgross <markgross@thegnar.org>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      cc749986
    • PM QoS: Move and rename the implementation files · e8db0be1
      Committed by Jean Pihet
      The PM QoS implementation files are better named
      kernel/power/qos.c and include/linux/pm_qos.h.
      
      The PM QoS support is compiled under the CONFIG_PM option.
      Signed-off-by: Jean Pihet <j-pihet@ti.com>
      Acked-by: markgross <markgross@thegnar.org>
      Reviewed-by: Kevin Hilman <khilman@ti.com>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      e8db0be1
    • PM: Move clock-related definitions and headers to separate file · b5e8d269
      Committed by Rafael J. Wysocki
      Since the PM clock management code in drivers/base/power/clock_ops.c
      is used for both runtime PM and system suspend/hibernation, the
      definitions of data structures and headers related to it should not
      be located in include/linux/pm_runtime.h.  Move them to a separate
      header file.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      b5e8d269
    • PM / Domains: Use power.subsys_data to reduce overhead · 4605ab65
      Committed by Rafael J. Wysocki
      Currently pm_genpd_runtime_resume() has to walk the list of devices
      from the device's PM domain to find the corresponding device list
      object containing the need_restore field to check if the driver's
      .runtime_resume() callback should be executed for the device.
      This is suboptimal and can be simplified by using power.subsys_data
      to store device information used by the generic PM domains code.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      4605ab65