提交 · e82b8e4ea4f3dffe6e7939f90e78da675fcc450e · openanolis / cloud-kernel

19 10月, 2010 12 次提交

由 Venkatesh Pallipadi 提交于 10月 04, 2010

This patch adds IRQ_TIME_ACCOUNTING option on x86 and runtime enables it
when TSC is enabled.

This change just enables fine grained irq time accounting, isn't used yet.
Following patches use it for different purposes.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-6-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e82b8e4e

sched: Add IRQ_TIME_ACCOUNTING, finer accounting of irq time · b52bfee4

由 Venkatesh Pallipadi 提交于 10月 04, 2010

s390/powerpc/ia64 have support for CONFIG_VIRT_CPU_ACCOUNTING which does
the fine granularity accounting of user, system, hardirq, softirq times.
Adding that option on archs like x86 will be challenging however, given the
state of TSC reliability on various platforms and also the overhead it will
add in syscall entry exit.

Instead, add a lighter variant that only does finer accounting of
hardirq and softirq times, providing precise irq times (instead of timer tick
based samples). This accounting is added with a new config option
CONFIG_IRQ_TIME_ACCOUNTING so that there won't be any overhead for users not
interested in paying the perf penalty.

This accounting is based on sched_clock, with the code being generic.
So, other archs may find it useful as well.

This patch just adds the core logic and does not enable this logic yet.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-5-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b52bfee4

sched: Add a PF flag for ksoftirqd identification · 6cdd5199

由 Venkatesh Pallipadi 提交于 10月 04, 2010

To account softirq time cleanly in scheduler, we need to identify whether
softirq is invoked in ksoftirqd context or softirq at hardirq tail context.
Add PF_KSOFTIRQD for that purpose.

As all PF flag bits are currently taken, create space by moving one of the
infrequently used bits (PF_THREAD_BOUND) down in task_struct to be along
with some other state fields.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-4-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

6cdd5199

sched: Consolidate account_system_vtime extern declaration · e1e10a26

由 Venkatesh Pallipadi 提交于 10月 04, 2010

Just a minor cleanup patch that makes things easier to the following patches.
No functionality change in this patch.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-3-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

e1e10a26

sched: Fix softirq time accounting · 75e1056f

由 Venkatesh Pallipadi 提交于 10月 04, 2010

Peter Zijlstra found a bug in the way softirq time is accounted in
VIRT_CPU_ACCOUNTING on this thread:

http://lkml.indiana.edu/hypermail//linux/kernel/1009.2/01366.html

The problem is, softirq processing uses local_bh_disable internally. There
is no way, later in the flow, to differentiate between whether softirq is
being processed or is it just that bh has been disabled. So, a hardirq when bh
is disabled results in time being wrongly accounted as softirq.

Looking at the code a bit more, the problem exists in !VIRT_CPU_ACCOUNTING
as well. As account_system_time() in normal tick based accouting also uses
softirq_count, which will be set even when not in softirq with bh disabled.

Peter also suggested solution of using 2*SOFTIRQ_OFFSET as irq count
for local_bh_{disable,enable} and using just SOFTIRQ_OFFSET while softirq
processing. The patch below does that and adds API in_serving_softirq() which
returns whether we are currently processing softirq or not.

Also changes one of the usages of softirq_count in net/sched/cls_cgroup.c
to in_serving_softirq.

Looks like many usages of in_softirq really want in_serving_softirq. Those
changes can be made individually on a case by case basis.
Signed-off-by: NVenkatesh Pallipadi <venki@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286237003-12406-2-git-send-email-venki@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

75e1056f

sched: Drop group_capacity to 1 only if local group has extra capacity · 75dd321d

由 Nikhil Rao 提交于 10月 15, 2010

When SD_PREFER_SIBLING is set on a sched domain, drop group_capacity to 1
only if the local group has extra capacity. The extra check prevents the case
where you always pull from the heaviest group when it is already under-utilized
(possible with a large weight task outweighs the tasks on the system).

For example, consider a 16-cpu quad-core quad-socket machine with MC and NUMA
scheduling domains. Let's say we spawn 15 nice0 tasks and one nice-15 task,
and each task is running on one core. In this case, we observe the following
events when balancing at the NUMA domain:

- find_busiest_group() will always pick the sched group containing the niced
  task to be the busiest group.
- find_busiest_queue() will then always pick one of the cpus running the
  nice0 task (never picks the cpu with the nice -15 task since
  weighted_cpuload > imbalance).
- The load balancer fails to migrate the task since it is the running task
  and increments sd->nr_balance_failed.
- It repeats the above steps a few more times until sd->nr_balance_failed > 5,
  at which point it kicks off the active load balancer, wakes up the migration
  thread and kicks the nice 0 task off the cpu.

The load balancer doesn't stop until we kick out all nice 0 tasks from
the sched group, leaving you with 3 idle cpus and one cpu running the
nice -15 task.

When balancing at the NUMA domain, we drop sgs.group_capacity to 1 if the child
domain (in this case MC) has SD_PREFER_SIBLING set.  Subsequent load checks are
not relevant because the niced task has a very large weight.

In this patch, we add an extra condition to the "if(prefer_sibling)" check in
update_sd_lb_stats(). We drop the capacity of a group only if the local group
has extra capacity, ie. nr_running < group_capacity. This patch preserves the
original intent of the prefer_siblings check (to spread tasks across the system
in low utilization scenarios) and fixes the case above.

It helps in the following ways:
- In low utilization cases (where nr_tasks << nr_cpus), we still drop
  group_capacity down to 1 if we prefer siblings.
- On very busy systems (where nr_tasks >> nr_cpus), sgs.nr_running will most
  likely be > sgs.group_capacity.
- When balancing large weight tasks, if the local group does not have extra
  capacity, we do not pick the group with the niced task as the busiest group.
  This prevents failed balances, active migration and the under-utilization
  described above.
Signed-off-by: NNikhil Rao <ncrao@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1287173550-30365-5-git-send-email-ncrao@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

75dd321d

sched: Force balancing on newidle balance if local group has capacity · fab47622

由 Nikhil Rao 提交于 10月 15, 2010

This patch forces a load balance on a newly idle cpu when the local group has
extra capacity and the busiest group does not have any. It improves system
utilization when balancing tasks with a large weight differential.

Under certain situations, such as a niced down task (i.e. nice = -15) in the
presence of nr_cpus NICE0 tasks, the niced task lands on a sched group and
kicks away other tasks because of its large weight. This leads to sub-optimal
utilization of the machine. Even though the sched group has capacity, it does
not pull tasks because sds.this_load >> sds.max_load, and f_b_g() returns NULL.

With this patch, if the local group has extra capacity, we shortcut the checks
in f_b_g() and try to pull a task over. A sched group has extra capacity if the
group capacity is greater than the number of running tasks in that group.

Thanks to Mike Galbraith for discussions leading to this patch and for the
insight to reuse SD_NEWIDLE_BALANCE.
Signed-off-by: NNikhil Rao <ncrao@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1287173550-30365-4-git-send-email-ncrao@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

fab47622

sched: Set group_imb only a task can be pulled from the busiest cpu · 2582f0eb

由 Nikhil Rao 提交于 10月 13, 2010

When cycling through sched groups to determine the busiest group, set
group_imb only if the busiest cpu has more than 1 runnable task. This patch
fixes the case where two cpus in a group have one runnable task each, but there
is a large weight differential between these two tasks. The load balancer is
unable to migrate any task from this group, and hence do not consider this
group to be imbalanced.
Signed-off-by: NNikhil Rao <ncrao@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286996978-7007-3-git-send-email-ncrao@google.com>
[ small code readability edits ]
Signed-off-by: NIngo Molnar <mingo@elte.hu>

2582f0eb

sched: Do not consider SCHED_IDLE tasks to be cache hot · ef8002f6

由 Nikhil Rao 提交于 10月 13, 2010

This patch adds a check in task_hot to return if the task has SCHED_IDLE
policy. SCHED_IDLE tasks have very low weight, and when run with regular
workloads, are typically scheduled many milliseconds apart. There is no
need to consider these tasks hot for load balancing.
Signed-off-by: NNikhil Rao <ncrao@google.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1287173550-30365-2-git-send-email-ncrao@google.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ef8002f6

sched: Drop all load weight manipulation for RT tasks · 17bdcf94

由 Linus Walleij 提交于 10月 11, 2010

Load weights are for the CFS, they do not belong in the RT task. This makes all
RT scheduling classes leave the CFS weights alone.

This fixes a real bug as well: I noticed the following phonomena: a process
elevated to SCHED_RR forks with SCHED_RESET_ON_FORK set, and the child is
indeed SCHED_OTHER, and the niceval is indeed reset to 0. However the weight
inserted by set_load_weight() remains at 0, giving the task insignificat
priority.

With this fix, the weight is reset to what the task had before being elevated
to SCHED_RR/SCHED_FIFO.

Cc: Lennart Poettering <lennart@poettering.net>
Cc: stable@kernel.org
Signed-off-by: NLinus Walleij <linus.walleij@stericsson.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1286807811-10568-1-git-send-email-linus.walleij@stericsson.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

17bdcf94

sched: Create special class for stop/migrate work · 34f971f6

由 Peter Zijlstra 提交于 9月 22, 2010

In order to separate the stop/migrate work thread from the SCHED_FIFO
implementation, create a special class for it that is of higher priority than
SCHED_FIFO itself.

This currently solves a problem where cpu-hotplug consumes so much cpu-time
that the SCHED_FIFO class gets throttled, but has the bandwidth replenishment
timer pending on the now dead cpu.

It is also required for when we add the planned deadline scheduling class above
SCHED_FIFO, as the stop/migrate thread still needs to transcent those tasks.
Tested-by: NHeiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1285165776.2275.1022.camel@laptop>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

34f971f6

sched: Unindent labels · 49246274

由 Peter Zijlstra 提交于 10月 17, 2010

Labels should be on column 0.
Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

49246274

14 10月, 2010 11 次提交

sched: Comment updates: fix default latency and granularity numbers · 864616ee

由 Takuya Yoshikawa 提交于 10月 14, 2010

Targeted preemption latency and minimal preemption granularity
for CPU-bound tasks have been changed.

This patch updates the comments about these values.
Signed-off-by: NTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20101014160913.eb24fef4.yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

864616ee

Merge branch 'linus' into sched/core · ed859ed3

由 Ingo Molnar 提交于 10月 14, 2010

Merge reason: update from -rc5 to -almost-final
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ed859ed3

L
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx · 53eeb64e
由 Linus Torvalds 提交于 10月 13, 2010
```
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  ioat2: fix performance regression
```
53eeb64e
L
Merge branch 'for-2.6.36' of git://linux-nfs.org/~bfields/linux · 8c35bf36
由 Linus Torvalds 提交于 10月 13, 2010
```
* 'for-2.6.36' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink
```
8c35bf36

Merge branch 'perf-fixes-for-linus' of... · fec896e2

由 Linus Torvalds 提交于 10月 13, 2010

Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ring-buffer: Fix typo of time extends per page
  perf, MIPS: Support cross compiling of tools/perf for MIPS
  perf: Fix incorrect copy_from_user() usage

fec896e2

Merge master.kernel.org:/home/rmk/linux-2.6-arm · d94bc4fc

由 Linus Torvalds 提交于 10月 13, 2010

* master.kernel.org:/home/rmk/linux-2.6-arm:
  ARM: relax ioremap prohibition (309caa9c) for -final and -stable
  ARM: 6440/1: ep93xx: DMA: fix channel_disable
  cpuimx27: fix i2c bus selection
  cpuimx27: fix compile when ULPI is selected
  ARM: 6435/1: Fix HWCAP_TLS flag for ARM11MPCore/Cortex-A9
  ARM: 6436/1: AT91: Fix power-saving in idle-mode on 926T processors
  ARM: fix section mismatch warnings in Versatile Express
  ARM: 6412/1: kprobes-decode: add support for MOVW instruction
  ARM: 6419/1: mmu: Fix MT_MEMORY and MT_MEMORY_NONCACHED pte flags
  ARM: 6416/1: errata: faulty hazard checking in the Store Buffer may lead to data corruption

d94bc4fc

Merge branch 'omap-fixes-for-linus' of... · 70813196

由 Linus Torvalds 提交于 10月 13, 2010

Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6

* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
  omap: iommu-load cam register before flushing the entry

70813196

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 · a56f31a0

由 Linus Torvalds 提交于 10月 13, 2010

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: Silent spurious error message
  drm/radeon/kms: fix bad cast/shift in evergreen.c
  drm/radeon/kms: make TV/DFP table info less verbose
  drm/radeon/kms: leave certain CP int bits enabled
  drm/radeon/kms: avoid corner case issue with unmappable vram V2

a56f31a0

Merge branch 'x86-fixes-for-linus' of... · 509d4486

由 Linus Torvalds 提交于 10月 13, 2010

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, numa: For each node, register the memory blocks actually used
  x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order
  x86, mce, therm_throt.c: Fix missing curly braces in error handling logic

509d4486

ioat2: fix performance regression · c50a898f

由 Dan Williams 提交于 10月 13, 2010

Commit 07934481 "DMAENGINE: generic channel status v2" changed the interface for
how dma channel progress is retrieved.  It inadvertently exported an internal
helper function ioat_tx_status() instead of ioat_dma_tx_status().  The latter
polls the hardware to get the latest completion state, while the helper just
evaluates the current state without touching hardware.  The effect is that we
end up waiting for completion timeouts or descriptor allocation errors before
the completion state is updated.

iperf (before fix):
[SUM]  0.0-41.3 sec   364 MBytes  73.9 Mbits/sec

iperf (after fix):
[SUM]  0.0- 4.5 sec   499 MBytes   940 Mbits/sec

This is a regression starting with 2.6.35.

Cc: <stable@kernel.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Linus Walleij <linus.walleij@stericsson.com>
Cc: Maciej Sosnowski <maciej.sosnowski@intel.com>
Reported-by: NRichard Scobie <richard@sauce.co.nz>
Signed-off-by: NDan Williams <dan.j.williams@intel.com>

c50a898f

nfsd: fix BUG at fs/nfsd/nfsfh.h:199 on unlink · b1e86db1

由 J. Bruce Fields 提交于 10月 13, 2010

As of commit 43a9aa64 "NFSD:
Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR", we sometimes call
fh_unlock on a filehandle that isn't fully initialized.

We should fix up the callers, but as a quick fix it is also sufficient
just to remove this assertion.
Reported-by: NMarius Tolzmann <tolzmann@molgen.mpg.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: NJ. Bruce Fields <bfields@redhat.com>

b1e86db1

13 10月, 2010 5 次提交

ARM: relax ioremap prohibition () for -final and -stable · 06c10884

由 Russell King 提交于 10月 13, 2010

... but produce a big warning about the problem as encouragement
for people to fix their drivers.
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

06c10884

R

Merge branch 'for-rmk' of git://git.pengutronix.de/git/imx/linux-2.6 · 841f48a8
由 Russell King 提交于 10月 12, 2010

841f48a8

ARM: 6440/1: ep93xx: DMA: fix channel_disable · 10d48b39

由 Mika Westerberg 提交于 10月 12, 2010

When channel_disable() is called, it disables per channel interrupts and
waits until channels state becomes STATE_STALL, and then disables the
channel. Now, if the DMA transfer is disabled while the channel is in
STATE_NEXT we will not wait anything and disable the channel immediately.
This seems to cause weird data corruption for example in audio transfers.

Fix is to wait while we are in STATE_NEXT or STATE_ON and only then
disable the channel.
Signed-off-by: NMika Westerberg <mika.westerberg@iki.fi>
Acked-by: NRyan Mallon <ryan@bluewatersys.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

10d48b39

Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 0acc1b2a

由 Linus Torvalds 提交于 10月 12, 2010

* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Move TSC reset out of vmcb_init
  KVM: x86: Fix SVM VMCB reset

0acc1b2a

ring-buffer: Fix typo of time extends per page · d0134324

由 Steven Rostedt 提交于 10月 12, 2010

Time stamps for the ring buffer are created by the difference between
two events. Each page of the ring buffer holds a full 64 bit timestamp.
Each event has a 27 bit delta stamp from the last event. The unit of time
is nanoseconds, so 27 bits can hold ~134 milliseconds. If two events
happen more than 134 milliseconds apart, a time extend is inserted
to add more bits for the delta. The time extend has 59 bits, which
is good for ~18 years.

Currently the time extend is committed separately from the event.
If an event is discarded before it is committed, due to filtering,
the time extend still exists. If all events are being filtered, then
after ~134 milliseconds a new time extend will be added to the buffer.

This can only happen till the end of the page. Since each page holds
a full timestamp, there is no reason to add a time extend to the
beginning of a page. Time extends can only fill a page that has actual
data at the beginning, so there is no fear that time extends will fill
more than a page without any data.

When reading an event, a loop is made to skip over time extends
since they are only used to maintain the time stamp and are never
given to the caller. As a paranoid check to prevent the loop running
forever, with the knowledge that time extends may only fill a page,
a check is made that tests the iteration of the loop, and if the
iteration is more than the number of time extends that can fit in a page
a warning is printed and the ring buffer is disabled (all of ftrace
is also disabled with it).

There is another event type that is called a TIMESTAMP which can
hold 64 bits of data in the theoretical case that two events happen
18 years apart. This code has not been implemented, but the name
of this event exists, as well as the structure for it. The
size of a TIMESTAMP is 16 bytes, where as a time extend is only
8 bytes. The macro used to calculate how many time extends can fit on
a page used the TIMESTAMP size instead of the time extend size
cutting the amount in half.

The following test case can easily trigger the warning since we only
need to have half the page filled with time extends to trigger the
warning:

 # cd /sys/kernel/debug/tracing/
 # echo function > current_tracer
 # echo 'common_pid < 0' > events/ftrace/function/filter
 # echo > trace
 # echo 1 > trace_marker
 # sleep 120
 # cat trace

Enabling the function tracer and then setting the filter to only trace
functions where the process id is negative (no events), then clearing
the trace buffer to ensure that we have nothing in the buffer,
then write to trace_marker to add an event to the beginning of a page,
sleep for 2 minutes (only 35 seconds is probably needed, but this
guarantees the bug), and then finally reading the trace which will
trigger the bug.

This patch fixes the typo and prevents the false positive of that warning.
Reported-by: NHans J. Koch <hjk@linutronix.de>
Tested-by: NHans J. Koch <hjk@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: NSteven Rostedt <rostedt@goodmis.org>

d0134324

12 10月, 2010 12 次提交

perf, MIPS: Support cross compiling of tools/perf for MIPS · c1e028ef

由 Deng-Cheng Zhu 提交于 10月 12, 2010

Changes:
 v4: Fix the cosmetic issue of redundant dot-ops
 v3: Change rmb() to use SYNC
 v2: Include mips unistd.h and define rmb()/cpu_relax() in tools/perf/perf.h
Signed-off-by: NDeng-Cheng Zhu <dengcheng.zhu@gmail.com>
Acked-by: NRalf Baechle <ralf@linux-mips.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c1e028ef

drm/radeon/kms: Silent spurious error message · a8c051f0

由 Jean Delvare 提交于 10月 08, 2010

I see the following error message in my kernel log from time to time:
radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait
radeon 0000:07:00.0: ffff88007c334000 reserve failed for wait

After investigation, it turns out that there's nothing to be afraid of
and everything works as intended. So remove the spurious log message.
Signed-off-by: NJean Delvare <khali@linux-fr.org>
Reviewed-by: NJerome Glisse <jglisse@redhat.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

a8c051f0

drm/radeon/kms: fix bad cast/shift in evergreen.c · d31dba58

由 Alex Deucher 提交于 10月 11, 2010

Missing parens.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=30718Reported-by: NDave Gilbert <freedesktop@treblig.org>
Signed-off-by: NAlex Deucher <alexdeucher@gmail.com>
Reviewed-by: NMatt Turner <mattst88@gmail.com>
Cc: stable@kernel.org
Signed-off-by: NDave Airlie <airlied@redhat.com>

d31dba58

drm/radeon/kms: make TV/DFP table info less verbose · 40f76d81

由 Alex Deucher 提交于 10月 07, 2010

Make TV standard and DFP table revisions debug only.
Signed-off-by: NAlex Deucher <alexdeucher@gmail.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

40f76d81

drm/radeon/kms: leave certain CP int bits enabled · 3555e53b

由 Alex Deucher 提交于 10月 08, 2010

These bits are used for internal communication and should
be left enabled.  This may fix s/r issues on some systems.
Signed-off-by: NAlex Deucher <alexdeucher@gmail.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

3555e53b

drm/radeon/kms: avoid corner case issue with unmappable vram V2 · c919b371

由 Jerome Glisse 提交于 8月 10, 2010

We should not allocate any object into unmappable vram if we
have no means to access them which on all GPU means having the
CP running and on newer GPU having the blit utility working.

This patch limit the vram allocation to visible vram until
we have acceleration up and running.

Note that it's more than unlikely that we run into any issue
related to that as when acceleration is not woring userspace
should allocate any object in vram beside front buffer which
should fit in visible vram.

V2 use real_vram_size as mc_vram_size could be bigger than
   the actual amount of vram

[airlied: fixup r700_cp_stop case]
Signed-off-by: NJerome Glisse <jglisse@redhat.com>
Signed-off-by: NDave Airlie <airlied@redhat.com>

c919b371

perf: Fix incorrect copy_from_user() usage · ad0cf347

由 John Blackwood 提交于 9月 28, 2010

perf events: repair incorrect use of copy_from_user

This makes the perf_event_period() return 0 instead of
-EFAULT on success.

Signed-off-by: John Blackwood<john.blackwood@ccur.com>
Signed-off-by: NJoe Korty <joe.korty@ccur.com>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100928220311.GA18145@tsunami.ccur.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

ad0cf347

fanotify: disable fanotify syscalls · 7c534773

由 Eric Paris 提交于 10月 11, 2010

This patch disables the fanotify syscalls by just not building them and
letting the cond_syscall() statements in kernel/sys_ni.c redirect them
to sys_ni_syscall().

It was pointed out by Tvrtko Ursulin that the fanotify interface did not
include an explicit prioritization between groups.  This is necessary
for fanotify to be usable for hierarchical storage management software,
as they must get first access to the file, before inotify-like notifiers
see the file.

This feature can be added in an ABI compatible way in the next release
(by using a number of bits in the flags field to carry the info) but it
was suggested by Alan that maybe we should just hold off and do it in
the next cycle, likely with an (new) explicit argument to the syscall.
I don't like this approach best as I know people are already starting to
use the current interface, but Alan is all wise and noone on list backed
me up with just using what we have.  I feel this is needlessly ripping
the rug out from under people at the last minute, but if others think it
needs to be a new argument it might be the best way forward.

Three choices:
Go with what we got (and implement the new feature next cycle).  Add a
new field right now (and implement the new feature next cycle).  Wait
till next cycle to release the ABI (and implement the new feature next
cycle).  This is number 3.
Signed-off-by: NEric Paris <eparis@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7c534773

x86, numa: For each node, register the memory blocks actually used · 73cf624d

由 Yinghai Lu 提交于 10月 10, 2010

Russ reported SGI UV is broken recently. He said:

| The SRAT table shows that memory range is spread over two nodes.
|
| SRAT: Node 0 PXM 0 100000000-800000000
| SRAT: Node 1 PXM 1 800000000-1000000000
| SRAT: Node 0 PXM 0 1000000000-1080000000
|
|Previously, the kernel early_node_map[] would show three entries
|with the proper node.
|
|[    0.000000]     0: 0x00100000 -> 0x00800000
|[    0.000000]     1: 0x00800000 -> 0x01000000
|[    0.000000]     0: 0x01000000 -> 0x01080000
|
|The problem is recent community kernel early_node_map[] shows
|only two entries with the node 0 entry overlapping the node 1
|entry.
|
|    0: 0x00100000 -> 0x01080000
|    1: 0x00800000 -> 0x01000000

After looking at the changelog, Found out that it has been broken for a while by
following commit

|commit 8716273c
|Author: David Rientjes <rientjes@google.com>
|Date:   Fri Sep 25 15:20:04 2009 -0700
|
|    x86: Export srat physical topology

Before that commit, register_active_regions() is called for every SRAT memory
entry right away.

Use nodememblk_range[] instead of nodes[] in order to make sure we
capture the actual memory blocks registered with each node.  nodes[]
contains an extended range which spans all memory regions associated
with a node, but that does not mean that all the memory in between are
included.
Reported-by: NRuss Anderson <rja@sgi.com>
Tested-by: NRuss Anderson <rja@sgi.com>
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CB27BDF.5000800@kernel.org>
Acked-by: NDavid Rientjes <rientjes@google.com>
Cc: <stable@kernel.org> 2.6.33 .34 .35 .36
Signed-off-by: NH. Peter Anvin <hpa@linux.intel.com>

73cf624d

Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 · 29979aa8

由 Linus Torvalds 提交于 10月 11, 2010

* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: fix oldnoconfig to do the right thing
  kconfig: Temporarily disable dependency warnings
  kconfig: delay symbol direct dependency initialization

29979aa8

Merge branch 'for_linus' of... · 50c6dc9e

由 Linus Torvalds 提交于 10月 11, 2010

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  IPS driver: Fix limit clamping when reducing CPU power
  [PATCH 2/2] IPS driver: disable CPU turbo
  IPS driver: apply BIOS provided CPU limit if different from default
  intel_ips -- ensure we do not enable gpu turbo mode without driver linkage
  intel_ips: Print MCP limit exceeded values.
  IPS driver: verify BIOS provided limits
  IPS driver: don't toggle CPU turbo on unsupported CPUs
  NULL pointer might be used in ips_monitor()
  Release symbol on error-handling path of ips_get_i915_syms()
  old_cpu_power is wrongly divided by 65535 in ips_monitor()
  seqno mask of THM_ITV register is 16bit

50c6dc9e

L
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 6af0b78c
由 Linus Torvalds 提交于 10月 11, 2010
```
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - fix pressure in Cintiq 21UX2
```
6af0b78c

openanolis / cloud-kernel 接近 2 年 前同步成功

openanolis / cloud-kernel
接近 2 年前同步成功