提交 · d706c1b050274b3bf97d7cb0542c0d070c9ccb8b · gsplhtlxg / clone-Linux

29 4月, 2010 4 次提交

driver-core: Add device node pointer to struct device · d706c1b0

由 Grant Likely 提交于 4月 13, 2010

Currently, platforms using CONFIG_OF add a 'struct device_node *of_node'
to dev->archdata.  However, with CONFIG_OF becoming generic for all
architectures, it makes sense for commonality to move it out of archdata
and into struct device proper.

This patch adds a struct device_node *of_node member to struct device
and updates all locations which currently write the device_node pointer
into archdata to also update dev->of_node.  Subsequent patches will
modify callers to use the archdata location and ultimately remove
the archdata member entirely.
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>
Acked-by: NGreg Kroah-Hartman <gregkh@suse.de>
CC: Michal Simek <monstr@monstr.eu>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: "David S. Miller" <davem@davemloft.net>
CC: Stephen Rothwell <sfr@canb.auug.org.au>
CC: Jeremy Kerr <jeremy.kerr@canonical.com>
CC: microblaze-uclinux@itee.uq.edu.au
CC: linux-kernel@vger.kernel.org
CC: linuxppc-dev@ozlabs.org
CC: sparclinux@vger.kernel.org

d706c1b0

of: protect contents of of_platform.h and of_device.h · efb2e014

由 Grant Likely 提交于 4月 13, 2010

Only process contents of of_platform.h and of_device.h if
CONFIG_OF_DEVICE is set.
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

efb2e014

of/flattree: Make unflatten_device_tree() safe to call from any arch · 8bfe9b5c

由 Grant Likely 提交于 4月 13, 2010

This patch makes unflatten_device_tree() safe to call from any arch
setup code with the following changes:
- Make sure initial_boot_params actually points to a device tree blob
  before unflattening
- Make sure the initial_boot_params->magic field is correct
- If CONFIG_OF_FLATTREE is not set, then make unflatten_device_tree()
  an empty static inline function.

This patch also adds some additional debug output to the top of
unflatten_device_tree().
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

8bfe9b5c

of/flattree: make of_fdt.h safe to unconditionally include. · cf32eb89

由 Grant Likely 提交于 4月 13, 2010

If CONFIG_OF_FLATTREE is not set, then don't process the body of
linux/of_fdt.h
Signed-off-by: NGrant Likely <grant.likely@secretlab.ca>

cf32eb89

19 4月, 2010 1 次提交

rcu: Make RCU lockdep check the lockdep_recursion variable · bc293d62

由 Paul E. McKenney 提交于 4月 15, 2010

The lockdep facility temporarily disables lockdep checking by
incrementing the current->lockdep_recursion variable.  Such
disabling happens in NMIs and in other situations where lockdep
might expect to recurse on itself.

This patch therefore checks current->lockdep_recursion, disabling RCU
lockdep splats when this variable is non-zero.  In addition, this patch
removes the "likely()", as suggested by Lai Jiangshan.
Reported-by: NFrederic Weisbecker <fweisbec@gmail.com>
Reported-by: NDavid Miller <davem@davemloft.net>
Tested-by: NFrederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: eric.dumazet@gmail.com
LKML-Reference: <20100415195039.GA22623@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

bc293d62

15 4月, 2010 1 次提交

firewire: cdev: change license of exported header files to MIT license · 19b3eecc

由 Stefan Richter 提交于 4月 11, 2010

Among else, this allows projects like libdc1394 to carry copies of the
ABI related header files without them or distributors having to worry
about effects on the project's overall license terms.  Switch to MIT
license as suggested by Kristian.  Also update the year in the
copyright statement according to source history.

Cc: Jay Fenlason <fenlason@redhat.com>
Acked-by: NClemens Ladisch <clemens@ladisch.de>
Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: NKristian Høgsberg <krh@bitplanet.net>

19b3eecc

14 4月, 2010 2 次提交

rcu: Better explain the condition parameter of rcu_dereference_check() · c08c68dd

由 David Howells 提交于 4月 09, 2010

Better explain the condition parameter of
rcu_dereference_check() that describes the conditions under
which the dereference is permitted to take place (and
incorporate Yong Zhang's suggestion).  This condition is only
checked under lockdep proving.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: eric.dumazet@gmail.com
LKML-Reference: <1270852752-25278-2-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

c08c68dd

rcu: Add rcu_access_pointer and rcu_dereference_protected · b62730ba

由 Paul E. McKenney 提交于 4月 09, 2010

This patch adds variants of rcu_dereference() that handle
situations where the RCU-protected data structure cannot change,
perhaps due to our holding the update-side lock, or where the
RCU-protected pointer is only to be fetched, not dereferenced.
These are needed due to some performance concerns with using
rcu_dereference() where it is not required, aside from the need
for lockdep/sparse checking.

The new rcu_access_pointer() primitive is for the case where the
pointer is be fetch and not dereferenced.  This primitive may be
used without protection, RCU or otherwise, due to the fact that
it uses ACCESS_ONCE().

The new rcu_dereference_protected() primitive is for the case
where updates are prevented, for example, due to holding the
update-side lock.  This primitive does neither ACCESS_ONCE() nor
smp_read_barrier_depends(), so can only be used when updates are
somehow prevented.
Suggested-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: eric.dumazet@gmail.com
LKML-Reference: <1270852752-25278-1-git-send-email-paulmck@linux.vnet.ibm.com>
Signed-off-by: NIngo Molnar <mingo@elte.hu>

b62730ba

12 4月, 2010 1 次提交

NFSv4: fix delegated locking · 0df5dd4a

由 Trond Myklebust 提交于 4月 11, 2010

Arnaud Giersch reports that NFSv4 locking is broken when we hold a
delegation since commit 8e469ebd (NFSv4:
Don't allow posix locking against servers that don't support it).

According to Arnaud, the lock succeeds the first time he opens the file
(since we cannot do a delegated open) but then fails after we start using
delegated opens.

The following patch fixes it by ensuring that locking behaviour is
governed by a per-filesystem capability flag that is initially set, but
gets cleared if the server ever returns an OPEN without the
NFS4_OPEN_RESULT_LOCKTYPE_POSIX flag being set.
Reported-by: NArnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr>
Signed-off-by: NTrond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org

0df5dd4a

10 4月, 2010 4 次提交

S
firewire: cdev: comment fixlet · ca658b1e
由 Stefan Richter 提交于 4月 10, 2010
```
Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>
```
ca658b1e

firewire: cdev: iso packet documentation · aa6fec3c

由 Clemens Ladisch 提交于 3月 31, 2010

Add the missing documentation for iso packets.
Signed-off-by: NClemens Ladisch <clemens@ladisch.de>
Signed-off-by: NStefan Richter <stefanr@s5r6.in-berlin.de>

aa6fec3c

radix_tree_tag_get() is not as safe as the docs make out [ver #2] · ce82653d

由 David Howells 提交于 4月 06, 2010

radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
or radix_tree_tag_clear().  The problem is that the double tag_get() in
radix_tree_tag_get():

		if (!tag_get(node, tag, offset))
			saw_unset_tag = 1;
		if (height == 1) {
			int ret = tag_get(node, tag, offset);

may see the value change due to the action of set/clear.  RCU is no protection
against this as no pointers are being changed, no nodes are being replaced
according to a COW protocol - set/clear alter the node directly.

The documentation in linux/radix-tree.h, however, says that
radix_tree_tag_get() is an exception to the rule that "any function modifying
the tree or tags (...) must exclude other modifications, and exclude any
functions reading the tree".

The problem is that the next statement in radix_tree_tag_get() checks that the
tag doesn't vary over time:

			BUG_ON(ret && saw_unset_tag);

This has been seen happening in FS-Cache:

	https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html

To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various
comments that the value of the tag may change whilst the RCU read lock is held,
and thus that the return value of radix_tree_tag_get() may not be relied upon
unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
running concurrently with it.
Reported-by: NRomain DEGEZ <romain.degez@smartjog.com>
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Acked-by: NNick Piggin <npiggin@suse.de>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ce82653d

slab: Generify kernel pointer validation · fc1c1833

由 Pekka Enberg 提交于 4月 07, 2010

As suggested by Linus, introduce a kern_ptr_validate() helper that does some
sanity checks to make sure a pointer is a valid kernel pointer.  This is a
preparational step for fixing SLUB kmem_ptr_validate().

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: NPekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

fc1c1833

09 4月, 2010 1 次提交

libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2) · 45c4d015

由 Mark Lord 提交于 4月 07, 2010

Most drives from Seagate, Hitachi, and possibly other brands,
do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1).
So instead use LBA48 for such accesses.

This bug could bite a lot of systems, especially when the user has
taken care to align partitions to 4KB boundaries. On misaligned systems,
it is less likely to be encountered, since a 4KB read would end at
0x10000000 rather than at 0x0fffffff.
Signed-off-by: NMark Lord <mlord@pobox.com>
Signed-off-by: NJeff Garzik <jgarzik@redhat.com>

45c4d015

08 4月, 2010 1 次提交

virtio: disable multiport console support. · b7a41301

由 Michael S. Tsirkin 提交于 3月 31, 2010

Move MULTIPORT feature and related config changes
out of exported headers, and disable the feature
at runtime.

At this point, it seems less risky to keep code around
until we can enable it than rip it out completely.
Signed-off-by: NMichael S. Tsirkin <mst@redhat.com>
Signed-off-by: NRusty Russell <rusty@rustcorp.com.au>

b7a41301

07 4月, 2010 7 次提交

memcg: fix race in file_mapped accounting · 8725d541

由 KAMEZAWA Hiroyuki 提交于 4月 06, 2010

Presently, memcg's FILE_MAPPED accounting has following race with
move_account (happens at rmdir()).

    increment page->mapcount (rmap.c)
    mem_cgroup_update_file_mapped()           move_account()
					      lock_page_cgroup()
					      check page_mapped() if
					      page_mapped(page)>1 {
						FILE_MAPPED -1 from old memcg
						FILE_MAPPED +1 to old memcg
					      }
					      .....
					      overwrite pc->mem_cgroup
					      unlock_page_cgroup()
    lock_page_cgroup()
    FILE_MAPPED + 1 to pc->mem_cgroup
    unlock_page_cgroup()

Then,
	old memcg (-1 file mapped)
	new memcg (+2 file mapped)

This happens because move_account see page_mapped() which is not guarded
by lock_page_cgroup().  This patch adds FILE_MAPPED flag to page_cgroup
and move account information based on it.  Now, all checks are synchronous
with lock_page_cgroup().
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: NBalbir Singh <balbir@in.ibm.com>
Reviewed-by: NDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Andrea Righi <arighi@develer.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8725d541

pagemap: fix pfn calculation for hugepage · 116354d1

由 Naoya Horiguchi 提交于 4月 06, 2010

When we look into pagemap using page-types with option -p, the value of
pfn for hugepages looks wrong (see below.) This is because pte was
evaluated only once for one vma although it should be updated for each
hugepage.  This patch fixes it.

  $ page-types -p 3277 -Nl -b huge
  voffset   offset  len     flags
  7f21e8a00 11e400  1       ___U___________H_G________________
  7f21e8a01 11e401  1ff     ________________TG________________
               ^^^
  7f21e8c00 11e400  1       ___U___________H_G________________
  7f21e8c01 11e401  1ff     ________________TG________________
               ^^^

One hugepage contains 1 head page and 511 tail pages in x86_64 and each
two lines represent each hugepage.  Voffset and offset mean virtual
address and physical address in the page unit, respectively.  The
different hugepages should not have the same offset value.

With this patch applied:

  $ page-types -p 3386 -Nl -b huge
  voffset   offset   len    flags
  7fec7a600 112c00   1      ___UD__________H_G________________
  7fec7a601 112c01   1ff    ________________TG________________
               ^^^
  7fec7a800 113200   1      ___UD__________H_G________________
  7fec7a801 113201   1ff    ________________TG________________
               ^^^
               OK

More info:

- This patch modifies walk_page_range()'s hugepage walker.  But the
  change only affects pagemap_read(), which is the only caller of hugepage
  callback.

- Without this patch, hugetlb_entry() callback is called per vma, that
  doesn't match the natural expectation from its name.

- With this patch, hugetlb_entry() is called per hugepte entry and the
  callback can become much simpler.
Signed-off-by: NNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: NKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: NMatt Mackall <mpm@selenic.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

116354d1

kernel.h: fix wrong usage of __ratelimit() · bb1dc0ba

由 Yong Zhang 提交于 4月 06, 2010

When __ratelimit() returns 1 this means that we can go ahead.
Signed-off-by: NYong Zhang <yong.zhang@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

bb1dc0ba

vfs: rename block_fsync() to blkdev_fsync() · b1dd3b28

由 Andrew Morton 提交于 4月 06, 2010

Requested by hch, for consistency now it is exported.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Anton Blanchard <anton@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b1dd3b28

raw: fsync method is now required · 55ab3a1f

由 Anton Blanchard 提交于 4月 06, 2010

Commit 148f948b (vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke
the raw driver.

We now call through generic_file_aio_write -> generic_write_sync ->
vfs_fsync_range.  vfs_fsync_range has:

        if (!fop || !fop->fsync) {
                ret = -EINVAL;
                goto out;
        }

But drivers/char/raw.c doesn't set an fsync method.

We have two options: fix it or remove the raw driver completely.  I'm
happy to do either, the fact this has been broken for so long suggests it
is rarely used.

The patch below adds an fsync method to the raw driver.  My knowledge of
the block layer is pretty sketchy so this could do with a once over.

If we instead decide to remove the raw driver, this patch might still be
useful as a backport to 2.6.33 and 2.6.32.
Signed-off-by: NAnton Blanchard <anton@samba.org>
Reviewed-by: NJan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: NJeff Moyer <jmoyer@redhat.com>
Tested-by: NJeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

55ab3a1f

include/linux/kfifo.h: fix INIT_KFIFO() · 530cd330

由 David Härdeman 提交于 4月 06, 2010

DECLARE_KFIFO creates a union with a struct kfifo and a buffer array with
size [size + sizeof(struct kfifo)].

INIT_KFIFO then sets the buffer pointer in struct kfifo to point to the
beginning of the buffer array which means that the first call to kfifo_in
will overwrite members of the struct kfifo.
Signed-off-by: NDavid Härdeman <david@hardeman.nu>
Acked-by: NStefani Seibold <stefani@seibold.net>
Cc: <stable@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

530cd330

bitops: remove temporary for_each_bit() · b01d0942

由 Andrew Morton 提交于 4月 06, 2010

Migration has been completed so remove this now.  There's one straggler in
linux-next's drivers/mtd/sm_ftl.c.  A patch has been sent.

Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

b01d0942

06 4月, 2010 3 次提交

libata: unlock HPA if device shrunk · 445d211b

由 Tejun Heo 提交于 4月 05, 2010

Some BIOSes don't configure HPA during boot but do so while resuming.
This causes harddrives to shrink during resume making libata detach
and reattach them.  This can be worked around by unlocking HPA if old
size equals native size.

Add ATA_DFLAG_UNLOCK_HPA so that HPA unlocking can be controlled
per-device and update ata_dev_revalidate() such that it sets
ATA_DFLAG_UNLOCK_HPA and fails with -EIO when the above condition is
detected.

This patch fixes the following bug.

  https://bugzilla.kernel.org/show_bug.cgi?id=15396Signed-off-by: NTejun Heo <tj@kernel.org>
Reported-by: NOleksandr Yermolenko <yaa.bta@gmail.com>
Signed-off-by: NJeff Garzik <jgarzik@redhat.com>

445d211b

Input: matrix_keypad - allow platform to disable key autorepeat · 9d32c305

由 H Hartley Sweeten 提交于 4月 05, 2010

In an embedded system the matrix_keypad driver might be used to
interface with an external control panel and not an actual keyboard.
On the control panel some of the keys could be used to turn on/off
various functions.  If key autorepeat is enabled this causes the
function to quickly toggle between the on and off states and makes
operation difficult.

Add an option in the platform-specific data to disable the key
autorepeat.
Signed-off-by: NH Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: NDmitry Torokhov <dtor@mail.ru>

9d32c305

Fix up possibly racy module refcounting · 5fbfb18d

由 Nick Piggin 提交于 4月 01, 2010

Module refcounting is implemented with a per-cpu counter for speed.
However there is a race when tallying the counter where a reference may
be taken by one CPU and released by another. Reference count summation
may then see the decrement without having seen the previous increment,
leading to lower than expected count. A module which never has its
actual reference drop below 1 may return a reference count of 0 due to
this race.

Module removal generally runs under stop_machine, which prevents this
race causing bugs due to removal of in-use modules. However there are
other real bugs in module.c code and driver code (module_refcount is
exported) where the callers do not run under stop_machine.

Fix this by maintaining running per-cpu counters for the number of
module refcount increments and the number of refcount decrements. The
increments are tallied after the decrements, so any decrement seen will
always have its corresponding increment counted. The final refcount is
the difference of the total increments and decrements, preventing a
low-refcount from being returned.
Signed-off-by: NNick Piggin <npiggin@suse.de>
Acked-by: NRusty Russell <rusty@rustcorp.com.au>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

5fbfb18d

02 4月, 2010 1 次提交

ibft, x86: Change reserve_ibft_region() to find_ibft_region() · 042be38e

由 Yinghai Lu 提交于 4月 01, 2010

This allows arch code could decide the way to reserve the ibft.

And we should reserve ibft as early as possible, instead of BOOTMEM
stage, in case the table is in RAM range and is not reserved by BIOS
(this will often be the case.)

Move to just after find_smp_config().

Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore.

-v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <konrad@darnok.org>
Signed-off-by: NYinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB510FB.80601@kernel.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Jones <pjones@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad@kernel.org>
CC: Jan Beulich <jbeulich@novell.com>
Signed-off-by: NH. Peter Anvin <hpa@zytor.com>

042be38e

01 4月, 2010 2 次提交

ide: Requeue request after DMA timeout · 6072f749

由 Herbert Xu 提交于 3月 31, 2010

I noticed that my KVM virtual machines were experiencing IDE
issues resulting in processes stuck on waiting for buffers to
complete.

The root cause is of course race conditions in the ancient qemu
backend that I'm using.  However, the fact that the guest isn't
recovering is a bug.

I've tracked it down to the change made last year to dequeue
requests at the start rather than at the end in the IDE layer.

commit 8f6205cd
Author: Tejun Heo <tj@kernel.org>
Date:   Fri May 8 11:53:59 2009 +0900

    ide: dequeue in-flight request

The problem is that the function ide_dma_timeout_retry does not
requeue the current request, causing one request to be lost for
each DMA timeout.

This patch fixes this by requeueing the request.
Signed-off-by: NHerbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

6072f749

perf: Use hot regs with software sched switch/migrate events · e49a5bd3

由 Frederic Weisbecker 提交于 3月 22, 2010

Scheduler's task migration events don't work because they always
pass NULL regs perf_sw_event(). The event hence gets filtered
in perf_swevent_add().

Scheduler's context switches events use task_pt_regs() to get
the context when the event occured which is a wrong thing to
do as this won't give us the place in the kernel where we went
to sleep but the place where we left userspace. The result is
even more wrong if we switch from a kernel thread.

Use the hot regs snapshot for both events as they belong to the
non-interrupt/exception based events family. Unlike page faults
or so that provide the regs matching the exact origin of the event,
we need to save the current context.

This makes the task migration event working and fix the context
switch callchains and origin ip.

Example: perf record -a -e cs

Before:

    10.91%      ksoftirqd/0                  0  [k] 0000000000000000
                |
                --- (nil)
                    perf_callchain
                    perf_prepare_sample
                    __perf_event_overflow
                    perf_swevent_overflow
                    perf_swevent_add
                    perf_swevent_ctx_event
                    do_perf_sw_event
                    __perf_sw_event
                    perf_event_task_sched_out
                    schedule
                    run_ksoftirqd
                    kthread
                    kernel_thread_helper

After:

    23.77%  hald-addon-stor  [kernel.kallsyms]  [k] schedule
            |
            --- schedule
               |
               |--60.00%-- schedule_timeout
               |          wait_for_common
               |          wait_for_completion
               |          blk_execute_rq
               |          scsi_execute
               |          scsi_execute_req
               |          sr_test_unit_ready
               |          |
               |          |--66.67%-- sr_media_change
               |          |          media_changed
               |          |          cdrom_media_changed
               |          |          sr_block_media_changed
               |          |          check_disk_change
               |          |          cdrom_open

v2: Always build perf_arch_fetch_caller_regs() now that software
events need that too. They don't need it from modules, unlike trace
events, so we keep the EXPORT_SYMBOL in trace_event_perf.c
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>

e49a5bd3

31 3月, 2010 1 次提交

module: add stub for is_module_percpu_address · d5e50daf

由 Randy Dunlap 提交于 3月 31, 2010

Fix build for CONFIG_MODULES not enabled by providing a stub
for is_module_percpu_address().

kernel/lockdep.c:605: error: implicit declaration of function 'is_module_percpu_address'
Signed-off-by: NRandy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: NTejun Heo <tj@kernel.org>

d5e50daf

30 3月, 2010 6 次提交

percpu: don't implicitly include slab.h from percpu.h · de380b55

由 Tejun Heo 提交于 3月 24, 2010

percpu.h has always been including slab.h to get k[mz]alloc/free() for
UP inline implementation.  percpu.h being used by very low level
headers including module.h and sched.h, this meant that a lot files
unintentionally got slab.h inclusion.

Lee Schermerhorn was trying to make topology.h use percpu.h and got
bitten by this implicit inclusion.  The right thing to do is break
this ultimately unnecessary dependency.  The previous patch added
explicit inclusion of either gfp.h or slab.h to the source files using
them.  This patch updates percpu.h such that slab.h is no longer
included from percpu.h.
Signed-off-by: NTejun Heo <tj@kernel.org>
Reviewed-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

de380b55

include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6

由 Tejun Heo 提交于 3月 24, 2010

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: NTejun Heo <tj@kernel.org>
Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

5a0e3ad6

ext3: fix broken handling of EXT3_STATE_NEW · de329820

由 Linus Torvalds 提交于 3月 29, 2010

In commit 9df93939 ("ext3: Use bitops to read/modify
EXT3_I(inode)->i_state") ext3 changed its internal 'i_state' variable to
use bitops for its state handling.  However, unline the same ext4
change, it didn't actually change the name of the field when it changed
the semantics of it.

As a result, an old use of 'i_state' remained in fs/ext3/ialloc.c that
initialized the field to EXT3_STATE_NEW.  And that does not work
_at_all_ when we're now working with individually named bits rather than
values that get masked.  So the code tried to mark the state to be new,
but in actual fact set the field to EXT3_STATE_JDATA.  Which makes no
sense at all, and screws up all the code that checks whether the inode
was newly allocated.

In particular, it made the xattr code unhappy, and caused various random
behavior, like apparently

	https://bugzilla.redhat.com/show_bug.cgi?id=577911

So fix the initialization, and rename the field to match ext4 so that we
don't have this happen again.

Cc: James Morris <jmorris@namei.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Daniel J Walsh <dwalsh@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

de329820

ARM: 6003/1: removing compilation warning from pl061.h · 367d6acc

由 viresh kumar 提交于 3月 29, 2010

pl061.h is using u8 type. including <linux/types.h> in pl061.h to avoid
warning.
Signed-off-by: NViresh Kumar <viresh.kumar@st.com>
Acked-by: NBaruch Siach <baruch@tkos.co.il>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

367d6acc

ARM: 5999/1: Including device.h and resource.h header files in linux/amba/bus.h · c36207a4

由 viresh kumar 提交于 3月 29, 2010

linux/amba/bus.h have dependencies on linux/device.h and linux/resource.h, but
it doesn't include them. We get compilation errors in our files which include
bus.h but doesn't include device.h and resource.h. This patch includes device.h
and resource.h in linux/amba/bus.h file.
Signed-off-by: NViresh Kumar <viresh.kumar@st.com>
Acked-by: NLinux Walleij <linux.ml.walleij@gmail.com>
Signed-off-by: NRussell King <rmk+kernel@arm.linux.org.uk>

c36207a4

SLOW_WORK: CONFIG_SLOW_WORK_PROC should be CONFIG_SLOW_WORK_DEBUG · a53f4f9e

由 David Howells 提交于 3月 29, 2010

CONFIG_SLOW_WORK_PROC was changed to CONFIG_SLOW_WORK_DEBUG, but not in all
instances. Change the remaining instances. This makes the debugfs file
display the time mark and the owner's description again.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a53f4f9e

29 3月, 2010 2 次提交

percpu, module: implement and use is_kernel/module_percpu_address() · 10fad5e4

由 Tejun Heo 提交于 3月 10, 2010

lockdep has custom code to check whether a pointer belongs to static
percpu area which is somewhat broken.  Implement proper
is_kernel/module_percpu_address() and replace the custom code.

On UP, percpu variables are regular static variables and can't be
distinguished from them.  Always return %false on UP.
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>

10fad5e4

module: encapsulate percpu handling better and record percpu_size · 259354de

由 Tejun Heo 提交于 3月 10, 2010

Better encapsulate module static percpu area handling so that code
outsidef of CONFIG_SMP ifdef doesn't deal with mod->percpu directly
and add mod->percpu_size and record percpu_size in it.  Both percpu
fields are compiled out on UP.  While at it, mark mod->percpu w/
__percpu.

This is to prepare for is_module_percpu_address().
Signed-off-by: NTejun Heo <tj@kernel.org>
Acked-by: NRusty Russell <rusty@rustcorp.com.au>

259354de

27 3月, 2010 2 次提交

net: Add MSG_WAITFORONE flag to recvmmsg · 71c5c159

由 Brandon L Black 提交于 3月 26, 2010

Add new flag MSG_WAITFORONE for the recvmmsg() syscall.
When this flag is specified for a blocking socket, recvmmsg()
will only block until at least 1 packet is available.  The
default behavior is to block until all vlen packets are
available.  This flag has no effect on non-blocking sockets
or when used in combination with MSG_DONTWAIT.
Signed-off-by: NBrandon L Black <blblack@gmail.com>
Acked-by: NUlrich Drepper <drepper@redhat.com>
Acked-by: NEric Dumazet <eric.dumazet@gmail.com>
Acked-by: NArnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

71c5c159

Freezer: Fix buggy resume test for tasks frozen with cgroup freezer · 5a7aadfe

由 Matt Helsley 提交于 3月 26, 2010

When the cgroup freezer is used to freeze tasks we do not want to thaw
those tasks during resume. Currently we test the cgroup freezer
state of the resuming tasks to see if the cgroup is FROZEN. If so
then we don't thaw the task. However, the FREEZING state also indicates
that the task should remain frozen.

This also avoids a problem pointed out by Oren Ladaan: the freezer state
transition from FREEZING to FROZEN is updated lazily when userspace reads
or writes the freezer.state file in the cgroup filesystem. This means that
resume will thaw tasks in cgroups which should be in the FROZEN state if
there is no read/write of the freezer.state file to trigger this
transition before suspend.

NOTE: Another "simple" solution would be to always update the cgroup
freezer state during resume. However it's a bad choice for several reasons:
Updating the cgroup freezer state is somewhat expensive because it requires
walking all the tasks in the cgroup and checking if they are each frozen.
Worse, this could easily make resume run in N^2 time where N is the number
of tasks in the cgroup. Finally, updating the freezer state from this code
path requires trickier locking because of the way locks must be ordered.

Instead of updating the freezer state we rely on the fact that lazy
updates only manage the transition from FREEZING to FROZEN. We know that
a cgroup with the FREEZING state may actually be FROZEN so test for that
state too. This makes sense in the resume path even for partially-frozen
cgroups -- those that really are FREEZING but not FROZEN.
Reported-by: NOren Ladaan <orenl@cs.columbia.edu>
Signed-off-by: NMatt Helsley <matthltc@us.ibm.com>
Cc: stable@kernel.org
Signed-off-by: NRafael J. Wysocki <rjw@sisk.pl>

5a7aadfe

25 3月, 2010 1 次提交

netfilter: ip6table_raw: fix table priority · 9c138866

由 Jozsef Kadlecsik 提交于 3月 25, 2010

The order of the IPv6 raw table is currently reversed, that makes impossible
to use the NOTRACK target in IPv6: for example if someone enters

ip6tables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK

and if we receive fragmented packets then the first fragment will be
untracked and thus skip nf_ct_frag6_gather (and conntrack), while all
subsequent fragments enter nf_ct_frag6_gather and reassembly will never
successfully be finished.
Singed-off-by: NJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: NPatrick McHardy <kaber@trash.net>

9c138866