Commits · 3bcfeaf93f44112053e1c36aa681d9efc1185ddc · openanolis / cloud-kernel

21 Oct, 2011 1 commit

block: initialize the bounce pool if high memory may be added later · 3bcfeaf9

Authored by David Vrabel on Oct 20, 2011

init_emergency_pool() does not create the page pool for bouncing block
requests if the current count of high pages is zero.  If high memory
may be added later (either via memory hotplug or a balloon driver in a
virtualized system) then an oops occurs if a request with a high page
needs bouncing because the pool does not exist.

So, always create the pool if memory hotplug is enabled and change the
test so it's valid even if all high pages are currently in the balloon
(the balloon drivers adjust totalhigh_pages but not max_pfn).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3bcfeaf9
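As a rough illustration of the decision described in the commit above, the test can be modeled in plain user-space C. This is only a sketch, not the kernel code: `need_bounce_pool()` and its parameters are hypothetical names, and comparing `max_pfn` against `max_low_pfn` is an assumption made here to show a test that does not depend on the current count of high pages.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the decision; not the kernel source.
 * hotplug_enabled: memory hotplug is configured, so high memory may
 *                  be added later.
 * max_pfn, max_low_pfn: highest page frame number overall and in low
 *                  memory (comparing them is an assumption of this sketch).
 */
bool need_bounce_pool(bool hotplug_enabled,
                      unsigned long max_pfn,
                      unsigned long max_low_pfn)
{
    if (hotplug_enabled)
        return true;               /* always create the pool */
    return max_pfn > max_low_pfn;  /* high pages exist, even if ballooned out */
}
```

The point of the sketch is that neither branch looks at how many high pages are present right now, so the result stays valid when all of them sit in the balloon.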

19 Oct, 2011 11 commits

block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown · c9a929dd

Authored by Tejun Heo on Oct 19, 2011

request_queue is refcounted but actually depends on lifetime
management from the queue owner - on blk_cleanup_queue(), block layer
expects that there's no request passing through request_queue and no
new one will.

This is fundamentally broken.  The queue owner (e.g. SCSI layer)
doesn't have a way to know whether there are other active users before
calling blk_cleanup_queue() and other users (e.g. bsg) don't have any
guarantee that the queue is and would stay valid while it's holding a
reference.

With delay added in blk_queue_bio() before queue_lock is grabbed, the
following oops can be easily triggered when a device is removed with
in-flight IOs.

 sd 0:0:1:0: [sdb] Stopping disk
 ata1.01: disabled
 general protection fault: 0000 [#1] PREEMPT SMP
 CPU 2
 Modules linked in:

 Pid: 648, comm: test_rawio Not tainted 3.1.0-rc3-work+ #56 Bochs Bochs
 RIP: 0010:[<ffffffff8137d651>]  [<ffffffff8137d651>] elv_rqhash_find+0x61/0x100
 ...
 Process test_rawio (pid: 648, threadinfo ffff880019efa000, task ffff880019ef8a80)
 ...
 Call Trace:
  [<ffffffff8137d774>] elv_merge+0x84/0xe0
  [<ffffffff81385b54>] blk_queue_bio+0xf4/0x400
  [<ffffffff813838ea>] generic_make_request+0xca/0x100
  [<ffffffff81383994>] submit_bio+0x74/0x100
  [<ffffffff811c53ec>] dio_bio_submit+0xbc/0xc0
  [<ffffffff811c610e>] __blockdev_direct_IO+0x92e/0xb40
  [<ffffffff811c39f7>] blkdev_direct_IO+0x57/0x60
  [<ffffffff8113b1c5>] generic_file_aio_read+0x6d5/0x760
  [<ffffffff8118c1ca>] do_sync_read+0xda/0x120
  [<ffffffff8118ce55>] vfs_read+0xc5/0x180
  [<ffffffff8118cfaa>] sys_pread64+0x9a/0xb0
  [<ffffffff81afaf6b>] system_call_fastpath+0x16/0x1b

This happens because blk_cleanup_queue() destroys the queue and
elevator whether IOs are in progress or not and DEAD tests are
sprinkled in the request processing path without proper
synchronization.

Similar problem exists for blk-throtl.  On queue cleanup, blk-throtl
is shutdown whether it has requests in it or not.  Depending on
timing, it either oopses or throttled bios are lost putting tasks
which are waiting for bio completion into eternal D state.

The way it should work is having the usual clear distinction between
shutdown and release.  Shutdown drains all currently pending requests,
marks the queue dead, and performs partial teardown of the now
unnecessary part of the queue.  Even after shutdown is complete,
reference holders are still allowed to issue requests to the queue
although they will be immediately failed.  The rest of teardown
happens on release.

This patch makes the following changes to make blk_cleanup_queue()
behave as proper shutdown.

* QUEUE_FLAG_DEAD is now set while holding both q->exit_mutex and
  queue_lock.

* Unsynchronized DEAD check in generic_make_request_checks() removed.
  This couldn't make any meaningful difference as the queue could die
  after the check.

* blk_drain_queue() updated such that it can drain all requests and is
  now called during cleanup.

* blk_throtl updated such that it checks DEAD on grabbing queue_lock,
  drains all throttled bios during cleanup and frees td when the queue
  is released.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c9a929dd
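The shutdown/release distinction described in the commit above can be sketched with a toy refcounted queue. Everything here (`struct toy_queue`, `toy_submit()`, `toy_shutdown()`, `toy_put()`) is a hypothetical user-space model, not block-layer API, and it is single-threaded, so the locking the commit talks about is left out entirely.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy queue: all names here are hypothetical, not block-layer API. */
struct toy_queue {
    int refcount;   /* references held by the owner and other users */
    int pending;    /* requests currently in flight */
    bool dead;      /* set by shutdown; new requests fail from then on */
};

struct toy_queue *toy_queue_alloc(void)
{
    struct toy_queue *q = calloc(1, sizeof(*q));

    if (q)
        q->refcount = 1;    /* owner's base reference */
    return q;
}

/* Returns 0 on success, -1 if the queue has been shut down. */
int toy_submit(struct toy_queue *q)
{
    if (q->dead)
        return -1;          /* fail immediately; the memory is still valid */
    q->pending++;
    return 0;
}

void toy_complete(struct toy_queue *q)
{
    q->pending--;
}

/* Shutdown: refuse new requests and drain the pending ones. */
void toy_shutdown(struct toy_queue *q)
{
    q->dead = true;
    while (q->pending > 0)
        toy_complete(q);    /* stand-in for waiting on real completions */
}

/* Release: drop one reference; returns true if the queue was freed. */
bool toy_put(struct toy_queue *q)
{
    if (--q->refcount > 0)
        return false;
    free(q);
    return true;
}
```

In this model a second reference holder can still call `toy_submit()` after `toy_shutdown()` and gets a clean failure, and the memory goes away only when the last `toy_put()` runs.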

block: drop @tsk from attempt_plug_merge() and explain sync rules · bd87b589

Authored by Tejun Heo on Oct 19, 2011

attempt_plug_merge() accesses elevator without holding queue_lock and
may call into ->elevator_bio_merge_fn().  The elevator is guaranteed to
be valid because it's accessed iff the plugged list has requests and
elevator is never exited with live requests, so as long as the
elevator method can deal with unlocked access, this is safe.

Explain the sync rules around attempt_plug_merge() and drop the
unnecessary @tsk parameter.

This patch doesn't introduce any functional change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bd87b589

block: make get_request[_wait]() fail if queue is dead · da8303c6

Authored by Tejun Heo on Oct 19, 2011

Currently get_request[_wait]() allocates request whether queue is dead
or not.  This patch makes get_request[_wait]() return NULL if @q is
dead.  blk_queue_bio() is updated to fail the submitted bio if request
allocation fails.  While at it, add docbook comments for
get_request[_wait]().

Note that the current code has rather unclear (there are spurious DEAD
tests scattered around) assumption that the owner of a queue
guarantees that no request travels block layer if the queue is dead
and this patch in itself doesn't change much; however, this will allow
fixing the broken assumption in the next patch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

da8303c6

block: reorganize throtl_get_tg() and blk_throtl_bio() · bc16a4f9

Authored by Tejun Heo on Oct 19, 2011

blk_throtl_bio() and throtl_get_tg() have rather unusual interfaces.

* throtl_get_tg() returns pointer to a valid tg or ERR_PTR(-ENODEV),
  and drops queue_lock in the latter case.  Different locking context
  depending on return value is error-prone and DEAD state is scheduled
  to be protected by queue_lock anyway.  Move DEAD check inside
  queue_lock and return valid tg or NULL.

* blk_throtl_bio() indicates return status both with its return value
  and in/out param **@bio.  The former is used to indicate whether
  queue is found to be dead during throtl processing.  The latter
  whether the bio is throttled.

  There's no point in returning DEAD check result from
  blk_throtl_bio().  The queue can die after blk_throtl_bio() is
  finished but before make_request_fn() grabs queue lock.

  Make it take *@bio instead and return boolean result indicating
  whether the request is throttled or not.

This patch doesn't cause any visible functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bc16a4f9

block: reorganize queue draining · e3c78ca5

Authored by Tejun Heo on Oct 19, 2011

Reorganize queue draining related code in preparation of queue exit
changes.

* Factor out actual draining from elv_quiesce_start() to
  blk_drain_queue().

* Make elv_quiesce_start/end() responsible for their own locking.

* Replace open-coded ELVSWITCH clearing in elevator_switch() with
  elv_quiesce_end().

This patch doesn't cause any visible functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

e3c78ca5

block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg() · 315fceee

Authored by Tejun Heo on Oct 19, 2011

blk_get/put_queue() in scsi_cmd_ioctl() and throtl_get_tg() are
completely bogus.  The caller must have a reference to the queue on
entry and taking an extra reference doesn't change anything.

For scsi_cmd_ioctl(), the only effect is that it ends up checking
QUEUE_FLAG_DEAD on entry; however, this is bogus as queue can die
right after blk_get_queue().  Dead queue should be and is handled in
request issue path (it's somewhat broken now but that's a separate
problem and doesn't affect this one much).

throtl_get_tg() incorrectly assumes that q is rcu freed.  Also, it
doesn't check return value of blk_get_queue().  If the queue is
already dead, it ends up doing an extra put.

Drop them.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

315fceee

block: pass around REQ_* flags instead of broken down booleans during request alloc/free · 75eb6c37

Authored by Tejun Heo on Oct 19, 2011

blk_alloc_request() and freed_request() take different combinations of
REQ_* @flags, @priv and @is_sync when @flags is superset of the latter
two.  Make them take @flags only.  This cleans up the code a bit and
will ease updating allocation related REQ_* flags.

This patch doesn't introduce any functional difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

75eb6c37
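The idea in the commit above, passing one flags word instead of booleans that were derived from it, can be sketched as follows. The flag bits, `struct toy_counters`, and both functions are hypothetical names made up for this illustration; they are not the kernel's `REQ_*` values or the real signatures.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's REQ_* values. */
#define TOY_REQ_SYNC    (1u << 0)
#define TOY_REQ_ELVPRIV (1u << 1)

struct toy_counters {
    int count[2];   /* index 0: async requests, index 1: sync requests */
    int elvpriv;    /* requests carrying elevator-private data */
};

/* Before: callers pass booleans that were themselves derived from flags. */
void toy_freed_request_old(struct toy_counters *c, bool is_sync, bool priv)
{
    c->count[is_sync]--;
    if (priv)
        c->elvpriv--;
}

/* After: pass the flags only and derive the booleans at the point of use. */
void toy_freed_request(struct toy_counters *c, unsigned int flags)
{
    bool is_sync = (flags & TOY_REQ_SYNC) != 0;

    c->count[is_sync]--;
    if (flags & TOY_REQ_ELVPRIV)
        c->elvpriv--;
}
```

Both variants do the same bookkeeping; the second one simply leaves room for a new allocation-related flag without touching every caller.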

block: move blk_throtl prototypes to block/blk.h · bc9fcbf9

Authored by Tejun Heo on Oct 19, 2011

blk_throtl interface is block internal and there's no reason to have
them in linux/blkdev.h.  Move them to block/blk.h.

This patch doesn't introduce any functional change.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

bc9fcbf9

block: fix genhd refcounting in blkio_policy_parse_and_set() · ece84241

Authored by Tejun Heo on Oct 19, 2011

blkio_policy_parse_and_set() calls blkio_check_dev_num() to check
whether the given dev_t is valid.  blkio_check_dev_num() uses
get_gendisk() for verification but never puts the returned genhd,
leaking the reference.

This patch collapses blkio_check_dev_num() into its caller and updates
it such that the genhd is put before returning.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ece84241
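The leak pattern described in the commit above, a lookup that takes a reference which the validity check never drops, can be modeled in a few lines. `struct toy_disk` and the `toy_*` functions are hypothetical stand-ins and do not reproduce the genhd API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy refcounted object; the names are hypothetical, not genhd API. */
struct toy_disk {
    int refcount;
};

/* Lookup that takes a reference on success, NULL if there is no such entry. */
struct toy_disk *toy_get_disk(struct toy_disk *table, size_t n, size_t idx)
{
    if (idx >= n)
        return NULL;
    table[idx].refcount++;
    return &table[idx];
}

void toy_put_disk(struct toy_disk *d)
{
    d->refcount--;
}

/* Leaky pattern: the reference taken by the lookup is never dropped. */
bool toy_dev_valid_leaky(struct toy_disk *table, size_t n, size_t idx)
{
    return toy_get_disk(table, n, idx) != NULL;
}

/* Fixed pattern: drop the reference before returning. */
bool toy_dev_valid(struct toy_disk *table, size_t n, size_t idx)
{
    struct toy_disk *d = toy_get_disk(table, n, idx);

    if (!d)
        return false;
    toy_put_disk(d);
    return true;
}
```

Calling the leaky variant leaves the count one higher after every check, which is the kind of imbalance that keeps an object from ever being released.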

block: make gendisk hold a reference to its queue · 523e1d39

Authored by Tejun Heo on Oct 19, 2011

The following command sequence triggers an oops.

# mount /dev/sdb1 /mnt
# echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete
# umount /mnt

 general protection fault: 0000 [#1] PREEMPT SMP
 CPU 2
 Modules linked in:

 Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
 RIP: 0010:[<ffffffff810d0879>]  [<ffffffff810d0879>] __lock_acquire+0x389/0x1d60
...
 Call Trace:
  [<ffffffff810d2845>] lock_acquire+0x95/0x140
  [<ffffffff81aed87b>] _raw_spin_lock+0x3b/0x50
  [<ffffffff811573bc>] bdi_lock_two+0x5c/0x70
  [<ffffffff811c2f6c>] bdev_inode_switch_bdi+0x4c/0xf0
  [<ffffffff811c3fcb>] __blkdev_put+0x11b/0x1d0
  [<ffffffff811c4010>] __blkdev_put+0x160/0x1d0
  [<ffffffff811c40df>] blkdev_put+0x5f/0x190
  [<ffffffff8118f18d>] kill_block_super+0x4d/0x80
  [<ffffffff8118f4a5>] deactivate_locked_super+0x45/0x70
  [<ffffffff8119003a>] deactivate_super+0x4a/0x70
  [<ffffffff811ac4ad>] mntput_no_expire+0xed/0x130
  [<ffffffff811acf2e>] sys_umount+0x7e/0x3a0
  [<ffffffff81aeeeab>] system_call_fastpath+0x16/0x1b

This is because bdev holds on to disk but disk doesn't pin the
associated queue.  If a SCSI device is removed while the device is
still open, the sdev puts the base reference to the queue on release.
When the bdev is finally released, the associated queue is already
gone along with the bdi and bdev_inode_switch_bdi() ends up
dereferencing already freed bdi.

Even if it were not for this bug, disk not holding onto the associated
queue is very unusual and error-prone.

Fix it by making add_disk() take an extra reference to its queue and
put it on disk_release() and ensuring that disk and its fops owner are
put in that order after all accesses to the disk and queue are
complete.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

523e1d39

Merge branch 'v3.1-rc10' into for-3.2/core · 5c04b426

Authored by Jens Axboe on Oct 19, 2011

Conflicts:
	block/blk-core.c
	include/linux/blkdev.h
Signed-off-by: Jens Axboe <axboe@kernel.dk>

5c04b426

18 Oct, 2011 1 commit

Linux 3.1-rc10 · 899e3ee4

Authored by Linus Torvalds on Oct 17, 2011

899e3ee4

17 Oct, 2011 2 commits

Avoid using variable-length arrays in kernel/sys.c · a84a79e4

Authored by Linus Torvalds on Oct 17, 2011

The size is always valid, but variable-length arrays generate worse code
for no good reason (unless the function happens to be inlined and the
compiler sees the length for the simple constant it is).

Also, there seems to be some code generation problem on POWER, where
Henrik Bakken reports that register r28 can get corrupted under some
subtle circumstances (interrupt happening at the wrong time?).  That all
indicates some seriously broken compiler issues, but since variable
length arrays are bad regardless, there's little point in trying to
chase it down.

"Just don't do that, then".
Reported-by: Henrik Grindal Bakken <henribak@cisco.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a84a79e4
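A minimal sketch of the general technique in the commit above: replace a variable-length array with a fixed-size buffer plus an explicit bound check. The function name, the format string, and the 65-byte bound are assumptions chosen for this illustration and are not taken from kernel/sys.c.

```c
#include <stdio.h>
#include <string.h>

#define TOY_BUF_LEN 65   /* fixed upper bound chosen for this sketch */

/*
 * Formats "<major>.<minor>" into out, which holds len bytes.
 * Returns 0 on success, -1 if len is zero or exceeds the fixed bound.
 */
int toy_format_version(char *out, size_t len,
                       unsigned int major, unsigned int minor)
{
    char buf[TOY_BUF_LEN];   /* instead of: char buf[len]; */

    if (len == 0 || len > sizeof(buf))
        return -1;
    snprintf(buf, len, "%u.%u", major, minor);  /* writes at most len bytes */
    memcpy(out, buf, strlen(buf) + 1);          /* copy string and its NUL */
    return 0;
}
```

With the size known at compile time the compiler can lay out the stack frame statically, which is the code-generation benefit the commit message refers to.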

Merge branch 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm · 8bc03e8f

Authored by Linus Torvalds on Oct 16, 2011

* 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
  ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
  ARM: 7122/1: localtimer: add header linux/errno.h explicitly
  ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
  ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES

8bc03e8f

15 Oct, 2011 4 commits

ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS · f8be12d1

Authored by Zoltan Devai on Oct 10, 2011

This is unneeded and causes an abort on the SPMP8000 platform.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Zoltan Devai <zoss@devai.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

f8be12d1

ARM: 7122/1: localtimer: add header linux/errno.h explicitly · bb1ac3ec

Authored by Shawn Guo on Oct 06, 2011

Per the text in Documentation/SubmitChecklist as below, we should
explicitly have header linux/errno.h in localtimer.h for ENXIO
reference.

1: If you use a facility then #include the file that defines/declares
   that facility.  Don't depend on other header files pulling in ones
   that you use.

Otherwise, we may run into a compile error like the following one,
if any file includes localtimer.h without CONFIG_LOCAL_TIMERS defined.

  arch/arm/include/asm/localtimer.h: In function ‘local_timer_setup’:
  arch/arm/include/asm/localtimer.h:53:10: error: ‘ENXIO’ undeclared (first use in this function)
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

bb1ac3ec
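The checklist rule quoted in the commit above amounts to including the defining header directly wherever a facility is used. A user-space sketch, with the hypothetical function `toy_local_timer_setup()` standing in for the stub that returns an error when the feature is compiled out:

```c
#include <errno.h>   /* defines ENXIO; included directly, not via other headers */

/* Hypothetical stub for a feature that is not configured in. */
int toy_local_timer_setup(void)
{
    return -ENXIO;
}
```

Without the direct include, whether this compiles depends on which other headers happen to be pulled in first.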

ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9 · 29a541f6

Authored by Will Deacon on Oct 03, 2011

Using COHERENT_LINE_{MISS,HIT} for cache misses and references
respectively is completely wrong. Instead, use the L1D events which
are a better and more useful approximation despite ignoring instruction
traffic.
Reported-by: Alasdair Grant <alasdair.grant@arm.com>
Reported-by: Matt Horsnell <matt.horsnell@arm.com>
Reported-by: Michael Williams <michael.williams@arm.com>
Cc: stable@kernel.org
Cc: Jean Pihet <j-pihet@ti.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

29a541f6

Merge branch 'hwmon-for-linus' of... · 4c41042d

Authored by Linus Torvalds on Oct 15, 2011

Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (w83627ehf) Properly report thermal diode sensors

4c41042d

14 Oct, 2011 8 commits

Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6 · e9308cfd

Authored by Linus Torvalds on Oct 14, 2011

* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio-pca953x: fix gpio_base
  gpio/omap: fix build error with certain OMAP1 configs

e9308cfd

Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 48008296

Authored by Linus Torvalds on Oct 14, 2011

* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: revert to using a kthread for AIL pushing
  xfs: force the log if we encounter pinned buffers in .iop_pushbuf
  xfs: do not update xa_last_pushed_lsn for locked items

48008296

Merge branch 'stable' of git://github.com/cmetcalf-tilera/linux-tile · 95bc156c

Authored by Linus Torvalds on Oct 14, 2011

* 'stable' of git://github.com/cmetcalf-tilera/linux-tile:
  tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files

95bc156c

Merge branch 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip · 2ad53110

Authored by Linus Torvalds on Oct 14, 2011

* 'x86-urgent-for-linus' of git://tesla.tglx.de/git/linux-2.6-tip:
  x86: Default to vsyscall=native for now

2ad53110

x86, mrst: use a temporary variable for SFI irq · 153b19a3

Authored by Mika Westerberg on Oct 13, 2011

SFI tables reside in RAM and should not be modified once they are
written.  The current code set pentry->irq to zero, which causes
subsequent reads to fail with an invalid SFI table checksum.  This will
break kexec as the second kernel fails to validate the SFI tables.

To fix this we use a temporary variable for the irq number.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

153b19a3
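Why a temporary variable helps in the commit above can be shown with a toy checksummed table: adjusting a local copy leaves the table bytes, and therefore the checksum, unchanged. The entry layout, the byte-sum checksum, and the function names are hypothetical and do not reflect the real SFI format.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy table entry; hypothetical, not the real SFI layout. */
struct toy_entry {
    uint8_t irq;
    uint8_t type;
};

/* Byte-sum checksum over the whole table, as a simple stand-in. */
uint8_t toy_checksum(const struct toy_entry *t, size_t n)
{
    const uint8_t *p = (const uint8_t *)t;
    uint8_t sum = 0;

    for (size_t i = 0; i < n * sizeof(*t); i++)
        sum = (uint8_t)(sum + p[i]);
    return sum;
}

/* Picks the irq to use for an entry without writing to the table. */
int toy_pick_irq(const struct toy_entry *e, int use_irq)
{
    int irq = e->irq;   /* temporary copy of the table value */

    if (!use_irq)
        irq = 0;        /* adjust the copy; the table entry is untouched */
    return irq;
}
```

Taking the entry as a `const` pointer also lets the compiler reject an accidental write back into the table.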

hwmon: (w83627ehf) Properly report thermal diode sensors · bf164c58

Authored by Jean Delvare on Oct 13, 2011

The w83627ehf driver is improperly reporting thermal diode sensors as
type 2, instead of 3. This caused "sensors" and possibly other
monitoring tools to report these sensors as "transistor" instead of
"thermal diode".

Furthermore, diode subtype selection (CPU vs. external) is only
supported by the original W83627EHF/EHG. All later models only support
CPU diode type, and some (NCT6776F) don't even have the register in
question so we should avoid reading from it.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

bf164c58

gpio-pca953x: fix gpio_base · 25fcf2b7

Authored by Hartmut Knaack on Oct 11, 2011

gpio_base was set to 0 if no system platform data or open firmware
platform data was provided. This led to conflicts, if any other gpiochip
with a gpiobase of 0 was instantiated already. Setting it to -1 will
automatically use the first one available.
Signed-off-by: Hartmut Knaack <knaack.h@gmx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

25fcf2b7
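The conflict described in the commit above, and the effect of asking for base -1, can be illustrated with a toy number space. `toy_register_chip()` and `TOY_NR_BASES` are hypothetical; this models only the idea of "a negative base means pick a free one" and is not how gpiolib allocates numbers.

```c
#include <stdbool.h>

#define TOY_NR_BASES 8   /* size of the toy number space; an assumption */

/*
 * Registers a chip at the requested base, or at the first free base when
 * base is negative. Returns the base used, or -1 on conflict or exhaustion.
 */
int toy_register_chip(bool used[TOY_NR_BASES], int base)
{
    if (base < 0) {
        for (int i = 0; i < TOY_NR_BASES; i++) {
            if (!used[i]) {
                used[i] = true;
                return i;
            }
        }
        return -1;              /* nothing free */
    }
    if (base >= TOY_NR_BASES || used[base])
        return -1;              /* out of range or already taken */
    used[base] = true;
    return base;
}
```

Two chips that both default to base 0 collide in this model, while the second one registers fine if it passes -1 instead.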

gpio/omap: fix build error with certain OMAP1 configs · 78a43158

Authored by Janusz Krzysztofik on Aug 23, 2011

With commit f64ad1a0, "gpio/omap: cleanup _set_gpio_wakeup(), remove
ifdefs", access to build time conditionally omitted 'suspend_wakeup'
member of the 'gpio_bank' structure has been placed unconditionally in
function _set_gpio_wakeup(), which is always built. This resulted in the
driver compilation broken for certain OMAP1, i.e., non-OMAP16xx,
configurations.

Really required or not in previously excluded cases, define this
structure member unconditionally as a fix.

Tested with a custom OMAP1510 only configuration.
Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>

78a43158

13 Oct, 2011 4 commits

tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files · d52104b2

Authored by Chris Metcalf on Oct 05, 2011

The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly.  To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>

d52104b2

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 37cf9516

Authored by Linus Torvalds on Oct 13, 2011

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  mscan: too much data copied to CAN frame due to 16 bit accesses
  gro: refetch inet6_protos[] after pulling ext headers
  bnx2x: fix cl_id allocation for non-eth clients for NPAR mode
  mlx4_en: fix endianness with blue frame support

37cf9516

ide: Fix file references in drivers/ide/ · 1d113601

Authored by Johann Felix Soden on Oct 10, 2011

Fix file references in drivers/ide/

There are a lot of file references to now moved or deleted files in the
whole tree, especially in documentation and Kconfig files.  This patch
fixes the references in drivers/ide/.
Signed-off-by: Johann Felix Soden <johfel@users.sourceforge.net>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

1d113601

Merge branch 'btrfs-3.0' of git://github.com/chrismason/linux · b2f9452b

Authored by Linus Torvalds on Oct 13, 2011

* 'btrfs-3.0' of git://github.com/chrismason/linux:
  Btrfs: make sure not to defrag extents past i_size
  Btrfs: fix recursive auto-defrag

b2f9452b

12 Oct, 2011 3 commits

xfs: revert to using a kthread for AIL pushing · 0030807c

Authored by Christoph Hellwig on Oct 11, 2011

Currently we have a few issues with the way the workqueue code is used to
implement AIL pushing:

 - it accidentally uses the same workqueue as the syncer action, and thus
   can be prevented from running if there are enough sync actions active
   in the system.
 - it doesn't use the HIGHPRI flag to queue at the head of the queue of
   work items

At this point I'm not confident enough in getting all the workqueue flags and
tweaks right to provide a perfectly reliable execution context for AIL
pushing, which is the most important piece in XFS to make forward progress
when the log fills.

Revert back to use a kthread per filesystem which fixes all the above issues
at the cost of having a task struct and stack around for each mounted
filesystem.  In addition this also gives us much better ways to diagnose
any issues involving hung AIL pushing and removes a small amount of code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

0030807c

xfs: force the log if we encounter pinned buffers in .iop_pushbuf · 17b38471

Authored by Christoph Hellwig on Oct 11, 2011

We need to check for pinned buffers even in .iop_pushbuf given that inode
items flush into the same buffers that may be pinned directly due to operations
on the unlinked inode list operating directly on buffers.  To do this add a
return value to .iop_pushbuf that tells the AIL push about this and use
the existing log force mechanisms to unpin it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

17b38471

xfs: do not update xa_last_pushed_lsn for locked items · bc6e588a

Authored by Christoph Hellwig on Oct 11, 2011

If an item was locked we should not update xa_last_pushed_lsn and thus skip
it when restarting the AIL scan as we need to be able to lock and write it
out as soon as possible.  Otherwise heavy lock contention might starve AIL
pushing too easily, especially given the larger backoff once we moved
xa_last_pushed_lsn all the way to the target lsn.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Alex Elder <aelder@sgi.com>

bc6e588a

11 Oct, 2011 6 commits

Btrfs: make sure not to defrag extents past i_size · f7f43cc8

Authored by Chris Mason on Oct 11, 2011

The btrfs file defrag code will loop through the extents and
force COW on them.  But if there is a concurrent truncate in the middle of
the defrag, it might end up defragging the same range over and over
again.

The problem is that writepage won't go through and do anything on pages
past i_size, so the cow won't happen, so the file will appear to still
be fragmented.  defrag will end up hitting the same extents again and
again.

In the worst case, the truncate can actually live lock with the defrag
because the defrag keeps creating new ordered extents which the truncate
code keeps waiting on.

The fix here is to make defrag check for i_size inside the main loop,
instead of just once before the looping starts.
Signed-off-by: Chris Mason <chris.mason@oracle.com>

f7f43cc8
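The fix described in the commit above, checking the file size inside the loop instead of once before it, can be modeled with a toy file whose size shrinks mid-loop. `struct toy_file` and `toy_defrag()` are hypothetical, and the "concurrent" truncate is simulated inline at a chosen iteration; this is not the btrfs code.

```c
#include <stddef.h>

struct toy_file {
    size_t size;   /* current size in pages; may shrink during the loop */
};

/*
 * Walks pages from 0 and returns how many were processed.
 * At iteration truncate_at the file is shrunk to new_size, which stands in
 * for a truncate racing with the loop.
 */
size_t toy_defrag(struct toy_file *f, size_t truncate_at, size_t new_size)
{
    size_t done = 0;

    for (size_t i = 0; ; i++) {
        if (i == truncate_at)
            f->size = new_size;   /* simulated concurrent truncate */
        if (i >= f->size)
            break;                /* size re-checked on every pass */
        done++;                   /* stand-in for defragging page i */
    }
    return done;
}
```

A loop that cached the original size would keep visiting pages past the new end, which in the commit's scenario is work that can never complete.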

x86: Default to vsyscall=native for now · 2b666859

Authored by Adrian Bunk on Oct 06, 2011

This UML breakage:

linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
vsyscall= parameter") - the vsyscall emulation code is not fully cooked
yet as UML relies on some rather fragile SIGSEGV semantics.

Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
to vsyscall=native for now, this patch implements that.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Andrew Lutomirski <luto@mit.edu>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
Signed-off-by: Ingo Molnar <mingo@elte.hu>

2b666859

Btrfs: fix recursive auto-defrag · 2a0f7f57

Authored by Li Zefan on Oct 10, 2011

Follow those steps:

  # mount -o autodefrag /dev/sda7 /mnt
  # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
  # sync
  # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc

and then it'll go into a loop: writeback -> defrag -> writeback ...

It's because writeback writes [8K, 200K] and then writes [0, 8K].

I tried to make writeback know if the pages are dirtied by defrag,
but the patch was a bit intrusive. Here I simply set writeback_index
when we defrag a file.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

2a0f7f57

mscan: too much data copied to CAN frame due to 16 bit accesses · a3a4bfde

Authored by Wolfgang Grandegger on Oct 07, 2011

Due to the 16 bit access to mscan registers there's too much data copied to
the zero initialized CAN frame when having an odd number of bytes to copy.
This patch ensures that only the requested bytes are copied by using an
8 bit access for the remaining byte.
Reported-by: Andre Naujoks <nautsch@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3a4bfde
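The approach in the commit above, 16-bit accesses for full pairs and one 8-bit access for a trailing odd byte, can be sketched in portable C. `toy_copy16()` is a hypothetical helper operating on ordinary memory rather than device registers, and each two-byte `memcpy` stands in for one 16-bit access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copies exactly len bytes from src to dst: two bytes at a time while a
 * full pair remains, then a single byte if len is odd, so nothing beyond
 * len is written.
 */
void toy_copy16(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        uint16_t w;

        memcpy(&w, src + i, sizeof(w));   /* models one 16-bit read */
        memcpy(dst + i, &w, sizeof(w));   /* models one 16-bit write */
    }
    if (len & 1)
        dst[i] = src[i];                  /* 8-bit access for the last byte */
}
```

With an odd length such as 3, the byte at index 3 of the destination keeps its zero-initialized value instead of picking up stray data.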

gro: refetch inet6_protos[] after pulling ext headers · cdaf5570

Authored by Yan, Zheng on Oct 08, 2011

ipv6_gro_receive() doesn't update the protocol ops after pulling
the ext headers. It looks like a typo.
Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdaf5570

bnx2x: fix cl_id allocation for non-eth clients for NPAR mode · 134d0f97

Authored by Dmitry Kravkov on Oct 09, 2011

There are some NPAR configurations in which the FCoE and iSCSI L2
clients get the same id; in this case the FCoE ring will be
non-functional.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

134d0f97

openanolis / cloud-kernel synced successfully more than 1 year ago
